Old 09-25-2011, 04:09 AM   #1 (permalink)
EcoModding Lurker
 
Join Date: Oct 2009
Location: Austria
Posts: 28
Thanks: 0
Thanked 1 Time in 1 Post
If you want to speed up calculations by 14135x..

It's been a long time since I was last here, so please excuse me if there is already a discussion about this in another thread. I have no time to read them all (but some of them are really interesting, the release two workspace for example).

Back to the topic. I'm currently building an on-board computer for my motorcycle that should integrate fully into the speedometer, so there is no external device in my case. I'm using the Arduino Pro Mini and a 4-digit, 8-segment LED display. Some of the basic fuel calculation code comes from the MPGuino project, of course.
Over the last few days I had a look at the MPGuino 64-bit math. I know it was written because of the "old" ATMega 168 space limitations. But the routines are incredibly slow, and the code is very hard to read for anyone who is not familiar with the MPGuino code.
So I converted the calculations to normal 64-bit math (plain *, /, +, - and so on) and removed the 64-bit math code. Yes, the code is about 4 kB bigger now. But I wrote a small benchmark that does 10,000 distanceperfuelunit() calculations, first with the MPGuino math, then with the Arduino math.

The MPGuino math needs 8707 ms for this job, while the "Arduino" math needs only 616 µs! That is a 14135x speed-up! I think this is worth 4 kB of additional code if you are using an ATMega328-based device.
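The arithmetic behind these figures can be checked on a PC. This is a minimal sketch in plain C++, not Arduino code. The function names are made up for illustration, and the 16 MHz clock is an assumption (Pro Mini boards come in 8 MHz and 16 MHz variants). It reproduces the ratio and also converts each total into clock cycles per loop iteration.

```cpp
#include <cassert>

// Totals reported in the post, each for 10,000 iterations.
constexpr double kMpguinoTotalUs = 8707.0 * 1000.0; // 8707 ms converted to microseconds
constexpr double kArduinoTotalUs = 616.0;           // already in microseconds
constexpr double kIterations = 10000.0;

// Ratio of the two totals.
inline double speedup() {
    return kMpguinoTotalUs / kArduinoTotalUs;
}

// Clock cycles spent per loop iteration; clockMhz is cycles per microsecond.
inline double cyclesPerIteration(double totalUs, double clockMhz) {
    assert(clockMhz > 0.0);
    return totalUs * clockMhz / kIterations;
}
```

At the assumed 16 MHz, speedup() comes out at about 14134.7, cyclesPerIteration(kMpguinoTotalUs, 16.0) at about 13931 and cyclesPerIteration(kArduinoTotalUs, 16.0) at about 0.99. Less than one cycle per iteration is far below what a 64-bit division costs on an 8-bit AVR, so it may be worth checking whether the compiler moved the constant expression out of the timed loop before relying on the exact ratio.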

If you want to test it yourself, here is the benchmark code. Just add the 64-bit MPGuino code at the bottom; I removed it so the code stays clear:

Code:
unsigned long tmp1[2];
unsigned long tmp2[2];
unsigned long tmp3[2];

void setup (void){
  Serial.begin(9600);
}

unsigned long time;
unsigned long result;
unsigned long injHiSec = 10; //some value, just to have something for calc
unsigned long injHius = 5; //some value, just to have something for calc
unsigned long x = 7734;
unsigned long y = 261221646;
unsigned long vssPulses = 25; //some value, just to have something for calc

void loop (void){
  Serial.print("10.000 64bit math MPGuino: ");
  time = millis(); //this loop is timed in milliseconds
  for (int i = 0; i < 10000; i++){ //start 10.000 calculations. In real code, this would give you the average L/100km
    init64(tmp1,0,injHiSec);
    init64(tmp3,0,1000000);
    mul64(tmp3,tmp1);
    init64(tmp1,0,injHius);
    add64(tmp3,tmp1);
    init64(tmp1,0,x);
    mul64(tmp3,tmp1);
    init64(tmp2,0,1000);
    mul64(tmp3,tmp2);

    init64(tmp1,0,y);
    init64(tmp2,0,vssPulses);
    mul64(tmp1,tmp2);

    div64(tmp3,tmp1);

    init64(tmp2,0,100);
    mul64(tmp3,tmp2);
  }
  Serial.println(millis() - time);

  Serial.print("10.000 64bit math Arduino: ");
  time = micros(); //this loop is timed in microseconds
  //Note: the inputs never change inside the loop and are not volatile, so an
  //optimizing compiler may evaluate the expression only once. Declaring the
  //inputs and result volatile forces a fresh calculation on every iteration.
  for (int i = 0; i < 10000; i++){ //Same calculation as above, but written for standard math.
    result = ((((unsigned long long)injHiSec * 1000000ull + (unsigned long long)injHius) * (unsigned long long)x * 1000ull) / ((unsigned long long)y * (unsigned long long)vssPulses)) * 100ull;
  }
  Serial.println(micros() - time);

  delay(10000);
}
Edit: Bug corrected.
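For anyone who wants to check the numbers off the board, the single-expression version can be wrapped in a function and compiled on a PC. This is only a sketch under a few assumptions: the function signature, the zero-denominator guard and the range assert are additions for illustration and are not part of the benchmark, and the meaning of x and y is left exactly as in the post.

```cpp
#include <cassert>
#include <cstdint>

// Same operation order as the benchmark expression, using fixed-width types.
// Returns 0 when the denominator is 0 (guard added for illustration only).
inline uint32_t distancePerFuelUnit(uint32_t injHiSec, uint32_t injHius,
                                    uint32_t x, uint32_t y, uint32_t vssPulses) {
    // Combined as in the benchmark: injHiSec * 1,000,000 + injHius.
    const uint64_t injectorTime = static_cast<uint64_t>(injHiSec) * 1000000ULL + injHius;
    // Assumes the inputs are small enough that this 64-bit product does not wrap.
    const uint64_t numerator = injectorTime * x * 1000ULL;
    const uint64_t denominator = static_cast<uint64_t>(y) * vssPulses;
    if (denominator == 0) {
        return 0;
    }
    const uint64_t quotient = numerator / denominator;
    assert(quotient <= UINT32_MAX / 100); // the final value must fit in 32 bits
    return static_cast<uint32_t>(quotient * 100ULL);
}
```

With the benchmark's inputs (10, 5, 7734, 261221646, 25) this returns 1184200. Because the multiplication by 100 happens after the division, the result is always a multiple of 100, just as in the MPGuino sequence it mirrors.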


Last edited by Sebastian; 09-25-2011 at 10:22 AM.. Reason: Comments in code added
Old 10-13-2011, 09:41 AM   #2 (permalink)
dcb
needs more cowbell
 
 
Join Date: Feb 2008
Location: ÿ
Posts: 5,038

pimp mobile - '81 suzuki gs 250 t
90 day: 96.29 mpg (US)

schnitzel - '01 Volkswagen Golf TDI
90 day: 53.56 mpg (US)
Thanks: 158
Thanked 269 Times in 212 Posts
The problem wasn't performance, the problem was space. One of the AVR/Arduino upgrades included serious bloat in the 64-bit libraries, i.e. 5k.

Arduino Forum - unsigned 64 bit functions
__________________
WINDMILLS DO NOT WORK THAT WAY!!!
Old 10-13-2011, 10:11 AM   #3 (permalink)
EcoModding Apprentice
 
meelis11
 
Join Date: Feb 2009
Location: Estonia
Posts: 199

Green frog - '97 Audi A4 Avant 1.9TDI 81kW
Diesel
90 day: 43.1 mpg (US)
Thanks: 19
Thanked 40 Times in 28 Posts
Maybe this is the right place to ask: I have wondered why we need 64-bit calculations. Is it impossible to do the calculations in 32 bits, or is it for accuracy?
Old 10-13-2011, 10:26 AM   #4 (permalink)
dcb
needs more cowbell
 
Actually most of the variables are 32 bit, for long term trips/etc. but they need to be promoted to 64 bit, especially for combined multiplication and division routines, so they don't lose accuracy. It is still fixed point math though.
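The accuracy point can be illustrated with a small sketch in plain C++. The helper names are hypothetical and are not taken from the MPGuino source. It compares a pure 32-bit version, which divides first to stay in range, with a version that promotes to 64 bits, multiplies first and divides last.

```cpp
#include <cassert>
#include <cstdint>

// Pure 32-bit, divide first to avoid overflow: the remainder of a / c is lost early.
inline uint32_t divideFirst32(uint32_t a, uint32_t b, uint32_t c) {
    assert(c != 0);
    return (a / c) * b;
}

// Promote to 64 bits, multiply first, divide last, then narrow the result.
inline uint32_t mulDiv64(uint32_t a, uint32_t b, uint32_t c) {
    assert(c != 0);
    return static_cast<uint32_t>(static_cast<uint64_t>(a) * b / c);
}
```

For a = 1999, b = 1000, c = 1000 the divide-first version returns 1000 while the 64-bit version returns 1999. The 64-bit version also handles 70000 * 70000 / 1000 = 4900000, where the intermediate product no longer fits in 32 bits. It is still fixed-point math: nothing here uses floating point.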
Old 10-13-2011, 11:05 AM   #5 (permalink)
EcoModding Apprentice
 
meelis11
 
So if a multiplication does not fit into a 32-bit variable, it overflows and gets corrupted?
Or is it just to gain 0.00..% more accuracy?
Old 10-13-2011, 12:18 PM   #6 (permalink)
EcoModding Lurker
 
Join Date: Oct 2009
Location: Austria
Posts: 28
Thanks: 0
Thanked 1 Time in 1 Post
Quote:
problem wasn't the performance, problem was space.
Yes, I know. As I wrote: "I know it was written because of the "old" ATMega 168 space limitations." But it's a very easy way for Arduino users to increase performance and accuracy without changing to 20MHz.

Quote:
So if multiplication does not fit into 32bit variable, it overflows and gets corrupted?
Or is it just to gain 0.00..% more accuracy
We are multiplying very large numbers, and an unsigned long ranges from 0 to 4,294,967,295. If you multiply (for example) 70000*70000, the result already exceeds this range. So it's not just about accuracy: the overflow produces completely wrong values!
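The 70000*70000 case is easy to reproduce on a PC. This is a minimal sketch in plain C++ with made-up function names. It uses uint32_t because that matches the 32-bit unsigned long of the AVR, whereas unsigned long is often 64 bits on a desktop compiler.

```cpp
#include <cassert>
#include <cstdint>

// What a 32-bit variable ends up holding: the product modulo 2^32.
constexpr uint32_t wrappedProduct(uint32_t a, uint32_t b) {
    return a * b;
}

// The mathematically correct product, computed in 64 bits.
constexpr uint64_t fullProduct(uint32_t a, uint32_t b) {
    return static_cast<uint64_t>(a) * b;
}

// True when the product does not fit into 32 bits.
constexpr bool overflowsU32(uint32_t a, uint32_t b) {
    return fullProduct(a, b) > UINT32_MAX;
}

// The example from the post.
inline void checkForumExample() {
    assert(fullProduct(70000, 70000) == 4900000000ULL);
    assert(wrappedProduct(70000, 70000) == 605032704u);
    assert(overflowsU32(70000, 70000));
}
```

The true product is 4,900,000,000, but the 32-bit result is 605,032,704, which is 4,900,000,000 minus 4,294,967,296. The value is not slightly inaccurate, it is simply a different number.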
Old 10-13-2011, 09:13 PM   #7 (permalink)
dcb
needs more cowbell
 
Quote:
Originally Posted by Sebastian
...But it's a very easy way for Arduino users to increase performance and accuracy without changing to 20MHz.
It does sound like a good idea to entertain, but it isn't broken currently. I'm not sure it would affect accuracy the way the crystal does; the injector interrupt is still pretty tight, and its resolution is tied to the CPU clock and registers. And the CPU is not currently overloaded.

I'll run some tests over the serial port when I get a chance.
Old 10-29-2011, 06:49 PM   #8 (permalink)
EcoModding Apprentice
 
Join Date: Aug 2009
Location: terra firma
Posts: 138
Thanks: 4
Thanked 24 Times in 22 Posts
Quote:
Originally Posted by Sebastian
We are multiplying very large numbers, and an unsigned long ranges from 0 to 4,294,967,295. If you multiply (for example) 70000*70000, the result already exceeds this range. So it's not just about accuracy: the overflow produces completely wrong values!
The calculations can be done in 32bits, but you need to structure the operations such that they never overflow.

Take instantgph(), for example. Here your formula is

gph= (instInjTot * 3,600,000,000 * 1000 ) / (parm[uSperGallon] * (instInjEnd - instInjStart))

I.e., you have one huge-ass, overflowing value, divided by another huge-ass, non-overflowing value. But you can break it down into non-overflowing operations, thusly:

x = ( 3,600,000,000 / (parm[uSperGallon] * (instInjEnd - instInjStart)) ) /* does not overflow */
gph = x * 1000 * instInjTot /* still does not overflow */



The same principle can be applied to each of the mpguino's calculations, to allow the complete removal of the 64bit math library. What you need, instead of blindly plugging in numbers from left-to-right, is to work out each calc with expected max values for each variable in the formula -- such as: injectors should be "On" for XXX microseconds, vssPulses should not exceed YYY, during the 0.5 second data collection window -- and then re-arrange the multiplications, divisions, additions & subtractions appropriately.
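One way to write down such a worst-case analysis is to let the compiler check it. This is a minimal sketch in plain C++11, not code from anyone's MPGuino. The bounds are hypothetical placeholders (the injector being open for the entire 0.5 second window, and a scale factor of 1000), and productFitsU32 and checkedMul are made-up names.

```cpp
#include <cassert>
#include <cstdint>

// True when the product of two worst-case values still fits in an unsigned
// 32-bit variable. The check itself is done in 64 bits.
constexpr bool productFitsU32(uint32_t maxA, uint32_t maxB) {
    return static_cast<uint64_t>(maxA) * maxB <= UINT32_MAX;
}

// Hypothetical worst-case bounds, for illustration only.
constexpr uint32_t kMaxInjectorOpenUs = 500000; // injector open for the entire 0.5 s window
constexpr uint32_t kScale = 1000;               // fixed-point scale factor

// Compile-time check for this step; the build fails if a bound is raised too far.
static_assert(productFitsU32(kMaxInjectorOpenUs, kScale),
              "injector time * scale overflows 32 bits");

// Runtime guard for values that are only known while the program runs.
inline uint32_t checkedMul(uint32_t a, uint32_t b) {
    assert(productFitsU32(a, b));
    return a * b;
}
```

Each multiplication in a rearranged formula would get its own static_assert with the expected maximum of both operands. The check only costs compile time, so it adds nothing to the flash footprint.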

I had to do this to my 'guino, in order to squeeze extra functionality into my 168; but with the 328, I guess the code savings shouldn't matter at this point. Not only do you save ~1200 bytes by tossing the 64bit library, but you also save 100-200 bytes on each function that no longer needs 64bit math. I think any speed savings are negligible, since these functions only get called a few times per second.

If my 168 chip was socketed and not SMT, I would have simply swapped it for a 328, kept the 64bit math & spent my time on something else. It was a fun exercise, anyway.
Old 10-30-2011, 04:53 AM   #9 (permalink)
EcoModding Lurker
 
Join Date: Oct 2009
Location: Austria
Posts: 28
Thanks: 0
Thanked 1 Time in 1 Post
Yes, I'm familiar with this kind of optimization. Some years ago I wrote a program to calculate the flight paths of objects through our solar system. Compared to the numbers from that old project, MPGuino numbers are peanuts.
I have already rearranged some of the calculations, but basically it's one of the points that will be addressed in future versions of my project.

But you have to be careful; I think your example can't work this way:

x = ( 3,600,000,000 / (parm[uSperGallon] * (instInjEnd - instInjStart)) ) /* does not overflow */


parm[uSperGallon] * (instInjEnd - instInjStart) will be very big:
115384616 * (instInjEnd - instInjStart), where the result of (instInjEnd - instInjStart) may be anything between 0 and 500. Incidentally, here is the first mistake in the MPGuino code: it tries to divide by 0 if the car has overrun fuel cutoff! If you compile this and run it, some versions of the AVR toolchain will produce weird numbers on your display, while others show 0.
Back to the topic: let's assume 500 for the result. Then:
115384616 * 500 = 57692308000. This doesn't fit in an unsigned long. Even if you reduce the 500, the division (3,600,000,000 / result) will yield 0, which wrecks the rest of your calculation, and if it doesn't, the result will be very imprecise.
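These numbers can be verified with a short sketch in plain C++. The function names are hypothetical. The denominator is deliberately computed in 64 bits so that the true value is visible, and the assert stands in for whatever zero check the real code would need.

```cpp
#include <cassert>
#include <cstdint>

// Denominator of the first step, computed in 64 bits so the true value is visible.
inline uint64_t denominator(uint32_t usPerGallon, uint32_t window) {
    return static_cast<uint64_t>(usPerGallon) * window;
}

// 3,600,000,000 / (usPerGallon * window), truncated like any integer division.
inline uint32_t firstStep(uint32_t usPerGallon, uint32_t window) {
    const uint64_t d = denominator(usPerGallon, window);
    assert(d != 0); // a window of 0 would be a division by zero
    return static_cast<uint32_t>(3600000000ULL / d);
}
```

denominator(115384616, 500) is 57,692,308,000, well above the 32-bit maximum of 4,294,967,295, and firstStep(115384616, 500) is 0. Even with a window of 1 the quotient is only 31, so everything multiplied onto it afterwards inherits a resolution of roughly 3%.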
Old 10-30-2011, 07:34 PM   #10 (permalink)
EcoModding Apprentice
 
Join Date: Aug 2009
Location: terra firma
Posts: 138
Thanks: 4
Thanked 24 Times in 22 Posts
Yeah, I added zerodiv checks in many places. Checking my code again, I also used MillisecondsPerGallon instead of MicrosecondsPerGallon in this function (and some others) to keep the values in check.
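A rough idea of what those two changes might look like, as a sketch in plain C++. This is not the poster's actual code, which is not shown in the thread. safeDiv and firstStepMs are hypothetical names, and the milliseconds-per-gallon value is simply the 115384616 µs figure from the previous post divided by 1000.

```cpp
#include <cassert>
#include <cstdint>

// Division that returns a fallback value instead of dividing by zero.
inline uint32_t safeDiv(uint32_t numerator, uint32_t divisor, uint32_t fallback = 0) {
    return divisor == 0 ? fallback : numerator / divisor;
}

// Per-gallon constant in milliseconds, derived from 115384616 us by integer division.
constexpr uint32_t kMsPerGallon = UINT32_C(115384616) / UINT32_C(1000); // 115384

// First step of the rearranged formula, entirely in 32 bits.
inline uint32_t firstStepMs(uint32_t window) {
    assert(window <= UINT32_MAX / kMsPerGallon); // the product must fit in 32 bits
    return safeDiv(UINT32_C(3600000000), kMsPerGallon * window);
}
```

With a window of 500 the denominator is 57,692,000, which fits comfortably in 32 bits, and the quotient is 62 instead of 0. A window of 0 returns the fallback rather than dividing by zero. Switching units changes the scale of the result by a factor of 1000, so the remaining steps of the formula would have to be adjusted to match.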

All content copyright EcoModder.com