Found the metric conversion bug - when I rewrote the conversion routine, I forgot to change the denominator load from register 3 to register 1. It works now. Derp.
Modified the 64-bit left-shift and right-shift routines and gained a 31% speed improvement, at the cost of adding 200 bytes to the compiled output. This matters because the 64-bit division routine performs about 100 shifts on average per division operation (depending on the bit patterns of both the numerator and denominator).
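To show why shift speed dominates division time, here is a rough C sketch of a generic shift-and-subtract divide. It is an illustration only, not the actual routine: the name div64_shift_subtract, the result struct, and the way shifts are counted (two per loop step, one for the denominator and one for the quotient bit) are all assumptions made for the example, so the counts will not match the ~100 figure above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: quotient, remainder, and a shift counter
   that exists only to make the cost of the loop visible. */
typedef struct {
    uint64_t quot;
    uint64_t rem;
    unsigned shifts;
} div64_result;

/* Generic shift-and-subtract division (illustrative, not the real routine). */
div64_result div64_shift_subtract(uint64_t num, uint64_t den)
{
    div64_result r = { 0, num, 0 };
    uint64_t bit = 1;

    assert(den != 0);

    /* Align: shift the denominator left until it reaches the numerator
       or its top bit is set. How far this goes depends on both values. */
    while (den < r.rem && !(den & 0x8000000000000000ULL)) {
        den <<= 1;
        bit <<= 1;
        r.shifts += 2;
    }

    /* Walk back down, subtracting wherever the denominator still fits. */
    while (bit != 0) {
        if (r.rem >= den) {
            r.rem -= den;
            r.quot |= bit;
        }
        den >>= 1;
        bit >>= 1;
        r.shifts += 2;
    }
    return r;
}
```

The number of loop steps, and so the number of shifts, depends on how far apart the top bits of the numerator and denominator are, which is why the shift count varies with the bit patterns.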
Substantially rewrote the 64-bit multiplication routine. It now takes up 300 more bytes, but nets about a 6% speed improvement. Also experimented with substantially rewriting the 64-bit addition routine; it netted about a 1% speed improvement but added 400 bytes. So the 64-bit shift routine modifications stay, the 64-bit multiplication improvement stays, but the addition modification goes away.
Might look at rewriting the 64-bit integer-to-string output formatting routine, because each formatted number currently requires four 64-bit divisions, and the JSON output routine alone outputs at least 32 separate numbers.
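One possible direction, sketched in C purely as an illustration: split the value into 9-digit chunks with two 64-bit divisions by 1,000,000,000, then produce the digits with 32-bit arithmetic. This assumes 32-bit division is noticeably cheaper than 64-bit division on the target, which has not been measured here. The names u64_to_dec and emit_u32 are hypothetical, and this is not the current formatting routine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: write v in decimal at out, zero-padded on the left
   to at least `pad` digits. Returns a pointer just past the last digit. */
static char *emit_u32(char *out, uint32_t v, int pad)
{
    char tmp[10];               /* a 32-bit value has at most 10 digits */
    int n = 0;

    do {                        /* 32-bit divisions only */
        tmp[n++] = (char)('0' + v % 10u);
        v /= 10u;
    } while (v != 0);
    while (n < pad)
        tmp[n++] = '0';
    while (n > 0)               /* digits were produced in reverse order */
        *out++ = tmp[--n];
    return out;
}

/* Format an unsigned 64-bit value as decimal. buf must hold at least
   21 bytes. Returns the number of characters written, excluding the NUL. */
unsigned u64_to_dec(uint64_t value, char *buf)
{
    const uint32_t BILLION = 1000000000u;
    uint64_t hi  = value / BILLION;              /* 64-bit division 1 */
    uint32_t low = (uint32_t)(value % BILLION);
    uint32_t top = (uint32_t)(hi / BILLION);     /* 64-bit division 2 */
    uint32_t mid = (uint32_t)(hi % BILLION);
    char *p = buf;

    assert(buf != 0);

    if (top != 0) {
        p = emit_u32(p, top, 0);
        p = emit_u32(p, mid, 9);
        p = emit_u32(p, low, 9);
    } else if (mid != 0) {
        p = emit_u32(p, mid, 0);
        p = emit_u32(p, low, 9);
    } else {
        p = emit_u32(p, low, 0);
    }
    *p = '\0';
    return (unsigned)(p - buf);
}
```

In a hand-written routine each quotient and remainder pair would come from a single divide, so this works out to two 64-bit divisions per number instead of four. Whether that is a net win depends on the cost of the extra 32-bit divisions and the added code size.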