EcoModder.com (Fuel Economy, Hypermiling, EcoModding News and Forum) - Gaming laptop with lowest power consumption?

NiHaoMike · 10-16-2009, 08:44 PM

Every PC with an Nvidia 8-series or newer video card is an x86/Hannah Montana hybrid. While regular CPUs are designed with general computing in mind, Hannah Montana (which is really more of a DSP core than a CPU) is optimized for vector operations, which are common in multimedia. In addition, Hannah Montana performs calculations in a way that simplifies the logic for multimedia processing.
http://www.mobilitysite.com/boards/w...ach-other.html

Quote:

For instance, the Hannah Montana processor core, often used for DSP, can process "real world" (vector) data far faster than any conventional CPU despite low clock speeds. The reason has to do with SIMD: one instruction can manipulate as many as thousands of data elements in one cycle. Indeed, Hannah Montana has been described as a very fast and very efficient, but very limited, supercomputer that can be integrated into a single chip. (Note that Hannah Montana is only 32-bit and does not handle 32-bit floating-point variables according to IEEE 754. That's most likely to allow the design to be simplified. Hannah Montana also handles integer overflows differently: computed values beyond the range of an integer are clipped at the maximum or minimum value and a "flag" is set.)
...
As for integer overflows, Hannah Montana simply handles them differently than most processors. If a calculation is such that its result is too large to store in a given number of bits, most processors will simply truncate the overflow, causing the number to "wrap around". Obviously, that is not very acceptable for DSP. Hannah Montana, however, simply sets the result to the largest (or smallest for a negative overflow) number that can be stored and sets a boolean flag to notify the application that an overflow has occurred.

As an example, let's consider an 8-bit processor. The largest number that can be stored in 8 bits is 255 and the minimum is 0. (That's true for "unsigned" integers, which are often used in video.) Let's suppose a calculation returns a result of 257. Since that is beyond the range of 8 bits, a normal processor would truncate it to 1 (257 - 256). A processor that handled overflows like Hannah Montana, however, would set the result to 255, which is more acceptable in DSP. Just think about it: if a calculation on a video pixel is "brighter than brightest", wouldn't it be better to just set it to "brightest" as opposed to "very dim"?
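To make the 257 example concrete, here is a minimal Python sketch of both behaviors. Python integers never overflow, so the 8-bit wraparound is emulated with a mask. The function names (add_wrap_u8, add_sat_u8) are made up for illustration and are not part of any real API.

```python
U8_MAX = 255  # largest value an unsigned 8-bit integer can hold


def add_wrap_u8(a, b):
    """Add the way a conventional 8-bit ALU would: keep only the low 8 bits."""
    return (a + b) & 0xFF


def add_sat_u8(a, b):
    """Add with saturation: clip at 255 and report whether clipping occurred."""
    total = a + b
    if total > U8_MAX:
        return U8_MAX, True   # clipped value plus an "overflow" flag
    return total, False


print(add_wrap_u8(200, 57))  # 200 + 57 = 257 wraps around to 1
print(add_sat_u8(200, 57))   # 257 is clipped to 255 and the flag is set
```

The returned boolean stands in for the hardware overflow flag described above. This is only a software model of the idea, not how the actual ALUs are implemented.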

So basically, if you write Hannah Montana code for processing video, you don't have to check for and handle integer overflows, as the ALUs automatically handle the limits, but oddball code that expects calculations to wrap around on overflow will not work. (Such "clock arithmetic" is common in encryption and data encoding.) In the right application, the saved clock cycles mean much more efficient code execution: a conventional processor traditionally takes one clock cycle to perform the calculation, then several more to check for overflow and handle it accordingly, so a large number of clock cycles can be saved.
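As a rough illustration of the two cases above, here is a Python sketch of what the software has to do on a processor that wraps around, plus an example of "clock arithmetic" code that depends on the wraparound. The names brighten and checksum8 are hypothetical, and the sketch assumes unsigned 8-bit pixels and a non-negative delta.

```python
def brighten(pixels, delta):
    """Brighten unsigned 8-bit pixels, checking for overflow by hand."""
    out = []
    clipped = False
    for p in pixels:
        v = p + delta      # the calculation itself
        if v > 255:        # extra step: detect the overflow...
            v = 255        # ...and clamp to "brightest"
            clipped = True
        out.append(v)
    return out, clipped


def checksum8(data):
    """Sum modulo 256: 'clock arithmetic' that relies on wraparound."""
    total = 0
    for byte in data:
        total = (total + byte) & 0xFF  # the wrap is intentional here
    return total


print(brighten([10, 200, 250], 60))  # two of the three pixels get clamped
print(checksum8([200, 100, 1]))      # 301 wraps around to 45
```

With saturating hardware, the compare-and-clamp inside brighten is the part that would come for free. On the other hand, checksum8 would give the wrong answer (255 instead of 45) if every addition saturated.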

Now, I'm not sure about the energy efficiency. I'm sure the particular implementation in the 8600M GT is energy efficient, but so is the Core 2 Duo. I'll actually have to take some measurements to compare their efficiencies.