Multi-Core Approaches – Qualcomm vs. Nvidia

I've recently been wondering about the different approaches companies take to increase CPU performance in mobile devices, so I had a look at some whitepapers. Here's what I found, which you might find interesting as well:

An increase in processing power can, in the first instance, be achieved by increasing the clock rate and by making instruction execution more efficient in general. The latter is done by using more transistors on the chip to reduce the number of clock cycles required to execute an instruction, and by increasing the on-chip cache sizes so that the processor has to wait less often for data to be delivered from slow external RAM.

Both approaches are made possible by the ever-shrinking size of the transistors on the chip. While previous generations of smartphone chips used 90 nanometer structures, current high-end smartphones use 45 nanometer technology, and the next step to 32 and 28 nanometer structures is already in sight. When transistors get smaller, more of them fit on the chip and power consumption at high clock rates is lowered. But there's a catch, which I'll come to below.

Another way of increasing processing power is to have several CPU cores and have the operating system assign tasks that want to run simultaneously to different cores. Nvidia's latest Tegra design features four CPU cores, so four tasks can run in parallel. As that is often not required, the design allows individual cores to be deactivated and reactivated at run time to reduce power consumption when four cores are not necessary, which is probably most of the time. In addition, Nvidia includes a fifth core, which they call a "companion core", that takes over when only little processing power is needed, for example while the display is off and only low-intensity background tasks have to be served. So why is a fifth core required? Why can't one of the four other cores simply take over the task at a low clock speed? Here's where the catch I mentioned earlier comes into play:

Total chip power consumption is governed by two components: leakage power and dynamic power. When processors run at high clock speeds, a low supply voltage is desirable because dynamic power increases linearly with frequency but with the square of the voltage. Unfortunately, optimizing the chip for low-voltage operation increases the leakage power, i.e. the power a transistor consumes whenever voltage is applied to it, even when it is not switching and is merely holding its state. It is this leakage power that becomes the dominant source of power consumption when the CPU is mostly idle, i.e. when the screen is switched off, when only background tasks are running, etc. And this is where the Tegra's companion CPU comes in. On the die, it is manufactured with a different process that is less optimized for high speeds and more optimized for low leakage power. The companion CPU can thus only run at clock speeds up to 500 MHz but has the advantage of low power consumption in the idle state. Switching back and forth between the companion CPU and the four standard cores is seamless to the operating system and can be done in around 2 milliseconds.
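To make the trade-off a bit more concrete, here is a minimal sketch of the usual first-order power model: dynamic power is proportional to capacitance times voltage squared times frequency, and leakage is a constant drain on top. All numbers and names below (the capacitance, the leakage figures, FAST_CORE, COMPANION_CORE) are invented for illustration and are not taken from Nvidia's whitepaper. The sketch also assumes both cores can run at the same voltage at a low clock speed, which is a simplification.

```python
# First-order CPU power model. Every constant here is a made-up
# assumption for illustration, not measured Tegra data.

def dynamic_power(capacitance_f, voltage_v, freq_hz):
    """Switching power: linear in frequency, quadratic in voltage."""
    return capacitance_f * voltage_v ** 2 * freq_hz


def core_power(core, voltage_v, freq_hz):
    """Total power of one core: dynamic part plus constant leakage."""
    dyn = dynamic_power(core["capacitance_f"], voltage_v, freq_hz)
    return dyn + core["leakage_w"]


# Hypothetical cores: same switched capacitance, different leakage.
FAST_CORE = {"capacitance_f": 1e-9, "leakage_w": 0.10}       # speed-optimized
COMPANION_CORE = {"capacitance_f": 1e-9, "leakage_w": 0.01}  # leakage-optimized

# Light background load: both cores clocked at 100 MHz and 0.8 V.
fast_idle = core_power(FAST_CORE, 0.8, 100e6)
companion_idle = core_power(COMPANION_CORE, 0.8, 100e6)

if __name__ == "__main__":
    print(f"fast core at low load:      {fast_idle:.3f} W")
    print(f"companion core at low load: {companion_idle:.3f} W")
```

With these assumed values, the dynamic part is identical for both cores at low load, so the difference comes entirely from leakage, which is the effect the companion core is meant to exploit.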

Qualcomm has taken a different approach to conserving power in their latest Krait architecture. Instead of requiring all cores to run at the same clock speed, each core can run at a different speed depending on how much work the operating system asks it to do. So rather than optimizing one processor for low leakage power, their approach is to reduce the clock speed of individual cores when less processing power is required.
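The difference between a shared clock and per-core clocks can be sketched in a few lines. This is only an illustration of the idea, not Qualcomm's actual governor logic: the frequency steps in FREQ_STEPS_MHZ and the function names are hypothetical, and "demand" is simply assumed to be the clock speed in MHz that a core would need to handle its current workload.

```python
# Hypothetical frequency steps in MHz; real chips define their own tables.
FREQ_STEPS_MHZ = [384, 810, 1188, 1512]


def pick_frequency(required_mhz, steps=FREQ_STEPS_MHZ):
    """Return the lowest step that covers the demand, or the highest step."""
    for step in sorted(steps):
        if step >= required_mhz:
            return step
    return max(steps)


def per_core_frequencies(demands_mhz):
    """Each core gets its own clock, chosen from its own demand."""
    return [pick_frequency(d) for d in demands_mhz]


def shared_frequency(demands_mhz):
    """All cores share one clock, dictated by the busiest core."""
    freq = pick_frequency(max(demands_mhz))
    return [freq] * len(demands_mhz)


if __name__ == "__main__":
    demands = [100, 900, 1500, 0]  # assumed per-core demand in MHz
    print("per-core:", per_core_frequencies(demands))
    print("shared:  ", shared_frequency(demands))
```

With one busy core and three mostly idle ones, the shared-clock variant drags every core up to the highest step, while the per-core variant leaves the idle cores at the lowest step. The sketch does not model switching cores off entirely.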

Which of the two approaches works better in practice remains to be seen. I wouldn't be surprised, though, if at some point a combination of both were used.

2 thoughts on “Multi-Core Approaches – Qualcomm vs. Nvidia”

  1. Many expected 2012 to be the year of the quad-core smartphone, but it’s much more likely that dual-core processors will still continue to show up in flagship superphones and attract a lot of customers from the substantial mid-end market.

  2. The two are already both used. Ramping up a core whose power envelope lets it get to 1.5 GHz, even to just 500 MHz, will still consume more power than a core that is designed to operate at a maximum of 500 MHz. The Tegra design can power-gate individual cores and downclock cores. This was not highlighted as part of the Tegra design because there is nothing groundbreaking about it. CPUs have been doing this for years. The lower-power core for “background” tasks is the thing that is, currently, unique to Tegra. The key is that it supports the full ARM ISA so that software does not have to be rewritten to know it is running on a core with a different architecture. The OS does not “see” this other core.

    Anandtech has a good article on where ARM is going with this idea.
