Speeds and Feeds, Power Usage

Due to its modular nature, the various parts of Nehalem's die are decoupled in both voltage and frequency. The memory interface, processing cores and I/O centre are all completely independent, such that each can tailor its working environment to best suit its own needs. Internal clock speeds are adjusted on the fly to compensate for, or take advantage of, fluctuations in input voltage.

Early on in the development cycle, Intel apparently even considered discarding the idea of external clock speeds altogether, but decided against that course after customers complained. Thus, to give the appearance of a consistent CPU clock speed, internal clocks are averaged out and tied to a reference clock: 133MHz, to be precise. The asynchronous sections of the CPU are all tied to this synchronous interface so that, from the outside, Nehalem CPUs appear to have a consistent clock speed.
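As a rough illustration of how one shared reference clock can yield a headline clock speed, the sketch below multiplies the 133MHz reference by a per-domain multiplier. The function name, the domain names and the multiplier values are assumptions made for illustration; none of them comes from Intel documentation.

```python
BCLK_MHZ = 133.33  # reference clock quoted above (nominally 133MHz)


def domain_clock_mhz(multiplier: int, bclk_mhz: float = BCLK_MHZ) -> float:
    """Return the clock of one domain derived from the shared reference clock."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return multiplier * bclk_mhz


# Hypothetical multipliers, chosen only to show that each domain can run
# at its own speed while still hanging off the same reference clock.
EXAMPLE_MULTIPLIERS = {"core": 20, "uncore": 16, "memory": 8}

for name, mult in EXAMPLE_MULTIPLIERS.items():
    print(f"{name}: {domain_clock_mhz(mult):.0f} MHz")
```

The point of the sketch is only that changing one multiplier changes one domain, while the reference clock seen from outside stays put.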

Power efficiency is high up on Nehalem's agenda, with idle power use being a particular strong point. Nehalem's cores can be in one of four possible power states (C0, C1, C3 and C6), and switches between them happen 56 per cent faster than in a Penryn environment.

In its C0 state, the CPU is fully active and at its full power draw. C1 throttles the CPU's clock speed (and voltages) such that power draw is roughly halved. C3 turns off the PLLs and flushes the core's local cache for another 50 per cent power drop, and C6 turns off pretty much everything, drawing almost no power. The trade-off is that while each sleep state is less power-hungry than the last, it also takes longer to wake up from.
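The trade-off between power draw and wake-up time can be sketched as a simple lookup: pick the deepest state whose wake-up latency still fits the time available. The relative power figures follow the rough halving described above (with C6 approximated as zero); the latency numbers are entirely hypothetical placeholders, since only their ordering matters here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CState:
    name: str
    relative_power: float   # fraction of C0 power, per the rough halving above
    wake_latency_us: float  # HYPOTHETICAL values; only the ordering is meaningful


C_STATES = [
    CState("C0", 1.00, 0.0),
    CState("C1", 0.50, 1.0),
    CState("C3", 0.25, 50.0),
    CState("C6", 0.00, 200.0),  # "almost no power", approximated as zero
]


def deepest_state(max_wake_latency_us: float) -> CState:
    """Return the lowest-power state that can wake within the latency budget."""
    if max_wake_latency_us < 0:
        raise ValueError("latency budget cannot be negative")
    candidates = [s for s in C_STATES if s.wake_latency_us <= max_wake_latency_us]
    return min(candidates, key=lambda s: s.relative_power)
```

With these placeholder numbers, a core that must respond within 10 microseconds would settle in C1, while one that can tolerate a much longer wake-up would drop all the way to C6.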

As well as power gating when under no load, Nehalem has another power management trick up its sleeve called Turbo Mode. The premise is incredibly simple, but also extremely clever.

In normal operation each Nehalem core gets an equal share of the workload assigned to the CPU and each will increase or decrease its clock depending on its workload. Sometimes, though, that can lead to a fairly inefficient distribution of workload as the constant trickle of work to each core keeps the whole lot running at high clock speeds, which can result in higher than necessary power use. In Turbo Mode Nehalem is able to recognise that fact and redistribute work across its cores to make better use of the power it is drawing.

In lightly threaded situations, when cores are either being stressed lightly or some aren't being used at all, the CPU will turn off under-worked cores and redistribute their workloads across the remaining cores. To compensate for the potential drop in performance that comes from giving fewer cores more work, the cores that aren't powered down will increase their clock speeds (and voltages) to provide more processing power while remaining within the CPU's overall TDP.
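A toy model makes the budget arithmetic easier to see. The sketch below assumes that powered-down cores draw nothing, that each active core's power scales with the cube of its clock (voltage rising in step with frequency), and that the full TDP is exactly consumed with every core at its base clock. All of these are simplifying assumptions, and the function name, default multiplier and bin limit are hypothetical rather than Intel's actual algorithm.

```python
def turbo_bins(active_cores: int, total_cores: int = 4,
               base_mult: int = 20, max_bins: int = 2) -> int:
    """Return how many extra multiplier steps the active cores can take
    without exceeding the TDP, under the toy power model described above."""
    if not 1 <= active_cores <= total_cores:
        raise ValueError("active_cores must be between 1 and total_cores")

    # Power budget per active core, in units of one core's power at base clock.
    budget_per_core = total_cores / active_cores

    bins = 0
    while bins < max_bins:
        scale = (base_mult + bins + 1) / base_mult  # clock relative to base
        if scale ** 3 > budget_per_core:            # assumed cubic power scaling
            break
        bins += 1
    return bins
```

In this model a fully loaded chip gets no extra steps, while shutting down even one core frees enough headroom for the remaining cores to step up until they hit the bin limit.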

In effect, in an environment where there aren't enough threads to stress all of a Nehalem CPU's cores, the chip gives more performance for free. While that's pretty cool on the desktop side, it is in notebook Nehalem (that's Calpella, codename junkies) that I think the technology will really shine. And, yes, if you want to, you can disable Turbo Mode.

It's not for nothing that Nehalem's Power Control Unit is about the same size as an Intel 486 processor, according to Pat Gelsinger at least.
