We've seen vertically oriented transistors; now it's time for entire chips to explore the z-axis. Collaborating with Swiss research institutes EPFL and ETH Zurich, IBM has taken another important step toward creating faster, more efficient "3D" processors that stack their cores vertically to increase the number of interconnections and appreciably reduce heat.
From single-core to multicore
You may argue that writing a text document or viewing your holiday pictures are very simple tasks that don't require much number crunching at all, but the fact is that periodically updated, stable and secure operating systems with pretty-looking graphics invariably require more and more processing power.
So, when a single core just isn't enough to write that novel you've been working on, multicore designs come into play. But why can't we just keep using faster and faster single cores? Power consumption is often the only reason cited for the transition, but there are at least two others that deserve a mention.
Instruction-level parallelism refers to how well a complex operation can be subdivided into simpler steps that can be computed in parallel, with the partial results then recombined for a performance increase that depends on how many different "processing streams" were created. Using multiple cores improves parallelism both at the instruction level and, taking a few steps back, at the process level, improving the system's responsiveness.
Memory usage is another big issue. Memory clock speeds don't increase as quickly as CPU clock speeds, which means that, to keep pace, larger and larger caches have to be used. Given their cost, this is quite obviously an impractical road. The lower clock speeds of multicore processors are one of the factors that help achieve a more efficient use of memory.
Finally, power efficiency is also extremely important. A presentation from Intel (PDF) offers a very fitting example. Suppose we have a single-core processor running at its normal frequency: we may use this as a benchmark and say that it yields one unit of performance per unit of power drawn. If we try to overclock it, which is what hardcore gamers and other demanding professionals often do, we typically obtain a 13 percent increase in performance, but we pay for it with a (typically) 73 percent increase in power, which makes overclocking generally a bad idea.
What we can do instead is go against the current and actually underclock the processor. If we are willing to accept a matching 13 percent decrease in performance, down to 0.87 "performance units," we find that only 0.51 units of power are now required. Our CPU suddenly becomes much more power-efficient: 0.87 divided by 0.51 gives about 1.7 units of performance per unit of power, a roughly 70 percent improvement over our benchmark. To obtain more processing power, we simply connect several underclocked cores together. The speeds won't simply add up (a dual-core 1.6GHz processor is a bit slower than a single-core 3.2GHz), but we'll preserve our power efficiency.
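The arithmetic behind these figures is easy to check. The short Python sketch below uses only the relative performance and power numbers quoted above; the function names are made up for illustration, and the dual-core line assumes perfect scaling, which the text itself says real chips don't quite reach, so it should be read as an upper bound rather than a measurement.

```python
# Relative (performance, power) figures quoted in the text; baseline = 1.0.
OPERATING_POINTS = {
    "baseline": (1.00, 1.00),
    "overclocked": (1.13, 1.73),
    "underclocked": (0.87, 0.51),
}


def perf_per_power(performance, power):
    """Units of performance obtained per unit of power drawn."""
    return performance / power


def dual_core_ideal(performance, power):
    """Two identical cores under an ASSUMED perfect scaling.

    Real dual-core chips fall somewhat short of doubling performance,
    so this is an optimistic upper bound, not a prediction.
    """
    return 2 * performance, 2 * power


if __name__ == "__main__":
    for name, (perf, power) in OPERATING_POINTS.items():
        ratio = perf_per_power(perf, power)
        print(f"{name:>12}: {ratio:.2f} performance per unit of power")

    perf, power = dual_core_ideal(*OPERATING_POINTS["underclocked"])
    print(f"two underclocked cores (ideal): {perf:.2f}x performance "
          f"at {power:.2f}x power")
```

Under that idealized assumption, two underclocked cores would deliver up to 1.74 times the baseline performance for about the same power as the single core at its normal frequency, which is the whole appeal of the multicore approach.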
From multicore to 3D
Unfortunately, multicore processors have their issues too. The cores can't be placed very close together without showing heavy interference or appreciably slowing down operations, meaning that the tiny wires that connect them to each other have to be relatively long. This poses two serious problems: first, more silicon real estate is required, which makes costs skyrocket; and second, the time it takes for the electrical signals to propagate correctly is proportional to the square of the wire length, meaning such long wires have a heavy impact on intercommunication speeds.
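To get a feel for what a square-law relationship means in practice, here is a minimal Python sketch. It simply takes the proportionality stated above at face value (delay grows with the square of wire length, as it does for a plain wire without signal repeaters); the function name and the length ratios are illustrative assumptions, not figures from IBM or its partners.

```python
def relative_wire_delay(length_ratio):
    """Signal delay relative to a reference wire.

    ASSUMPTION: delay is proportional to the square of the wire length,
    as stated in the text. `length_ratio` is new length / reference length.
    """
    return length_ratio ** 2


if __name__ == "__main__":
    # Hypothetical length ratios, chosen only to illustrate the trend.
    for ratio in (2.0, 1.0, 0.5, 0.1):
        delay = relative_wire_delay(ratio)
        print(f"wire at {ratio:.1f}x the reference length -> "
              f"{delay:.2f}x the reference delay")
```

Under this assumption, halving a wire cuts its delay to a quarter, while doubling it quadruples the delay, which is why shortening the connections between cores pays off so handsomely.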
An idea that starts to make sense is to simply stack integrated circuits one over the other vertically. The components can now be placed closer together, and interconnecting buses can not only be much shorter — there can be more of them too, and without seriously impacting costs.
Using this technique, IBM and its partners on the CMOSAIC project say they can achieve 100 to 10,000 connections per square millimeter, ensuring data transfer rates ten times higher than ever before.
Surprisingly, even with the cores packed so tightly together, three-dimensional chips actually produce less heat and consume less power, mainly because keeping a signal on-chip rather than sending it over a wire takes less energy. But looking at how much power computers draw overall, it is becoming very clear that even more drastic measures are needed.
"In the United States, the industry's data centers already consume as much as 2% of available electricity. As consumption doubles over a five-year period, the supercomputers of 2100 would theoretically use up the whole of the USA's electrical supply," explained John R. Thome of the EPFL team.
For this reason, the researchers went a step further and devised a completely new cooling system for their 3D chips. In it, channels with a 50 micron diameter (about the diameter of a human hair) are inserted between each core layer. These microchannels contain a cooling liquid, which exits the circuit in the form of vapor, is brought back to the liquid state by a condenser and is then finally pumped back into the processor. Next year, a prototype of this cooling system will be implemented and tested under actual operating conditions.
The main challenge posed by the 3D chip approach is a direct consequence of the possibilities it opens up: now that interconnecting cores is cheap and easy, engineers need to find the best way to optimize the design in (literally) a whole new dimension, and this means better and more capable CAD programs need to rise to this difficult task. Then there's also the question of complexity: chips this sophisticated need to be manufactured with absolute precision and without defects, or they could suddenly and inexplicably fail without giving anyone a chance to find out why.
For these and other reasons, the researchers estimate it will take a few years before 3D microchips find their way into consumer electronics. The initial 3D microprocessors, the researchers say, should be fitted on supercomputers by 2015, while the version with an integrated cooling system should reach the consumer market around 2020.