If you thought Intel's plans to put eight cores in its high-end processors were ambitious, the latest processor from semiconductor start-up Tilera goes much further. Packing 100 cores clocked at 1.25GHz to 1.5GHz onto a single chip, the Gx100 takes parallel processing to an extreme thanks to a new architecture that minimizes the bus bottleneck found in today's multi-core processors.

Chip makers are constantly pushing for faster processors, but clock speeds can only be pushed so far. As a result, the semiconductor industry has opted to pack multiple processor cores onto a single chip and divide the workload among them in equal parts whenever possible. That is the gist of parallel computing.
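The idea of dividing a workload in equal parts can be illustrated with a minimal Python sketch. The function names `split_evenly` and `parallel_sum` are hypothetical and chosen only for illustration, and the chunks are summed one after another here; on real hardware each chunk would be handed to a separate core.

```python
def split_evenly(items, workers):
    """Divide items into `workers` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), workers)
    chunks = []
    start = 0
    for w in range(workers):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if w < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def parallel_sum(items, workers):
    """Sum each chunk independently, then combine the partial results."""
    # Each partial sum depends only on its own chunk, so the chunks
    # could be processed on different cores at the same time.
    partials = [sum(chunk) for chunk in split_evenly(items, workers)]
    return sum(partials)
```

With 100 workers, a list of 1,000 numbers is cut into 100 chunks of 10 items each. The approach only pays off when the pieces do not depend on one another, which is why the article says "whenever possible."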

Graphics processing units are a common example of a "multi-core" chip that can process hundreds of independent data streams in parallel: each stream is processed separately and the result is then output on the screen. Programmers are starting to harness the parallel data processing capabilities of GPUs, and by doing so they can often speed up their data crunching by tens or even hundreds of times.

However, GPUs, especially older models, have limited flexibility and are built for speed, not precision (if a single pixel is slightly off color, people usually won't notice). Even though GPU makers are developing architectures better suited to data crunching, GPUs remain far from general-purpose processors and can speed up only certain tasks.

This is where the Tile Gx100 comes in. The Gx100, as Tilera chief technical officer Anant Agarwal explained, is a chip that can run off-the-shelf programs almost unmodified, offering at least four times the compute performance of an Intel Nehalem-EX while burning a third of the power. In other words, it makes GPU-like massively parallel processing available on a general-purpose chip.

Because the chip is general-purpose, programmers can recompile and run applications designed for Intel's x86 architecture on Tilera's processor without the need for further adaptation.

The key idea behind the chip's design was a simplified architecture that eliminates the on-chip bus interconnect through which information must flow in most multi-core CPUs. The bus was replaced with a mesh network, in which each core has its own communication switch and the cores are arranged in a grid.
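To make the mesh idea concrete, here is a minimal Python sketch of how a message might hop from switch to switch across a grid of cores. The 10-by-10 layout, the core numbering, and the row-first routing rule are assumptions made for illustration; the article does not describe how Tilera's switches actually route traffic.

```python
def core_position(core_id, width):
    """Map a core number to (row, col) on a grid that is `width` cores wide."""
    return divmod(core_id, width)


def xy_route(src, dst, width):
    """List the cores a message visits, moving along the row first, then the column.

    This routing rule is an illustrative assumption, not Tilera's documented scheme.
    """
    row, col = core_position(src, width)
    dst_row, dst_col = core_position(dst, width)
    path = [src]
    # Step horizontally until the destination column is reached.
    while col != dst_col:
        col += 1 if dst_col > col else -1
        path.append(row * width + col)
    # Then step vertically until the destination row is reached.
    while row != dst_row:
        row += 1 if dst_row > row else -1
        path.append(row * width + col)
    return path
```

On this assumed 10-by-10 grid, a message from core 0 to core 99 crosses 18 links, and it uses only the switches along its own path, leaving the rest of the grid free for other traffic.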

The bus architecture was a performance bottleneck that limited the amount of data that could travel between the cores and forced engineers to limit the number of cores on each chip. With the mesh design, Tilera says it can cram as many as 100 cores onto a processor without running into the bus-bandwidth limit.
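A rough way to see why a shared bus limits the core count is to compare how each design scales. The toy model below is an assumption for illustration only: it treats the bus as a single channel split evenly among all cores and simply counts the neighbor-to-neighbor links in a grid. It is not based on Tilera's published figures.

```python
def bus_share(cores):
    """Fraction of a single shared bus available to each core (toy model)."""
    return 1.0 / cores


def mesh_links(width, height):
    """Number of neighbor-to-neighbor links in a width x height grid of cores."""
    horizontal = height * (width - 1)  # links within each row
    vertical = width * (height - 1)    # links within each column
    return horizontal + vertical
```

In this model, each of 100 cores on one bus gets 1 percent of its capacity, while a 10-by-10 grid has 180 separate links that can carry different transfers at once. Adding cores shrinks every core's share of a bus but adds links to a mesh.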

The 100-core processor, fabricated using 40-nanometer technology, is expected to be available early next year, but it won't be supported by Windows 7. For that, consumers will have to wait for Intel's 80-core version, which the IT giant has promised to deliver sometime during the next five years.