While AI systems amaze and alarm the world in equal measure, they’re about to get even more powerful. Nvidia has announced a new class of supercomputer that will train the next generation of AI models, and put us all out of work far faster.
The new system is known as the Nvidia DGX GH200, and the company says it will be capable of a massive 1 exaflop of AI performance. Across the 256 GH200 "superchips" it's built from, the system packs an astonishing 144 TB of shared memory, nearly 500 times more than Nvidia's previous-generation DGX A100, unveiled just three years ago.
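For a rough sense of where that 144 TB comes from, here's a back-of-envelope check. The per-chip figures below are assumptions, not numbers from this announcement: each GH200 is generally described as pairing about 480 GB of LPDDR5X on the CPU side with 96 GB of HBM3 on the GPU, and the DGX A100 comparison assumes the 320 GB configuration.

```python
# Back-of-envelope check of the DGX GH200 memory figure.
# Per-superchip memory sizes are assumptions, not from the announcement.
superchips = 256
cpu_mem_gb = 480   # Grace LPDDR5X per superchip (assumed)
gpu_mem_gb = 96    # H100 HBM3 per superchip (assumed)

total_gb = superchips * (cpu_mem_gb + gpu_mem_gb)
total_tb = total_gb / 1024                      # binary TB, as Nvidia appears to count it

dgx_a100_gb = 320                               # assumed 8 x 40 GB A100 configuration
print(f"DGX GH200 shared memory: {total_tb:.0f} TB")          # ~144 TB
print(f"vs DGX A100: {total_gb / dgx_a100_gb:.0f}x more")      # ~461x, in the ballpark of "nearly 500x"
```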
To wring out every last drop of power, each GH200 superchip combines the company's Grace CPU and H100 Tensor Core GPU in a single package, letting the two communicate seven times faster than over a PCIe connection while using just one-fifth of the interconnect power. All 256 superchips are then linked through the Nvidia NVLink Switch System so they can function as one giant GPU.
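That "seven times faster" claim lines up with commonly quoted interconnect bandwidths, though the exact figures below are assumptions rather than numbers from this announcement: roughly 900 GB/s for the NVLink-C2C link between Grace and Hopper, versus about 128 GB/s for a PCIe Gen5 x16 link.

```python
# Rough check of the "seven times faster than PCIe" claim.
# Both bandwidth figures are assumptions, not from the announcement.
nvlink_c2c_gbs = 900      # GB/s, assumed NVLink-C2C chip-to-chip bandwidth
pcie_gen5_x16_gbs = 128   # GB/s, assumed PCIe 5.0 x16 bandwidth

print(f"NVLink-C2C vs PCIe Gen5 x16: ~{nvlink_c2c_gbs / pcie_gen5_x16_gbs:.1f}x")  # ~7.0x
```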
The resulting supercomputer will be used to train the successors to ChatGPT and other generative AI and large language models. That most famous of AI systems was trained on a custom supercomputer that Microsoft built out of tens of thousands of Nvidia's earlier A100 GPUs. Microsoft is once again among the first in line for the new gear, along with Meta and Google Cloud.
Nvidia isn’t just supplying other companies with equipment though – it’s also announced plans to build its own DGX GH200-based supercomputer named Helios. Expected to fire up by the end of 2023, Helios will be made up of four DGX GH200 systems, or 1,024 GH200 superchips, networked together. That would make it capable of a total of 4 exaflops of performance, which sounds like an eye-watering amount of power.
But of course, there's a caveat to those numbers. The most powerful supercomputer in the world right now is the US DOE's Frontier, at 1.194 exaflops, so at a glance it may sound like Helios will be roughly four times more powerful. That's an apples-to-oranges comparison, though: Nvidia's figure is based on FP8, a low-precision format used for AI workloads, while supercomputers are generally ranked on double-precision FP64. Measured at FP64, Helios would manage only about 36 petaflops, or 0.036 exaflops.
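To see how 4 "exaflops" shrinks to a few dozen petaflops, here's a rough conversion. The per-GPU peak figures are assumptions drawn from commonly cited H100 specs, so treat this as a ballpark sketch rather than an official benchmark: roughly 3,958 teraflops at FP8 (with sparsity) and about 34 teraflops at FP64.

```python
# Why 4 "exaflops" shrinks to tens of petaflops: the figures use different precisions.
# Per-GPU peak numbers are assumed H100 specs, not from the announcement.
gpus = 1024                  # 4 DGX GH200 systems x 256 GH200 superchips
h100_fp8_tflops = 3958       # assumed FP8 Tensor Core peak, sparsity enabled
h100_fp64_tflops = 34        # assumed FP64 peak

helios_fp8_eflops = gpus * h100_fp8_tflops / 1e6
helios_fp64_pflops = gpus * h100_fp64_tflops / 1e3

print(f"Helios at FP8:  ~{helios_fp8_eflops:.2f} exaflops")    # ~4.05
print(f"Helios at FP64: ~{helios_fp64_pflops:.0f} petaflops")  # ~35, vs Frontier's 1,194
```

Under those assumptions the FP64 total lands in the mid-30s of petaflops, consistent with the roughly 36-petaflop figure above.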
That said, Helios and the DGX GH200 systems it's built from are still incredibly powerful tools, and they'll be able to churn out AI models within weeks instead of months, Nvidia says. We mere humans had better polish up our résumés.