Jensen Huang wants to bring generative AI to every data center, the Nvidia co-founder and CEO said during Computex in Taipei today. During the keynote, which Huang said was his first public speech in almost four years, he made a slew of announcements, including chip release dates, the company’s DGX GH200 supercomputer and partnerships with major companies. Here’s all the news from the two-hour-long keynote.
1. Nvidia’s GeForce RTX 4080 Ti GPU for gamers is now in full production and being produced in “large quantities” with partners in Taiwan.
2. Huang announced the Nvidia Avatar Cloud Engine (ACE) for Games, a customizable AI model foundry service with pre-trained models for game developers. It will give NPCs more character through AI-powered language interactions.
3. Nvidia’s CUDA computing model now serves four million developers and more than 3,000 applications. CUDA has seen 40 million downloads, including 25 million in the last year alone.
4. The HGX H100 GPU server is now in full volume production and is being manufactured by “companies all over Taiwan,” Huang said. He added that it is the world’s first computer with a transformer engine in it.
5. Huang referred to Nvidia’s 2019 acquisition of supercomputer chipmaker Mellanox for $6.9 billion as “one of the greatest strategic decisions” the company has ever made.
6. Production of the next generation of Hopper GPUs will start in August 2024, exactly two years after the first generation began manufacturing.
7. Nvidia’s GH200 Grace Hopper is now in full production. The superchip boasts 4 petaflops of Transformer Engine performance, 72 Arm CPU cores connected by a chip-to-chip link, 96GB of HBM3 and 576GB of GPU memory. Huang described it as the world’s first accelerated computing processor that also has a giant memory: “this is a computer, not a chip.” It is designed for high-resilience data center applications.
8. If the Grace Hopper’s memory is not enough, Nvidia has the solution: the DGX GH200. It’s made by first connecting eight Grace Hoppers together with three NVLink Switches, with the links in each pod running at 900GB per second. Then 32 of these pods are joined together, with another layer of switches, to connect a total of 256 Grace Hopper chips. The resulting system delivers one exaflops of Transformer Engine performance, has 144TB of GPU memory and functions as a giant GPU. Huang said the Grace Hopper is so fast it can run the 5G stack in software. Google Cloud, Meta and Microsoft will be the first companies to have access to the DGX GH200 and will perform research into its capabilities.
9. Nvidia and SoftBank have entered into a partnership to introduce the Grace Hopper superchip into SoftBank’s new distributed data centers in Japan. The data centers will be able to host generative AI and wireless applications on a multi-tenant common server platform, reducing costs and energy use.
10. The SoftBank-Nvidia partnership will be based on Nvidia’s MGX reference architecture, which is currently being used in partnership with companies in Taiwan. It gives system manufacturers a modular reference architecture to help them build more than 100 server variations for AI, accelerated computing and Omniverse uses. Companies in the partnership include ASRock Rack, Asus, Gigabyte, Pegatron, QCT and Supermicro.
11. Huang announced the Spectrum-X accelerated networking platform to increase the speed of Ethernet-based clouds. It includes the Spectrum-4 switch, which has 128 ports of 400Gb per second each, for a total of 51.2Tb per second. The switch is designed to enable a new type of Ethernet, Huang said, and was designed end-to-end to do adaptive routing, isolate performance and do in-fabric computing. The platform also includes the BlueField-3 SmartNIC, which connects to the Spectrum-4 switch to perform congestion control.
12. WPP, the largest ad agency in the world, has partnered with Nvidia to develop a content engine based on Nvidia Omniverse. It will be capable of producing photos and video content to be used in advertising.
13. Robot platform Nvidia Isaac AMR is now available for anyone who wants to build robots, and is full-stack, from chips to sensors. Isaac AMR starts with a chip called Nova Orin and is the first robotics full-reference stack, said Huang.
Thanks in large part to its importance in AI computing, Nvidia’s stock has soared over the past year, and the company currently has a market valuation of about $960 billion, making it one of the most valuable companies in the world (only Apple, Microsoft, Saudi Aramco, Alphabet and Amazon are ranked higher).
China business in limbo
China’s AI firms are no doubt closely watching the state-of-the-art silicon Nvidia is bringing to the table. Meanwhile, they probably dread another round of U.S. chip bans that threaten to undermine their advancement in generative AI, which requires significantly more computing power and data than previous generations of AI.
The U.S. government last year restricted Nvidia from selling its A100 and H100 graphics processing units to China. Both chips are used for training large language models like OpenAI’s GPT-4. The H100, Nvidia’s latest-generation chip based on the Hopper GPU computing architecture with its built-in Transformer Engine, is seeing particularly strong demand. Compared to the A100, the H100 is able to offer 9x faster AI training and up to 30x faster AI inference on LLMs.
China is obviously too big a market to miss. The chip export ban was estimated to cost Nvidia $400 million in potential sales in the third quarter of last year alone. Nvidia thus resorted to selling China a slower chip that meets U.S. export control rules. But in the long term, China will probably look for more robust alternatives, and the ban serves as a pointed reminder for China to achieve self-reliance in key tech sectors.
As Huang recently said in an interview with the Financial Times: “If [China] can’t buy from … the United States, they’ll just build it themselves. So the US has to be careful. China is a very important market for the technology industry.”