The era of ever-larger artificial intelligence models is coming to an end, according to OpenAI CEO Sam Altman, as cost constraints and diminishing returns curb the relentless scaling that has defined progress in the field.
Speaking at an MIT event last week, Altman suggested that further progress would not come from “giant, giant models.” According to a recent Wired report, he said, “I think we’re at the end of the era where it’s going to be these, like, giant, giant models. We’ll make them better in other ways.”
Though Altman did not cite it directly, one major driver of the pivot away from “scaling is all you need” is the exorbitant and unsustainable expense of the powerful graphics processing units (GPUs) needed to train and run large language models (LLMs). ChatGPT, for instance, reportedly required more than 10,000 GPUs to train, and demands even more resources to operate continually.
Nvidia dominates the GPU market, with about 88% market share, according to Jon Peddie Research. Nvidia’s latest H100 GPUs, designed specifically for AI and high-performance computing (HPC), can cost as much as $30,603 per unit, and even more on eBay.
Training a state-of-the-art LLM can require hundreds of millions of dollars’ worth of computing, said Ronen Dar, co-founder and chief technology officer of Run:ai, a compute orchestration platform that speeds up data science initiatives by pooling GPUs.
As costs have skyrocketed while benefits have leveled off, the economics of scale have turned against ever-larger models. Progress will instead come from improving model architectures, enhancing data efficiency, and advancing algorithmic techniques, rather than from simply scaling up. The era of unlimited data, computing and model size that remade AI over the past decade is finally drawing to a close.
‘Everyone and their dog is buying GPUs’
In a recent Twitter Spaces interview, Elon Musk confirmed that his companies Tesla and Twitter were buying thousands of GPUs to develop a new AI company that is now officially called X.ai.
“It seems like everyone and their dog is buying GPUs at this point,” Musk said. “Twitter and Tesla are certainly buying GPUs.”
Dar pointed out those GPUs may not be available on demand, however. Even for the hyperscaler cloud providers like Microsoft, Google and Amazon, it can sometimes take months — so companies are actually reserving access to GPUs. “Elon Musk will have to wait to get his 10,000 GPUs,” he said.
VentureBeat reached out to Nvidia for a comment on Elon Musk’s latest GPU purchase, but did not get a reply.
Not just about the GPUs
Not everyone agrees that a GPU crisis is at the heart of Altman’s comments. “I think it’s actually rooted in a technical observation over the past year that we may have made models larger than necessary,” said Aidan Gomez, co-founder and CEO of Cohere, which competes with OpenAI in the LLM space.
A TechCrunch article on the MIT event reported that Altman sees size as a “false measurement of model quality.”
“I think there’s been way too much focus on parameter count, maybe parameter count will trend up for sure. But this reminds me a lot of the gigahertz race in chips in the 1990s and 2000s, where everybody was trying to point to a big number,” Altman said.
Still, the fact that Elon Musk just bought 10,000 data center-grade GPUs means that, for now, access to GPUs is everything. And since that access is so expensive and hard to come by, that is certainly a crisis for all but the most deep-pocketed of AI-focused companies. And even OpenAI’s pockets only go so deep. Even they, it turns out, may ultimately have to look in a new direction.