The number of GPU startups in China is extraordinary as the country pursues AI prowess as well as semiconductor sovereignty, according to a new report from Jon Peddie Research. The number of GPU makers has also grown worldwide in recent years as demand for artificial intelligence (AI), high-performance computing (HPC), and graphics processing increased at an unprecedented rate. When it comes to discrete graphics for PCs, AMD and Nvidia maintain their lead, while Intel is trying to catch up.
18 GPU Developers
Dozens of companies developed graphics cards and discrete graphics processors in the 1980s and 1990s, but cut-throat competition for the highest performance in 3D games drove the vast majority of them out of business. By 2010, only AMD and Nvidia could offer competitive standalone GPUs for gaming and compute, whereas the rest focused either on integrated GPUs or GPU IP.
In the mid-2010s, the number of China-based PC GPU developers began to grow rapidly, fueled by the country’s push for tech self-sufficiency as well as the rise of AI and HPC as high-tech megatrends.
In total, there are 18 companies developing and producing GPUs, according to Jon Peddie Research. Two of them develop SoC-bound GPUs primarily with smartphones and notebooks in mind, six are GPU IP providers, and 11 develop GPUs for PCs and datacenters, including AMD, Intel, and Nvidia, which design the graphics cards that end up on our list of the best graphics cards.
In fact, if we added other China-based companies like Biren Technology and Tianshu Zhixin to the list, there would be even more GPU designers. However, Biren and Tianshu Zhixin are solely focused on AI and HPC for now, so JPR does not consider them GPU developers.
| PC | Datacenter | GPU IP | SoC |
| --- | --- | --- | --- |
| AMD | Biren | Arm | Apple |
| Bolt | Tianshu Zhixin | DMP | Qualcomm |
| Innosilicon | | Imagination Technologies | |
| Intel | | Think Silicon | |
| Jingjia | | VeriSilicon | |
| MetaX | | Xi-Silicon | |
| Moore Threads | | | |
| Nvidia | | | |
| SiArt | | | |
| Xiangdixian | | | |
| Zhaoxin | | | |
China Wants GPUs
As the world’s second-largest economy, China inevitably competes with the U.S. and other developed countries in pretty much everything, including technology. China has done a lot to lure engineers from around the world and to make it worthwhile to establish chip design startups in the country. In fact, hundreds of new IC design houses emerge in China every year. They develop everything from tiny sensors to complicated communication chips, advancing the country’s self-sufficiency and reducing its dependence on Western suppliers.
But to really jump on the AI and HPC bandwagon, China needs CPUs, GPUs, and special-purpose accelerators. When it comes to computing, Chinese companies will not overtake the long-time CPU and GPU market leaders any time soon. Yet it is arguably easier, and perhaps more fruitful, to develop and produce a decent GPU than to try to build a competitive CPU.
“AI training was the big motivator [for Chinese GPU companies], and avoidance of Nvidia’s high prices, and (maybe mostly) China’s desire for self-sufficiency,” said Jon Peddie, the head of JPR.
GPUs are inherently parallel: they contain loads of identical compute units, some of which can serve as redundancy, which makes it easier to get a GPU up and running (assuming that per-transistor costs are relatively low and overall yields are decent). Also, because GPUs are fundamentally parallel, it is easier to scale them out across multiple chips. Keeping in mind that China-based SMIC does not have production nodes as advanced as TSMC’s, this way of scaling performance looks good enough. In fact, even if Chinese GPU developers lose access to TSMC’s advanced nodes (N7 and below), at least some of them could still produce simpler GPU designs at SMIC and address the AI/HPC and/or gaming/entertainment markets.
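As a rough illustration of that scale-out idea, here is a minimal sketch in Python (assuming PyTorch; the helper `scale_out_matmul` is a hypothetical name of ours, not anything from JPR or a vendor) that splits an embarrassingly parallel matrix multiply across however many GPUs happen to be present, instead of demanding one large cutting-edge die:

```python
# Minimal scale-out sketch: partition work across N modest GPUs.
# Assumes PyTorch; falls back to the CPU if no CUDA device is found.
import torch

def scale_out_matmul(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    n_devices = max(torch.cuda.device_count(), 1)
    # Split the rows of `a` into one chunk per device; each chunk is
    # independent, so slower individual GPUs can be offset by using more of them.
    chunks = a.chunk(n_devices, dim=0)
    partials = []
    for i, chunk in enumerate(chunks):
        dev = f"cuda:{i}" if torch.cuda.is_available() else "cpu"
        partials.append(chunk.to(dev) @ b.to(dev))
    # Gather the partial results back on the host.
    return torch.cat([p.cpu() for p in partials], dim=0)

if __name__ == "__main__":
    a = torch.randn(4096, 1024)
    b = torch.randn(1024, 512)
    print(scale_out_matmul(a, b).shape)  # torch.Size([4096, 512])
```

The point is not the code itself but the shape of the workload: because each chunk is independent, a vendor limited to an older node can compensate with more, simpler chips rather than one bleeding-edge die.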
From China’s perspective as a country, AI- and HPC-capable GPUs are arguably even more important than CPUs, since AI and HPC can enable all-new applications, such as autonomous vehicles and smart cities, as well as advanced conventional arms. The U.S. government, of course, restricts exports of supercomputer-bound CPUs and GPUs to China in a bid to slow down or even constrain the development of advanced weapons of mass destruction. But a fairly sophisticated AI-capable GPU can enable an autonomous killer drone, for instance, and drone swarms represent a formidable force.
GPU Microarchitecture Is Relatively Easy, Hardware Design Is Expensive
Meanwhile, it should be noted that while there are a bunch of GPU developers, only two can actually build competitive discrete GPUs for PCs. That is perhaps because it is relatively easy to develop a GPU architecture, but truly hard to implement it properly and to write proper drivers.
CPU and GPU microarchitectures sit at the intersection of science and art. They are sets of sophisticated algorithms that can be devised by rather small groups of engineers, yet they might take years to complete, says Peddie.
“[Microarchitectures] get done on napkins and white boards,” said Peddie. “[As for costs] if it is just the architects themselves, that [team] can be as low as one person to maybe three – four. [But] architecture of any type, buildings, rocket ships, networks or processors is a complicated chess game. Trying to anticipate where the manufacturing process and standards will be five years away, where the cost-performance tradeoffs are, what features to add and what to drop or ignore is very tricky and time-consuming work. […] The architects spend a lot of time in their head running what-if scenarios — what if we made the cache 25% bigger, what if we had 6,000 FPUs, should we do a PCIe 5.0 I/O will it be out in time.”
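To make those what-if scenarios concrete, here is a toy back-of-envelope model in Python (every constant and the stall heuristic are illustrative assumptions of ours, not figures from Peddie or JPR) of the kind of trade-off an architect might run in their head: a bigger cache versus more FPUs:

```python
# Toy what-if model: crude effective-throughput estimate for a GPU config.
# All numbers are made up for illustration; real architects use
# cycle-accurate simulators, not three-line formulas.
def effective_tflops(fpus: int, clock_ghz: float, cache_hit_rate: float) -> float:
    peak = fpus * 2 * clock_ghz / 1000.0           # 2 FLOPs/cycle per FPU, in TFLOPS
    stall_fraction = (1.0 - cache_hit_rate) * 0.5  # crude: each miss wastes half a cycle's work
    return peak * (1.0 - stall_fraction)

baseline     = effective_tflops(fpus=4096, clock_ghz=2.0, cache_hit_rate=0.80)
bigger_cache = effective_tflops(fpus=4096, clock_ghz=2.0, cache_hit_rate=0.88)  # "cache 25% bigger"
more_fpus    = effective_tflops(fpus=6000, clock_ghz=2.0, cache_hit_rate=0.80)  # "6,000 FPUs"

for name, tflops in (("baseline", baseline), ("bigger cache", bigger_cache), ("6,000 FPUs", more_fpus)):
    print(f"{name:>12}: {tflops:6.2f} effective TFLOPS")
```

A real architect would feed such questions into simulators and cost models; the sketch only shows why those choices have to be weighed against each other years before silicon exists.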
Since microarchitectures can take years to develop and require talented designers, in a world where time-to-market is everything, many companies license an off-the-shelf microarchitecture or even silicon-proven GPU IP from companies like Arm or Imagination Technologies. For example, Innosilicon, a contract developer of chips and physical IP, licenses GPU microarchitecture IP from Imagination for its Fantasy GPUs. Another China-based GPU developer uses a PowerVR architecture from Imagination. Meanwhile, Zhaoxin uses a heavily iterated GPU microarchitecture it acquired from Via Technologies, which inherited it from S3 Graphics.
The cost of developing a microarchitecture varies, but it is relatively low compared to the cost of physically implementing a modern high-end GPU.
For years, Apple and Intel, both companies with plenty of engineering talent, relied on Imagination for their GPU designs (Apple still does to a certain extent). MediaTek and other smaller SoC suppliers rely on Arm. Qualcomm used ATI/AMD technology for an extended period, and Samsung turned to AMD after several years of trying to design its own graphics engine.
Two of the new Chinese companies have hired ex-AMD and ex-Nvidia architects to start their GPU efforts, and another two use Imagination’s IP. Learning the skills of being an architect, what to worry about, and how to find a fix, is a very time-consuming process, and time to market is everything.
“If you can go to a company that already has a design and have been designing for a long time, you can save a boatload of time and money – and time to market is everything,” said the head of Jon Peddie Research. “There are just so many gotchas. Not every GPU designed by AMD or Nvidia has been a winner. [But] a good design lasts a couple of generations with tweaks.”
Hardware implementation and software development are prohibitively expensive on leading-edge production nodes. International Business Strategies estimates that the design costs for a fairly complex device made using 5nm-class technology exceed $540 million. Those costs will roughly triple at 3nm, putting them north of $1.6 billion.
“If you include layout and floor plan, simulation, verification, and drivers, then the [GPU developer] costs and time skyrocket,” explained Peddie. “The hardware design and layout is pretty straightforward: get one trace wrong and you can spend months tracking it down.”
There are just a few companies in the world that can develop a chip with the complexity of modern gaming or compute GPUs from AMD and Nvidia (46 billion to 80 billion transistors), yet China-based Biren could do something similar with its BR104 and BR100 devices (we speculate that the BR104 packs some 38.5 billion transistors).
Thoughts
Despite prohibitive costs, eight of the 11 PC/datacenter GPU designers are from China, which speaks for itself. Perhaps we won’t see a competitive discrete gaming GPU from anyone except the huge American companies in the near future. That’s partly because it’s hard and time-consuming to develop a GPU, and largely because these high-complexity GPUs require a prohibitively expensive hardware implementation. Whether or not China can field competitive entrants remains to be seen, but any failure won’t stem from a lack of trying.