After Nvidia acquired 3dfx in 2000 and AMD acquired ATI in 2006, the desktop GPU market had largely settled.
Nvidia is the undisputed giant of the GPU market and AMD's GPU business is struggling, while Intel, trading on the strength of its CPUs, holds an unassailable position in integrated graphics. In discrete graphics, however, both the early Intel 740 and the later Larrabee quietly died off.
But after Alex Krizhevsky used Nvidia GPUs to train the deep convolutional neural network AlexNet, which dramatically improved the state of the art in image classification and recognition, a new era of artificial intelligence officially began. From that point on, the GPU market entered a new stage, and Nvidia became the era's undisputed winner.
Nvidia’s stock price trend from 2012 to the present
Nvidia’s two great weapons in the AI era
Looking back at the history of graphics processors: MIT's Whirlwind, built in 1951, may have been the world's first 3D graphics system, but it is not the ancestor of modern GPUs. The prototypes of today's GPUs are generally traced to the so-called video shifters and video address generators of the mid-1970s.
After evolving through mainframes and workstations, graphics processors flourished with PC 3D games in the mid-to-late 1990s. Many companies poured into the field during this period, Nvidia among them. According to Nvidia's official website, when the company was founded in 1993 there were more than 20 graphics chip companies in the world; by 1997 that number had soared to 70. Yet by 2006 Nvidia was the only independent company still standing, the final winner. Among the waves washed up on the beach were competitors such as ATI, S3 Graphics, and 3dfx.
Like the other players, Nvidia focused solely on the graphics card market when it was first established, and its initial two products, NV1 and NV2, received a mediocre market response. But Nvidia was not discouraged, and poured substantial effort into developing NV3, launched in 1997. As the world's first 128-bit 3D processor, NV3 shipped over one million units within four months of launch. Because NV3 supported OpenGL well, Nvidia gradually defeated 3dfx, which held 85% of the market at the time, and became the dominant player in the graphics card market.
It is worth mentioning that Nvidia claims to have invented the GPU in 1999 (the term was coined by Nvidia; GPU is short for Graphics Processing Unit), and that the GeForce 256 launched that year was the world's first GPU.
If Nvidia had continued to focus only on the graphics market, the best it could have become was the next 3dfx. But Jensen Huang had greater ambitions: pushing the GPU into the general-purpose market. This is the now-familiar GPGPU.
According to an earlier report from Semiconductor Industry Observation: "Academia became interested in using GPUs for general-purpose computing (GPGPU) around 2000. At the time, CPUs were the main force in scientific computing, but they are designed to run general-purpose algorithms well, so much of their chip area is actually devoted to on-chip memory and branch-prediction control logic, leaving relatively few units for actual computation. GPU architectures, by contrast, have comparatively simple control logic, with most of the chip area devoted to computations such as rendering polygons. The academic community found that computations such as the matrix operations of scientific computing map easily onto the GPU's processing units, so very high computing performance can be achieved."
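The mapping the report describes can be illustrated with a minimal sketch (plain Python; the function names here are invented for illustration). Every output element of a matrix multiply depends only on one row of A and one column of B, so all of them can be computed independently, which is exactly the data parallelism a GPU's many simple cores exploit.

```python
def matmul_elementwise(A, B):
    """Matrix multiply expressed as independent per-element computations."""
    m, k = len(A), len(A[0])
    n = len(B[0])
    # On a GPU, each (i, j) pair below would be one independent thread:
    # no element's result depends on any other element's result.
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_elementwise(A, B))  # [[19, 22], [43, 50]]
```

A CPU optimizes the latency of one such computation; a GPU simply runs thousands of them at once.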
The report further pointed out that the main bottleneck of GPGPU at the time was usability. Because GPUs were developed for graphics applications, their programming models did not readily support general high-performance computing: a great deal of manual tuning and coding was required, raising the barrier to entry, and few people could use them proficiently.
To make the GPU general-purpose in both hardware and software, Nvidia launched the Tesla architecture in 2006. It abandoned the earlier practice of rendering with vector computing units, instead splitting each vector unit into multiple scalar rendering units. GPUs based on this architecture were therefore not only strong at rendering but also well suited to general-purpose computing.
In the same year, Nvidia launched CUDA. In the company's words, this was a revolutionary architecture for general-purpose GPU computing: CUDA would let scientists and researchers harness the parallel processing power of GPUs to tackle their most complex computing challenges.
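The shift CUDA introduced can be sketched without any GPU at all: instead of writing a loop over data, the programmer writes a scalar "kernel" that runs once per thread index, and a launch spawns one logical thread per element. The following is a plain-Python emulation of that model, not CUDA's actual API; the `launch` helper is a sequential stand-in for a parallel grid launch.

```python
def vector_add_kernel(i, a, b, out):
    # In real CUDA this body runs on one GPU thread, with i derived
    # from blockIdx/threadIdx; here the index is passed explicitly.
    out[i] = a[i] + b[i]

def launch(kernel, n, *args):
    # Sequential stand-in for launching a grid of n parallel threads.
    for i in range(n):
        kernel(i, *args)

a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * 4
launch(vector_add_kernel, 4, a, b, out)
print(out)  # [11.0, 22.0, 33.0, 44.0]
```

The point of the model is that the kernel body contains no loop over the data: parallelism lives in the launch, which is what lets the same code scale across thousands of GPU cores.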
It is precisely thanks to laying out these two directions that Nvidia has taken to the AI era like a duck to water.
According to industry experts, in today's cloud AI chip market, apart from Google's in-house TPU, most manufacturers use Nvidia GPUs for model training, giving Nvidia a commanding share of the cloud AI chip market. This has driven Nvidia's results to repeated record highs in recent years. CCID Consulting forecasts that China's cloud AI chip market will grow a cumulative 152% from 2019 to 2021, and McKinsey likewise predicts rapid growth in the training market over the next few years. For the next decade, the report suggests, this will be the world of Nvidia GPUs.
Seeing this demand and these forecasts, ASIC products such as Graphcore's IPU and Google's TPU have emerged abroad to challenge Nvidia in the training market, while Intel and AMD hope to take Nvidia on in GPUs themselves.
AMD and Intel are ready to move
In fact, around the time Nvidia moved into GPGPU, AMD had corresponding plans of its own. But unlike Nvidia, which spent years pushing the CUDA development environment, AMD put its eggs in the OpenCL basket. As a result, even though it released the ROCm platform in 2017 to provide deep learning support, that could not change its GPUs' fate of being marginalized in the AI era.
But AMD is not resigned to this. To compete with Nvidia, it launched the new CDNA architecture in March this year. According to reports, this is AMD's compute-focused GPU architecture for data centers and similar uses. AMD's goal for CDNA is simple and direct: build a family of large, powerful GPUs optimized for general-purpose computing and data center use.
According to reports, much of the new architecture's performance gain is aimed at machine learning: it executes smaller data types (such as INT4, INT8, and FP16) faster, and AMD explicitly mentioned tensor operations when introducing the architecture. In addition, the architecture scales flexibly via the Infinity Fabric interconnect, supports enhanced enterprise-grade RAS, security, and virtualization features, and delivers a better performance-per-watt ratio, thereby reducing enterprise TCO.
Based on this architecture, AMD released the new-generation Instinct MI100 compute card in the middle of this month. According to the published figures, it delivers up to 11.5 TFLOPS of peak FP64 throughput, making it the first GPU to break 10 TFLOPS in FP64, with performance up to three times that of the previous-generation MI50. Its peak FP32 throughput is 23.1 TFLOPS. On those figures, AMD's new accelerator beats Nvidia's A100 GPU in both categories.
The Instinct MI100 also supports AMD's new Matrix Core technology, which accelerates single- and mixed-precision matrix operations in FP32, FP16, bfloat16, INT8, and INT4, raising FP32 matrix performance to 46.1 TFLOPS.
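As a sanity check, the headline numbers above follow from simple peak-throughput arithmetic (peak FLOPS = execution units × FLOPs per cycle per unit × clock). The shader count and boost clock below are the commonly reported MI100 specifications, used here as assumptions rather than official derivations:

```python
# Commonly reported MI100 specs (assumptions): 120 compute units x 64
# stream processors, ~1.502 GHz boost clock.
STREAM_PROCESSORS = 120 * 64   # 7680
BOOST_CLOCK_HZ = 1.502e9

def peak_tflops(flops_per_cycle_per_unit):
    """Peak throughput in TFLOPS for a given per-unit issue rate."""
    return STREAM_PROCESSORS * flops_per_cycle_per_unit * BOOST_CLOCK_HZ / 1e12

fp32 = peak_tflops(2)         # one FMA = 2 FLOPs/cycle          -> ~23.1
fp64 = peak_tflops(1)         # FP64 runs at half the FP32 rate  -> ~11.5
fp32_matrix = peak_tflops(4)  # Matrix Cores double the FMA rate -> ~46.1
print(round(fp32, 1), round(fp64, 1), round(fp32_matrix, 1))  # 23.1 11.5 46.1
```

The same arithmetic applied to the A100's vanilla (non-tensor) rates yields the FP64 and FP32 figures the MI100 is being compared against.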
To compete better with Nvidia, AMD also noted that its open-source ROCm 4.0 developer software now includes an open-source compiler and unified support for OpenMP 5.0, HIP, PyTorch, and TensorFlow.
In addition to AMD, Intel has also increased investment in its GPUs in recent years, hoping to get a share of the AI market.
According to Intel, the company's Xe architecture GPUs will cover everything from integrated graphics to high-performance computing. Among them, the discrete GPU codenamed Ponte Vecchio is the company's design for HPC modeling, simulation, and AI training. Ponte Vecchio will be manufactured on Intel's 7-nanometer process and will be Intel's first Xe-based GPU optimized for HPC and AI workloads. So far, however, this new Intel product has yet to appear.
In addition, to make its portfolio of CPUs, GPUs, FPGAs, and ASICs easier for developers to program across markets including AI, Intel has also launched the ambitious oneAPI. From a developer's point of view this is a good plan, but it is also a very challenging one.
Chinese manufacturers accelerate entry
Today, as the importance of the GPU becomes ever more prominent, more and more domestic manufacturers are investing in this market. Besides Jingjia Micro, Zhaoxin, and Hangjin, which entered the market earlier, a number of new companies are also moving into the field; Biren, Muxi, Hexaflake, and Xintong are the best known among them.
First, Biren Technology. According to its official website, the company was founded in 2019. Its team combines core professionals and R&D personnel from the chip and cloud computing fields at home and abroad, with deep technical accumulation in GPU, DSA (domain-specific accelerator), and computer architecture, as well as distinctive industry insight.
In terms of products, Biren Technology is committed to developing original general-purpose computing systems, building efficient software and hardware platforms, and providing integrated solutions for intelligent computing. On its roadmap, Biren will first focus on general-purpose intelligent computing in the cloud, gradually catching up with existing solutions in AI training and inference, graphics rendering, and high-performance general-purpose computing, to achieve a breakthrough for domestic high-end general-purpose intelligent computing chips.
Next is Muxi, a company founded by former AMD executives. According to reports, Muxi Integrated Circuit was established in September 2020. Its core team comes from a world-class GPU chip company, with an average of more than 15 years of high-performance GPU design experience, including 5nm tape-out and 7nm mass-production experience. The company is committed to developing and producing safe, reliable, high-performance GPU chips with independent intellectual property, serving data centers, cloud gaming, artificial intelligence, and many other fields that demand high computing power, and filling the gap in domestically controlled high-performance GPU chips.
Hexaflake, established in 2019, is a high-tech startup dedicated to AI high-performance processor chips and full-stack software and hardware system solutions, and one of the few leading AI general-purpose processor companies able to keep pace with the international giants in this field. Its main founders and core team gather top senior experts from across China and the United States, with expertise spanning parallel computing, AI processor architecture, GPUs and other very-large-scale SoC chips, and processor system software. Having long worked in the core R&D departments of leading international companies, they have successfully developed a variety of chip and system products. The company's aim is to build a new generation of general-purpose AI processor chips together with their software and hardware ecosystem.
Xintong Semiconductor was established in 2018. In a media interview, the company said its GPUs target three application areas: embedded, office PCs, and cloud gaming. In addition, Innosilicon, which has licensed Imagination GPU IP, Zhaoxin, which inherited related GPU patents, and Loongson, a longtime maker of domestic CPUs, are also players in the GPU market.
Given the current state of domestic GPUs and the China-US trade situation, the GPU makers above include not only players eyeing the AI market but also entrepreneurs hoping to break through in the graphics GPU market.
But as industry experts told me, whether in graphics or general-purpose computing, software and the developer ecosystem matter even more for GPUs; only with those in place can a GPU truly be commercialized. When will one of the domestic manufacturers achieve a genuine breakthrough? That is worth watching.