The first year of PCIe 6.0: AI and HPC usher in new speeds

Posted Date: 2024-02-01

Electronic Enthusiast Network reports (Text/Zhou Kaiyang) In January 2022, PCI-SIG released the PCIe 6.0 specification, officially kicking off a major upgrade in interface bandwidth. In the two years since, the specification has been revised to versions 6.0.1 and 6.1, and PCIe 6.0 appears to be mature at the design level. Yet no actual PCIe 6.0 products have shipped during this period, and even PCIe 5.0 has reached consumer-grade products only in small volumes. Now, according to PCI-SIG's predictions, PCIe 6.0 hardware is finally due to be released in March of this year.

AI/ML, HPC, and cloud graphics workloads crave double the bandwidth

As an interface that originated in servers and PCs, PCIe has in recent years begun to spread into data-centric applications such as IoT, automotive, and medical electronics. Since PCIe 3.0, each generation of the specification has doubled the transfer rate. PCIe 6.0 raises the rate to 64GT/s, which brings x16 bandwidth to 128GB/s in each direction, or 256GB/s bidirectional.
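The doubling described above can be checked with a little arithmetic. The following is a minimal sketch, assuming 1 GT/s corresponds to 1 Gb/s of raw line rate per lane and ignoring encoding, FLIT, and protocol overhead, so real throughput will be somewhat lower. The 8GT/s figure for PCIe 3.0 and the function names are not from the article; the names are purely illustrative.

```python
# Raw per-lane rate of PCIe 3.0 in GT/s (assumed baseline; later
# generations are modeled as doubling this, as the article describes).
BASE_GEN = 3
BASE_RATE_GT_S = 8


def lane_rate_gt_s(gen: int) -> int:
    """Raw per-lane transfer rate for PCIe generation `gen` (3 or later)."""
    if gen < BASE_GEN:
        raise ValueError("this sketch only models PCIe 3.0 and later")
    return BASE_RATE_GT_S * 2 ** (gen - BASE_GEN)


def raw_bandwidth_gb_s(gen: int, lanes: int = 16) -> float:
    """Raw one-way bandwidth in GB/s, before any protocol overhead."""
    # One transfer is treated as one bit per lane; 8 bits per byte.
    return lane_rate_gt_s(gen) * lanes / 8


for gen in range(3, 7):
    one_way = raw_bandwidth_gb_s(gen)
    print(f"PCIe {gen}.0 x16: {lane_rate_gt_s(gen)} GT/s per lane, "
          f"{one_way:.0f} GB/s one-way, {2 * one_way:.0f} GB/s bidirectional")
```

Under these assumptions, an x16 link at 64GT/s works out to 128GB/s in each direction and 256GB/s when both directions are counted together.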

Such a rate may seem excessive for IoT applications, but for today's popular AI/ML applications it is a trump card for greatly improving computing efficiency. Generative AI applications such as ChatGPT are burdened by huge numbers of simultaneous requests. However high the H100's own bandwidth is, its computing efficiency is still limited under PCIe 5.0, and NVLink is required to break through that limit.

Today's data centers, and especially the AI servers in them, have become a bottomless pit for bandwidth. What's more, when it comes to training AI/ML models, speed wins: this is an ironclad rule of the market. More bandwidth means generative AI can respond faster under concurrent load, shortening text and image generation times and further improving the user experience. For most application developers who host or rent servers, it also means a significant reduction in operating costs.

HPC is no exception. In HPC servers that run for long periods, bandwidth translates into computing efficiency. Software used for quantum mechanics and molecular dynamics in particular requires high memory bandwidth. Without enough bandwidth, the only option is to partition nodes to reduce bandwidth contention between processes. The last major pain point comes from cloud gaming. For applications that combine many concurrent sessions with heavy graphics loads, bandwidth determines how many concurrent sessions a cloud gaming service can support and at what graphics quality.

Will AI/ML accelerate the rollout of PCIe 6.0?

Judging from the history of PCIe, a new specification has been released every 3 to 4 years on average. The exception is the move from 3.0 to 4.0, which took 7 years, in part because PCIe 3.0 was already an advanced standard for its time. After 2015, global data traffic entered a period of explosive growth. It then took two years to go from 4.0 to 5.0, and three years from 5.0 to 6.0. The PCIe 7.0 specification is expected to be officially released in 2025.

However, we have also found that products based on a new PCIe standard do not reach the market that quickly. Devices such as AI accelerators and SSDs are often the first to adopt it. Networking matters too: no matter how fast local processing is, network speed must keep up. Take 800G Ethernet, which has not yet been deployed, as an example. The one-way bandwidth it requires is 100GB/s, which falls within the bandwidth PCIe 6.0 can provide.
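The 800G Ethernet example can be worked through as a rough sizing check. This is a sketch under stated assumptions: raw line rates only (no encoding or protocol overhead), 1 GT/s treated as 1 Gb/s per lane, and a hypothetical list of common link widths. The function names are illustrative and not taken from any specification or library.

```python
# Commonly used PCIe link widths (assumption for this sketch).
COMMON_WIDTHS = (1, 2, 4, 8, 16)


def ethernet_gb_s(gbit_s: float) -> float:
    """Convert an Ethernet line rate in Gb/s to GB/s (one direction)."""
    return gbit_s / 8


def min_link_width(required_gb_s: float, lane_rate_gt_s: float):
    """Smallest common link width whose raw one-way bandwidth meets the need.

    Returns None if even the widest link in COMMON_WIDTHS is too slow.
    """
    per_lane_gb_s = lane_rate_gt_s / 8  # raw GB/s per lane, one direction
    for width in COMMON_WIDTHS:
        if width * per_lane_gb_s >= required_gb_s:
            return width
    return None


need = ethernet_gb_s(800)  # 800G Ethernet, one direction
print(f"800G Ethernet needs {need:.0f} GB/s one-way")
print("PCIe 6.0 (64 GT/s) width needed:", min_link_width(need, 64))
print("PCIe 5.0 (32 GT/s) width needed:", min_link_width(need, 32))
```

With these raw figures, a PCIe 6.0 x16 link covers the 100GB/s requirement, while a PCIe 5.0 x16 link at 64GB/s one-way does not.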

Therefore, even with the push from AI/ML, only a few PCIe 6.0 products are likely to launch this year, and deployment will remain concentrated in the data center, with AI servers coming first. This can also be seen in the moves of PCIe 6.0 controller IP and server CPU vendors.

Final thoughts

As one of the most advanced interface standards available today, PCIe 6.0 is bound to make design easier for the next round of innovation in computing and storage technology. But we should also keep a close eye on market demand. In the past, PCIe maintained its status as the mainstream interface by uniting hardware manufacturers to steadily drive market demand. That strategy still applies in the AI era.

