What GPU is the best choice for AI training in 2020?

I have dedicated a section of my website to the hardware basics of GPUs: lulaoshi.info/gpu/gpu-bas… . In its newer microarchitectures, Nvidia has added dedicated mixed-precision units called Tensor Cores to accelerate the matrix operations at the heart of deep learning, so a GPU with Tensor Cores is the best choice for AI training.
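Tensor Cores typically multiply FP16 inputs while accumulating in FP32, because pure FP16 arithmetic loses precision quickly. A minimal pure-Python sketch (using the standard library's half-precision `struct` format, not any Nvidia API) illustrates why the FP32 accumulator matters:

```python
import struct

def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (FP16) value."""
    return struct.unpack('e', struct.pack('e', x))[0]

# FP16 has an 11-bit significand, so integers above 2048 cannot all
# be represented: adding 1 to 2048 is lost entirely in FP16.
print(to_fp16(2048.0))        # 2048.0, exactly representable
print(to_fp16(2048.0 + 1.0))  # rounds back down to 2048.0

# Accumulating many small terms in FP16 stalls once the running sum
# grows large; an FP32 accumulator (as Tensor Cores use) would not.
acc = 0.0
for _ in range(4096):
    acc = to_fp16(acc + to_fp16(1.0))
print(acc)  # stuck at 2048.0 instead of reaching 4096.0
```

This is the numerical problem mixed precision solves: store and multiply in FP16 for speed and memory savings, accumulate in FP32 for correctness.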

I have a partnership with Didi Cloud. Friends without a GPU can purchase GPU/vGPU/machine learning products from Didi Cloud. Remember to enter the AI master code 1936 to enjoy a 10% discount. Pay-as-you-go GPU products can be more cost-effective than buying hardware yourself. See http://www.didiyun.com.

As we all know, today’s state-of-the-art deep learning models consume a great deal of GPU memory, and many GPUs that used to be adequate may now fall short. In February 2020, Lambda Labs published a cross-GPU benchmark article, lambdalabs.com/blog/choosi… , exploring which GPUs can train current models without running out of memory; these cards are better suited for PCs and small workstations. The article’s core conclusion is that memory size matters: GPU memory capacity is now the limiting factor for training many deep learning models.

Thanks to rapid advances in deep learning, the 12 gigabytes of memory that used to rule the world is no longer enough. As of February 2020, you need at least $2,500 for Nvidia’s latest Titan RTX just to train the industry’s best models, and it is hard to imagine what the end of the year will bring.
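As a rough back-of-envelope check (my own simplification, not a figure from the Lambda article), FP32 training with Adam keeps about four copies of every parameter in GPU memory: the weights, the gradients, and two optimizer moment buffers, i.e. roughly 16 bytes per parameter before counting activations:

```python
def training_mem_gb(n_params: int, bytes_per_param: int = 4,
                    copies: int = 4) -> float:
    """Rough FP32 + Adam footprint: weights, gradients, and two
    optimizer moment buffers -> 4 copies of 4 bytes per parameter,
    ignoring activations and framework overhead."""
    return n_params * bytes_per_param * copies / 1024**3

# A 340M-parameter model (BERT-large scale) already wants ~5 GB
# just for parameters and optimizer state:
print(f"{training_mem_gb(340_000_000):.1f} GB")  # -> 5.1 GB
```

Add activation memory, which grows with batch size and sequence length, and it becomes clear why an 11 GB card struggles with recent models.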

consumer

For individual users, Nvidia’s consumer GeForce series is preferred. The more economical options are:

  • GeForce RTX 2080 Ti: $1,200, 11 GB memory, Turing microarchitecture (supports Tensor Cores)
  • Titan RTX: $2,500, 24 GB memory, Turing microarchitecture (supports Tensor Cores)

It is important to note that these consumer cards have poor support for multi-card setups: by default, they do not support direct communication between cards. If card 1 and card 2 need to exchange data, the data must be copied from card 1’s memory into main memory over the PCI-E bus, then copied from main memory into card 2’s memory over PCI-E again. This is clearly a waste of time and bad for communication between multiple cards. The 2080 Ti and Titan RTX do not support peer-to-peer communication over PCI-E between cards, but that does not mean they lack NVLink: users can buy an NVLink bridge to build a communication channel between two cards. Some have called this a design flaw in the two GPUs, while others believe Nvidia is deliberately pushing people with multi-card parallel computing needs toward Tesla GPUs.
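To see why the host round-trip hurts, here is a small back-of-envelope model of the two copy paths. The bandwidth figures are my own rough assumptions for PCI-E 3.0 x16 and one direction of an RTX NVLink bridge, not measured values:

```python
def transfer_ms(size_gb: float, bandwidth_gbps: float) -> float:
    """Time in milliseconds to move size_gb at bandwidth_gbps (GB/s)."""
    return size_gb / bandwidth_gbps * 1000

PCIE_GBPS = 16.0    # assumed: PCI-E 3.0 x16, ~16 GB/s per direction
NVLINK_GBPS = 25.0  # assumed: one direction of an RTX NVLink bridge

size = 1.0  # move 1 GB of gradients from card 1 to card 2

# Without peer-to-peer: card 1 -> main memory -> card 2, two PCI-E hops.
via_host = transfer_ms(size, PCIE_GBPS) * 2
# With NVLink: one direct hop between the two cards.
via_nvlink = transfer_ms(size, NVLINK_GBPS)

print(f"via host:   {via_host:.0f} ms")   # -> 125 ms
print(f"via NVLink: {via_nvlink:.0f} ms") # -> 40 ms
```

Real frameworks (via NCCL) pick the fastest available path automatically; the point is only that, under these assumed numbers, the forced round-trip through main memory roughly triples the transfer time for every gradient exchange.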

enterprise

GPU products for data centers are more expensive and aimed at enterprise users; they have more memory and better support for multi-card parallelism.

  • Quadro RTX 6000: $4,000, 24 GB memory, Turing microarchitecture (supports Tensor Cores)
  • Quadro RTX 8000: $5,500, 48 GB memory, Turing microarchitecture (supports Tensor Cores)
  • Tesla V100: 16 GB or 32 GB memory versions, PCI-E or NVLink versions, Volta microarchitecture (supports Tensor Cores)
  • Tesla V100S: 32 GB memory, PCI-E bus, Volta microarchitecture (supports Tensor Cores)

Enterprise-class GPUs generally have to be installed in servers or workstations, and those servers and workstations are themselves not cheap: platforms that support Tesla cards can cost on the order of 100,000 yuan. And that is before considering machine-room construction and electricity costs.

Nvidia announced the new Ampere microarchitecture and the Tesla A100 GPU at GTC 2020 in May 2020. The A100 has enhanced AI training and inference capabilities, and a single A100 can be partitioned into up to seven independent GPU instances to handle different computing tasks.

For friends running multi-card parallel training tasks, Tesla-series cards with NVLink support are recommended.

summary

The GeForce RTX 2080 Ti (11 GB) may be the entry-level standard for deep learning research. The Titan RTX (24 GB) is a good option, balancing price, memory, and compute performance. For enterprise users, cards such as the Quadro RTX 8000 (48 GB) and Tesla V100 (32 GB) are suitable for researchers at the forefront of deep learning. In the second half of 2020, Nvidia will ship new computing platforms that should both deliver more performance and lower the prices of existing products.

At a time when physical hardware is this expensive, perhaps we should look to cloud GPUs.