NVIDIA DGX-2 is the world’s most powerful hardware for the most complex AI challenges, and it ships with a complete software stack covering the most widely used environments (TensorFlow, Caffe, Torch, Theano, …). The NVIDIA DGX-2 is an artificial intelligence supercomputer, the first 2-petaFLOPS system that combines 16 fully interconnected GPUs for 10X the deep learning performance. The latest addition to the DGX family of systems is the DGX-2H, a DGX-2 tuned to achieve the highest performance.
Let’s take a look at NVIDIA DGX-2 and DGX-2H in more detail, first from a hardware standpoint.
|Parameter||NVIDIA DGX-2H||NVIDIA DGX-2|
|GPUs||16× NVIDIA Tesla V100 32GB||16× NVIDIA Tesla V100 32GB|
|Performance (tensor operations)||2.1 petaFLOPS||2 petaFLOPS|
|GPU memory||512 GB total||512 GB total|
|CPU||2× Intel Xeon Platinum 8174, 3.1 GHz (24 cores)||2× Intel Xeon Platinum 8168, 2.7 GHz (24 cores)|
|NVIDIA CUDA cores||81,920||81,920|
|NVIDIA Tensor cores||10,240||10,240|
|RAM||1.5 TB||1.5 TB|
|Storage||2× 960GB NVMe SSD, 8× 3.84TB NVMe SSD||2× 960GB NVMe SSD, 8× 3.84TB NVMe SSD|
|Network||2× 10/25Gb Ethernet, 8× 100Gb InfiniBand/Ethernet||2× 10/25Gb Ethernet, 8× 100Gb InfiniBand/Ethernet|
|Maximum input power||12 kW||10 kW|
|Type||rack, 10U||rack, 10U|
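The system-wide totals in the table follow directly from the per-GPU figures. The short Python sketch below reproduces them for illustration. The per-GPU values (5,120 CUDA cores, 640 Tensor cores and roughly 125 teraFLOPS of peak tensor throughput per Tesla V100 at standard clocks) are assumptions taken from NVIDIA's published V100 specifications, not from this article, and the function name `dgx2_totals` is hypothetical.

```python
# Per-GPU figures for the Tesla V100 32GB.
# Assumption: values from NVIDIA's public V100 specifications, not from this article.
V100_CUDA_CORES = 5120
V100_TENSOR_CORES = 640
V100_MEMORY_GB = 32
V100_TENSOR_TFLOPS = 125  # peak tensor throughput at standard clocks

GPU_COUNT = 16  # number of GPUs in a DGX-2 / DGX-2H


def dgx2_totals(gpu_count=GPU_COUNT):
    """Return system-wide totals obtained by multiplying per-GPU figures."""
    return {
        "cuda_cores": gpu_count * V100_CUDA_CORES,
        "tensor_cores": gpu_count * V100_TENSOR_CORES,
        "gpu_memory_gb": gpu_count * V100_MEMORY_GB,
        "tensor_petaflops": gpu_count * V100_TENSOR_TFLOPS / 1000,
    }


if __name__ == "__main__":
    for name, value in dgx2_totals().items():
        print(f"{name}: {value}")
```

With 16 GPUs this yields 81,920 CUDA cores, 10,240 Tensor cores, 512 GB of GPU memory and 2 petaFLOPS, matching the DGX-2 column. The DGX-2H figure of 2.1 petaFLOPS is higher because that system is tuned for more performance, which is also reflected in its higher maximum input power.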
With DGX-2, model complexity and size are no longer constrained by the limits of traditional architectures. Now, you can take advantage of model-parallel training with the NVIDIA NVSwitch networking fabric. It’s the innovative technology behind the world’s first 2-petaFLOPS GPU accelerator with 2.4 TB/s of bisection bandwidth, delivering a 24X increase over prior generations.
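The 2.4 TB/s bisection bandwidth can be sanity-checked with simple arithmetic. The sketch below assumes, based on NVIDIA's public NVLink 2.0 and DGX-2 descriptions rather than on this article, that each V100 has 6 NVLink links at 50 GB/s bidirectional each, and that the 16 GPUs are split into two halves of 8 when the system is bisected. The function name `bisection_bandwidth_tb_s` is hypothetical.

```python
# Assumptions (from public NVLink 2.0 / DGX-2 descriptions, not from this article):
NVLINK_LINKS_PER_GPU = 6    # NVLink links on each Tesla V100
NVLINK_GB_S_PER_LINK = 50   # bidirectional bandwidth per link, in GB/s
GPUS_PER_HALF = 8           # GPUs on each side when the 16-GPU system is bisected


def bisection_bandwidth_tb_s(
    gpus_per_half=GPUS_PER_HALF,
    links_per_gpu=NVLINK_LINKS_PER_GPU,
    gb_s_per_link=NVLINK_GB_S_PER_LINK,
):
    """Estimate bisection bandwidth in TB/s.

    Every GPU in one half can drive all of its NVLink bandwidth
    through the NVSwitch fabric towards the other half.
    """
    per_gpu_gb_s = links_per_gpu * gb_s_per_link  # 300 GB/s per GPU
    return gpus_per_half * per_gpu_gb_s / 1000


if __name__ == "__main__":
    print(f"Estimated bisection bandwidth: {bisection_bandwidth_tb_s()} TB/s")
```

Under these assumptions, 8 GPUs × 300 GB/s gives 2.4 TB/s, which agrees with the figure quoted above.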
Even more interesting is the software package, mentioned above, that comes with NVIDIA machines. NVIDIA GPU Cloud provides easy access to a comprehensive catalog of GPU-optimized software. It features performance-engineered containers with all the top deep learning frameworks such as TensorFlow, PyTorch, MXNet, and more, tuned, tested, certified, and maintained by NVIDIA. It also includes third-party managed containers for HPC applications, and NVIDIA containers for HPC visualization. NVIDIA's tuned software stack provides 30% more performance for machine learning applications than the same applications deployed on NVIDIA hardware without it. The main advantage of the pre-installed environment is deployment speed: the system can be up and running within hours.
A key strength of the NVIDIA solution is support for the entire system. Hardware support (in case any component fails) is a matter of course. Software support for the entire environment is critical when something does not work, and there are hundreds of developers ready to help. Support is included in the purchase of an NVIDIA DGX system. It is available for 1 or 3 years and can be extended further after that.
With a combination of tuned hardware, software and NVIDIA support, NVIDIA DGX delivers significantly higher performance and acceleration in the learning phase of machine learning applications: