PNY realizes and appreciates that scientists, researchers, and engineers are solving the world’s most important data science and big data analytics challenges with AI and high-performance computing (HPC). Businesses, even entire industries, harness the power of AI to extract new insights from massive data sets, both on-premises and in the cloud. New NVIDIA® Ampere architecture-based GPUs, designed for the age of elastic computing, deliver the next giant leap by providing unmatched acceleration at every scale, enabling innovators to push the boundaries of human knowledge forward.
NVIDIA’s latest products implement groundbreaking innovations. Third-generation Tensor Cores deliver dramatic speedups to AI, reducing training times from weeks to hours and providing massive inference acceleration. Two new precisions, Tensor Float (TF32) and Floating Point 64 (FP64), accelerate AI adoption and extend the power of Tensor Cores to use cases requiring double-precision HPC.
TF32 works just like FP32 while delivering speedups of up to 20x for the AI facets of data science and analytics, without requiring any code changes. With NVIDIA Automatic Mixed Precision and FP16, data scientists can gain an additional 2x performance by adding just a couple of lines of code. With support for bfloat16, INT8, and INT4, Ampere Tensor Cores are an incredibly versatile accelerator for AI training and inference.
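To make "works just like FP32" a little more concrete, here is a minimal Python sketch that emulates TF32 rounding in software. It assumes the format NVIDIA has described publicly (the same 8-bit exponent as FP32, but only 10 mantissa bits), which is not spelled out in the text above. The helper name `to_tf32` is hypothetical, the round-to-nearest-even choice is an assumption, and infinities and NaNs are not handled. This is an illustration of the number format, not how Tensor Cores are programmed.

```python
import struct

def to_tf32(x: float) -> float:
    """Emulate TF32 storage: FP32 range, mantissa cut from 23 to 10 bits.

    Hypothetical helper for illustration only; finite inputs assumed.
    """
    # Reinterpret the value as a 32-bit IEEE 754 pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Round to nearest (ties to even) on the 13 mantissa bits being dropped.
    lsb = (bits >> 13) & 1
    bits = (bits + 0x0FFF + lsb) & 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

if __name__ == "__main__":
    # Steps of 2**-10 next to 1.0 survive; finer detail is rounded away.
    print(to_tf32(1.0 + 2**-10) == 1.0 + 2**-10)  # True
    print(to_tf32(1.0 + 2**-11) == 1.0)           # True (tie rounds to even)
    # Very large magnitudes still fit, because the exponent is FP32's.
    print(to_tf32(1e38))
```

The sketch shows the trade: values keep FP32's dynamic range while carrying roughly three decimal digits of precision, which is why many deep learning workloads can use TF32 without code changes.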
Every AI, data science, analytics, and HPC application can benefit from acceleration, but not every application needs the performance of a full Ampere GPU. The NVIDIA A100 implements a powerful feature, Multi-Instance GPU (MIG), which can partition each GPU into up to seven GPU instances, fully isolated and secured at the hardware level with their own high-bandwidth memory, cache, and compute cores. This brings breakthrough acceleration to a wide range of data science and analytics scenarios, big and small, and delivers guaranteed quality of service. IT administrators can offer right-sized GPU acceleration for optimal utilization and expand access to every user and application across bare-metal and virtualized environments.
The NVIDIA A100 PCIe brings massive amounts of compute performance to data centers in industry-standard servers. The A100 also has significantly more on-chip memory, including a 40-megabyte (MB) level 2 cache – 7x larger than the previous generation – to maximize compute performance. The A100 PCIe board offers 40 GB of HBM2 GPU memory, with a memory bus width of 5120 bits and a peak memory bandwidth of up to 1555 GB/sec, easily taking the performance crown from the prior-generation V100 PCIe.
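As a quick sanity check on those memory figures, the short sketch below derives two numbers from them: the per-pin data rate implied by a 5120-bit bus at 1555 GB/sec, and the time a single pass over all 40 GB would take at peak bandwidth. The function names are hypothetical, and the results are theoretical peaks computed only from the figures quoted above, not measured values.

```python
BUS_WIDTH_BITS = 5120        # from the text above
PEAK_BANDWIDTH_GB_S = 1555   # from the text above
MEMORY_GB = 40               # from the text above

def per_pin_rate_gbit_s(bus_width_bits: int, bandwidth_gb_s: float) -> float:
    """Data rate each bus pin must sustain, in Gbit/s."""
    return bandwidth_gb_s * 8 / bus_width_bits

def full_sweep_ms(memory_gb: float, bandwidth_gb_s: float) -> float:
    """Time to read every byte of GPU memory once at peak bandwidth, in ms."""
    return memory_gb / bandwidth_gb_s * 1000

if __name__ == "__main__":
    print(f"{per_pin_rate_gbit_s(BUS_WIDTH_BITS, PEAK_BANDWIDTH_GB_S):.2f} Gbit/s per pin")
    print(f"{full_sweep_ms(MEMORY_GB, PEAK_BANDWIDTH_GB_S):.1f} ms per full memory sweep")
```

The bandwidth comes from bus width rather than an extreme per-pin speed: about 2.43 Gbit/s per pin across 5120 pins is enough to read the full 40 GB in roughly 26 milliseconds.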
Scaling applications across multiple GPUs requires extremely fast movement of data. Third-generation NVIDIA NVLink, as implemented by the NVIDIA A100 PCIe, doubles the direct GPU-to-GPU bidirectional bandwidth to 600 gigabytes per second (GB/sec), almost 10x higher than PCIe Gen 4.
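The "almost 10x" comparison can be reproduced with a few lines of arithmetic. The sketch below takes the 600 GB/sec NVLink figure from the text and assumes roughly 64 GB/sec of bidirectional bandwidth for a PCIe Gen 4 x16 slot (about 32 GB/sec in each direction). That PCIe figure is an assumption, not a number from this article, and the function name is hypothetical.

```python
NVLINK_BIDIR_GB_S = 600     # from the text above
PCIE4_X16_BIDIR_GB_S = 64   # assumption: ~32 GB/s per direction on an x16 link

def one_way_seconds(payload_gb: float, bidir_gb_s: float) -> float:
    """Idealized time to send a payload in one direction.

    Assumes half of the bidirectional bandwidth is available each way
    and ignores protocol overhead and latency.
    """
    return payload_gb / (bidir_gb_s / 2)

if __name__ == "__main__":
    ratio = NVLINK_BIDIR_GB_S / PCIE4_X16_BIDIR_GB_S
    print(f"NVLink / PCIe Gen 4 bandwidth ratio: {ratio:.2f}x")
    for name, bw in (("NVLink", NVLINK_BIDIR_GB_S), ("PCIe Gen 4", PCIE4_X16_BIDIR_GB_S)):
        print(f"{name}: {one_way_seconds(10, bw) * 1000:.0f} ms to move 10 GB one way")
```

Under these assumptions the ratio is a little over 9x, consistent with the "almost 10x" claim. Real transfers will be slower than these idealized peaks.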
Contemporary AI networks are big and getting bigger, with millions and in some cases billions of parameters. Not all of these are necessary for accurate predictions and inference, and some can be converted to zeros to make models “sparse” without compromising accuracy. Ampere Tensor Cores provide up to 2x higher performance for sparse models. While the sparsity feature more readily benefits AI inference, it can also be used to improve the performance of model training.
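The idea of converting some parameters to zeros can be sketched in plain Python. The example assumes the 2-out-of-4 structured pattern that NVIDIA has described for Ampere sparsity (in every group of four weights, two are zeroed), which is not stated in the text above. Keeping the two largest-magnitude weights is also an assumption made for illustration, and the function name `prune_2_of_4` is hypothetical. Production workflows typically retrain or fine-tune after pruning to recover accuracy.

```python
def prune_2_of_4(weights: list[float]) -> list[float]:
    """Zero the two smallest-magnitude weights in each group of four.

    Hypothetical helper: a magnitude-based sketch of 2:4 structured
    sparsity, not NVIDIA's pruning tool.
    """
    if len(weights) % 4 != 0:
        raise ValueError("weight count must be a multiple of 4")
    pruned: list[float] = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two weights with the largest absolute value.
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        pruned.extend(w if j in keep else 0.0 for j, w in enumerate(group))
    return pruned

if __name__ == "__main__":
    dense = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.1]
    print(prune_2_of_4(dense))
    # [0.9, 0.0, 0.0, -0.7, 0.0, 0.3, -0.4, 0.0]
```

Because exactly half of every group is zero, hardware can skip those multiplications in a predictable way, which is where the "up to 2x" figure for sparse models comes from.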
The NVIDIA Ampere architecture’s second-generation RT Cores in the NVIDIA RTX A6000 and NVIDIA A40 GPUs deliver massive speedups for big data analytics, data science, AI, and HPC use cases where seeing (visualizing) the problem is essential to solving the problem. RT Cores enable real-time ray tracing for photorealistic results and work synergistically with Tensor Cores to deliver AI denoising and other features.
To learn more about how NVIDIA Ampere architecture products like the NVIDIA RTX A6000, A40, or A100 PCIe can reimagine your AI, data science, big data analytics, and HPC workloads, contact PNY at gopny@pny.com or visit www.pny.com/nvidia-rtx.
NVIDIA GTC 21, a virtual event taking place April 12-16, 2021, offers a great resource for additional information on how NVIDIA Ampere architecture-based GPUs are transforming data science and big data analytics. PNY strongly suggests you register for this upcoming meeting of the best and brightest minds in these fields. Learn more by visiting www.pny.com/gtc.