NVIDIA Sets AI Inference Records, Introduces A30 and A10 GPUs for Enterprise Servers
NVIDIA AI Platform Smashes Every MLPerf Category, From Data Center to Edge
NVIDIA announced that its AI inference platform, newly expanded with NVIDIA A30 and A10 GPUs for mainstream servers, has achieved record-setting performance across every category on the latest release of MLPerf.
MLPerf is the industry’s established benchmark for measuring AI performance across a range of workloads spanning computer vision, medical imaging, recommender systems, speech recognition and natural language processing.
Debuting on MLPerf, NVIDIA A30 and A10 GPUs combine high performance with low power consumption to provide enterprises with mainstream options for a broad range of AI inference, training, graphics and traditional enterprise compute workloads. Cisco, Dell Technologies, Hewlett Packard Enterprise, Inspur and Lenovo are expected to integrate the GPUs into their highest-volume servers starting this summer.
NVIDIA achieved these results by taking advantage of the full breadth of the NVIDIA AI platform, which encompasses a wide range of GPUs and AI software, including TensorRT and NVIDIA Triton™ Inference Server, and is deployed by leading enterprises such as Microsoft, Pinterest, Postmates, T-Mobile, USPS and WeChat.
“As AI continues to transform every industry, MLPerf is becoming an even more important tool for companies to make informed decisions on their IT infrastructure investments,” said Ian Buck, general manager and vice president of Accelerated Computing at NVIDIA. “Now, with every major OEM submitting MLPerf results, NVIDIA and our partners are focusing not only on delivering world-leading performance for AI, but on democratizing AI with a coming wave of enterprise servers powered by our new A30 and A10 GPUs.”
MLPerf Results
NVIDIA is the only company to submit results for every test in the data center and edge categories, delivering top performance results across all MLPerf workloads.
Several submissions also use Triton Inference Server, which simplifies deploying AI in applications by supporting models from all major frameworks, running on both GPUs and CPUs, and optimizing for different query types, including batch, real-time and streaming. With comparable configurations, Triton submissions achieved performance close to that of the most optimized GPU and CPU implementations.
NVIDIA also broke new ground with its submissions using the NVIDIA Ampere architecture’s Multi-Instance GPU (MIG) capability, simultaneously running all seven MLPerf Offline tests on a single GPU using seven MIG instances. The configuration showed nearly identical performance compared with a single MIG instance running alone.
These submissions demonstrate MIG’s performance and versatility, which enable infrastructure managers to provision right-sized amounts of GPU compute for specific applications to get maximum output from every data center GPU.
In addition to NVIDIA’s own submissions, NVIDIA partners Alibaba Cloud, Dell Technologies, Fujitsu, GIGABYTE, HPE, Inspur, Lenovo and Supermicro submitted a total of over 360 results using NVIDIA GPUs.