cnvrg.io AI OS Delivers Accelerated ML Workloads of All Sizes with Native Support of NVIDIA A100 Multi-Instance GPU
Integration with NVIDIA MIG expands performance by delivering multiple instances of a single GPU on demand for ML/DL workloads in one click
cnvrg.io AI OS for machine learning announces native integration of NVIDIA Multi-Instance GPU (MIG) technology into its data science platform. cnvrg.io is among the first ML platforms to integrate MIG, a new feature that can partition each NVIDIA A100 GPU into as many as seven accelerators for optimal utilization, effectively expanding access to every user and application. This integration follows the release of the NVIDIA A100 Tensor Core GPU and NVIDIA DGX A100 system, and cnvrg.io’s certification for the NVIDIA DGX-Ready Software program as an AI workflow solution.
NVIDIA GPUs are the powerhouse of machine learning and deep learning workloads. The new NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration at every scale for AI, data analytics, and HPC workloads to tackle the world’s toughest computing challenges. As the engine of the NVIDIA data center platform, A100 can efficiently scale up to thousands of GPUs, and with MIG, it can be partitioned into seven isolated GPU instances to accelerate AI workloads of all sizes.
With workloads varying this widely in size, resource management is essential. Infrastructure teams require MLOps, as well as a way to assign, schedule, share, and monitor utilization of the MIG resources. This is where the cnvrg.io data science platform was quick to evolve, offering MIG integration with self-service resource management, meta-scheduling, and MLOps capabilities.
cnvrg.io is among the first ML platforms to integrate NVIDIA’s new MIG feature, which expands the performance and value of each NVIDIA A100 Tensor Core GPU across one or more DGX A100 systems. Each MIG instance is fully isolated with its own high-bandwidth memory, cache, and compute cores. Administrators can now support every workload, from the smallest to the largest, offering a right-sized GPU with guaranteed quality of service (QoS) for every job, optimizing utilization and extending the reach of accelerated computing resources to every user. Paired with cnvrg.io’s resource management and MLOps capabilities for IT and AI teams, MIG instances can now be used in one click by any data scientist running any ML job. cnvrg.io’s integration of NVIDIA MIG gives AI teams:
- On-demand MIG instance availability in one click
- Customized compute templates for simplified MIG allocation and utilization
- Access to NVIDIA A100 processing power for any data scientist to run any ML job
- Advanced meta-scheduling capabilities for MIG pool management
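To make the idea of a "right-sized GPU" concrete, the sketch below picks the smallest MIG profile that satisfies a job's memory and compute needs, and counts how many isolated instances a set of A100 GPUs can expose. It is an illustration only, not cnvrg.io's scheduler or API: the function names `right_size` and `max_instances` are hypothetical, the profile table assumes a 40 GB A100, and real MIG placement rules are more constrained than this simple lookup.

```python
# Illustrative sketch only; not cnvrg.io's scheduler or any NVIDIA API.
# MIG profiles on a 40 GB A100: (name, compute slices out of 7, memory in GB).
A100_40GB_PROFILES = [
    ("1g.5gb", 1, 5),
    ("2g.10gb", 2, 10),
    ("3g.20gb", 3, 20),
    ("4g.20gb", 4, 20),
    ("7g.40gb", 7, 40),
]


def right_size(min_memory_gb, min_slices=1):
    """Return the smallest profile that meets the job's needs, or None.

    The table is ordered from smallest to largest, so the first match
    is the smallest instance that fits (hypothetical helper).
    """
    for name, slices, memory_gb in A100_40GB_PROFILES:
        if memory_gb >= min_memory_gb and slices >= min_slices:
            return name
    return None


def max_instances(gpu_count, instances_per_gpu=7):
    """Upper bound on isolated MIG instances across a set of A100 GPUs."""
    return gpu_count * instances_per_gpu


if __name__ == "__main__":
    print(right_size(4))    # small job: fits the smallest slice
    print(right_size(16))   # mid-sized job: needs a 20 GB instance
    print(max_instances(8)) # a DGX A100 system has eight A100 GPUs
```

Under these assumptions, a single DGX A100 system with eight GPUs could expose up to 56 of the smallest instances at once, while a job that needs the full 40 GB would still receive a whole GPU.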
“As data scientists, we understand that the MIG capability has the power to transform how data scientists work, and opens the possibilities for a significantly wider access to utilization of NVIDIA GPU technology,” said Yochay Ettun, CEO and co-founder of cnvrg.io. “Working closely with NVIDIA, we are helping make it easier for organizations to leverage MIG in A100 in their AI workflows, and deliver agility and productivity to AI teams.”
“The MIG capability in the A100 GPU allows users to pack dozens of completely separate jobs in parallel on DGX A100 systems with outstanding performance,” said John Barco, senior director of product management at NVIDIA. “Complementing this tremendous technological innovation with enterprise-grade AI workflow management and MLOps software from cnvrg.io helps customers maximize their GPU utilization and get the most from their AI investment.”
AI teams can substantially improve utilization of their A100 GPUs with NVIDIA MIG and cnvrg.io. For teams looking to optimize the usage of their DGX A100 systems, cnvrg.io now delivers the tools to quickly put the MIG feature in A100 to work for ML workloads and improve performance with advanced resource management and MLOps.