OctoML Unveils Next Iteration Of ML Deployment Platform To Scale ML Operations
OctoML announced the latest release of its Machine Learning (ML) Deployment Platform to empower enterprises to scale their ML operations (MLOps). Launched at TVMcon 2021, the open source conference for ML acceleration, the new release enables enterprises to automate the optimization, performance benchmarking and deployment of production-ready ML models across the broadest array of clouds, hardware devices and ML acceleration engines.
The new platform supports the three leading clouds (AWS, Microsoft Azure and Google Cloud Platform) and a wide choice of hardware options, including NVIDIA GPUs, Intel and AMD CPUs, and leading edge-device platforms such as NVIDIA Jetson and Arm Cortex-A.
“Enterprises today face significant challenges with scaling the deployment of their trained models. In fact, research shows that nearly two-thirds of models take over a month to deploy into production,” said Luis Ceze, CEO, OctoML. “This is because model performance tuning and optimization is largely done manually. Also, models, software platforms, and inference targets are rapidly evolving, requiring highly skilled resources on an ongoing basis. This latest iteration breaks these bottlenecks, making machine learning economically viable and enabling faster innovation.”
A number of OctoML customers are already using the new platform to power their ML model “factories,” where trained ML models enter the platform and the output is a package containing that same model, accelerated for the users’ chosen deployment targets. Through either UI or API-driven workflows, OctoML customers can now complete dozens of accelerations a week. Customers can also use performance benchmarking insights to shorten their time to market and reduce the cost of inference serving through model speed-up.
Benefits and features of the new OctoML platform include:
- Expanded choice of deployment targets
  - Microsoft Azure target support, joining AWS and Google Cloud Platform, provides choice across all three major clouds.
  - AMD and Intel CPUs and NVIDIA GPUs are target options in each cloud.
  - Extensive edge support with NVIDIA Jetson AGX Xavier and Jetson Xavier NX, along with Arm Cortex-A72 CPUs running 32-bit and 64-bit operating systems.
- Pre-accelerated Model Zoo, which includes:
  - A Computer Vision (Image Classification and Object Detection) set that includes ResNet, YOLO, MobileNet, Inception, and more.
  - A Natural Language Processing (NLP) set that includes BERT, GPT-2, and more.
- Improved performance across the widest breadth of ML models
  - Expanded model format support that includes ONNX, TensorFlow Lite, and several TensorFlow model packaging formats, so users can upload their trained models without conversion.
  - Three new acceleration engines (ONNX Runtime, TensorFlow, and TensorFlow Lite), in addition to TVM, to provide the optimal performance acceleration and insights for every model.
  - TVM performance speedups across the Model Zoo have a geometric mean (geomean) of 2.2x across CPU and GPU compared to the TensorFlow baseline.
- Streamlined approach to provide data-driven decisions
  - Enhanced performance benchmarking comparison workflows enable swift decision-making.
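
For readers unfamiliar with the metric, a geomean speedup such as the 2.2x figure cited for TVM is conventionally the geometric mean of per-model speedup ratios (baseline latency divided by optimized latency). The Python sketch below illustrates that calculation only. The function name `geomean_speedup` and the latency values are hypothetical, chosen for illustration; they are not OctoML's benchmarking method or data.

```python
import math


def geomean_speedup(baseline_ms, optimized_ms):
    """Geometric mean of per-model speedups (baseline latency / optimized latency)."""
    if not baseline_ms or len(baseline_ms) != len(optimized_ms):
        raise ValueError("need two non-empty latency lists of equal length")
    # Average the logs of the ratios, then exponentiate: exp(mean(log(b / o))).
    log_sum = sum(math.log(b / o) for b, o in zip(baseline_ms, optimized_ms))
    return math.exp(log_sum / len(baseline_ms))


# Hypothetical per-model latencies in milliseconds (not measured data).
baseline = [40.0, 90.0, 10.0]   # e.g. a TensorFlow baseline
optimized = [20.0, 30.0, 5.0]   # e.g. the same models after acceleration

print(f"geomean speedup: {geomean_speedup(baseline, optimized):.2f}x")
```

A geometric mean is the usual choice for averaging ratios because a single model with an unusually large speedup skews it less than it would an arithmetic mean.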
“We’re pleased to support OctoML with the power of Microsoft Azure,” said John Montgomery, CVP, Azure AI, Microsoft. “The new platform release not only offers customers more automation, choice and performance in their ML journey, but also allows enterprises to take advantage of the security, flexibility and reliability Azure provides.”