Operational Principles Can Help Manage the Lifecycle of ML Engineering and Deployment
Machine learning (ML) is a subset of artificial intelligence (AI) that enables computers and other machines to perform tasks usually done by humans. Because it allows machines to perceive, learn from, abstract, and act on data, ML has the potential to transform how organizations approach problem-solving and day-to-day operations.
But moving ML—and AI in general—from the lab to the real world is easier said than done.
According to Gartner, through 2022, “85% of AI projects will deliver erroneous outcomes due to bias in data, algorithms or the teams responsible for managing them.” In addition, IDC reports that 28% of ML initiatives fail due to a lack of staff with the necessary expertise, production-ready data, or an integrated development environment.
This article explores the potential of [X]Ops principles, whether Development Operations (DevOps) for software, Data Operations (DataOps) for data, or Machine Learning Operations (MLOps) for AI models. Paired with a Machine Learning Development Platform (MLDP), these principles can help organizations overcome the challenges of operationalizing and scaling AI. How? They form a set of practices that standardizes collaboration throughout the engineering and deployment lifecycle, moving ML out of the pilot stage so that it delivers operational ROI.
“Real-world” AI/ML is Different
Building and training a model and then testing it in the laboratory is an AI/ML initiative’s first step. That’s where most organizations currently stand, and many lab-built models have yielded positive results.
This kind of “pilot AI” normally involves highly trained experts developing and deploying a model for clear, static problems with controlled information inputs. The work is done under ideal conditions, in a high-performance cloud environment over a high-speed network.
Outside the lab, however, the situation is often different. Poor connectivity, rigid architecture, legacy data practices and systems, and stakeholders who don’t fully understand or buy in to the technology can challenge an AI/ML model’s performance.
With “operational ML,” the model needs real-time updating, with monitoring and performance data readily available for future use. Analytical ML solutions simply aren’t created with that kind of flexibility and scalability in mind. A team working in an analytical modeling framework rarely thinks about building something that must be containerized for deployment.
A New Approach: Agile Principles Meet ML Engineering
With MLOps, a set of practices that allows for standardized collaboration between teams, organizations can manage the lifecycle of ML engineering and deployment.
Like DevOps, MLOps is rooted in agile practices: cross-functional endeavors focused on continuous improvement and process efficiency. Process flows are smoother than with traditional waterfall handoffs, with continuous integration, delivery, and visibility throughout.
Once a baseline solution is implemented, it’s enhanced on an ongoing basis through the integration of advanced capabilities: from R&D to minimum viable product and through the pipeline.
The Machine Learning Development Platform
The Machine Learning Development Platform (MLDP) puts MLOps principles into action. The MLDP is a flexible solution that addresses the challenges of operationalizing and scaling ML. It achieves this by automating the AI development and deployment process, from data ingestion and processing through deployment and continuous monitoring.
With the MLOps approach and the MLDP solution, ML teams can:
- Ingest and fuse large volumes of data, in varying formats and quality levels, into the modeling process
- Push data and containerized models across the enterprise, including security boundaries
- Rapidly update and retrain models with the latest available data
- Provide model governance and version control, data lineage and enrichment, and ongoing monitoring of model performance, all to ensure transparency and auditability
- Customize models and policies to the mission and evolving environments
- Publish new models to meet changing conditions
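As an illustration of the version control and data lineage items in the list above, here is a minimal sketch of a model registry. It is not the MLDP's actual interface: the names `ModelRegistry`, `ModelRecord`, and `fingerprint` are hypothetical, the registry lives in memory only, and a SHA-256 hash of the training rows stands in for a real lineage system.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ModelRecord:
    """One registered model version (hypothetical schema)."""
    name: str
    version: int
    data_fingerprint: str
    metrics: dict
    registered_at: str


def fingerprint(training_rows):
    """Return a stable SHA-256 hash of the training rows as a simple lineage marker."""
    payload = json.dumps(training_rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ModelRegistry:
    """In-memory registry that assigns increasing version numbers per model name."""

    def __init__(self):
        self._records = {}  # model name -> list of ModelRecord, oldest first

    def register(self, name, training_rows, metrics):
        history = self._records.setdefault(name, [])
        record = ModelRecord(
            name=name,
            version=len(history) + 1,
            data_fingerprint=fingerprint(training_rows),
            metrics=dict(metrics),
            registered_at=datetime.now(timezone.utc).isoformat(),
        )
        history.append(record)
        return record

    def latest(self, name):
        """Return the most recently registered version of a model."""
        return self._records[name][-1]

    def history(self, name):
        """Return every registered version, oldest first, for audit purposes."""
        return list(self._records.get(name, []))
```

Each retraining run would call `register` with the data it used, so an auditor can later ask which data and which metrics belong to any deployed version.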
The various “Ops” work together on the MLDP as follows:
- DataOps compiles new data from all edge devices and formats it for model training. This step involves ingestion, processing, and storage. It establishes how the data will be accessed through catalogs, searches, and visualizations, as well as governance related to security, privacy, data provenance, and data lineage.
- DevSecOps provides the code and infrastructure, including the core workflow that supports agile development. It can also deploy solutions across security domains and function within a cloud environment.
- MLOps provides the model itself. The MLDP integrates DataOps outcomes and uses model data to tune weights and hyperparameters. Through an automated push to DevSecOps, the updated model is integrated with the application and customized based on the target application platform.
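The handoff between the three "Ops" described above can be sketched as three small functions chained together. This is only an illustration under stated assumptions: the function names, the record fields `x`, `y`, and `source`, and the JSON artifact format are all hypothetical, and a one-variable least-squares fit stands in for whatever model and hyperparameter tuning a real project would use.

```python
import json
from statistics import mean


def dataops_prepare(raw_records):
    """DataOps stage: drop incomplete records and normalise field types."""
    cleaned = []
    for rec in raw_records:
        if rec.get("x") is None or rec.get("y") is None:
            continue  # skip records that are missing a required field
        cleaned.append({
            "x": float(rec["x"]),
            "y": float(rec["y"]),
            "source": rec.get("source", "unknown"),  # kept for provenance
        })
    return cleaned


def mlops_train(records):
    """MLOps stage: fit y = w * x + b by ordinary least squares."""
    xs = [r["x"] for r in records]
    ys = [r["y"] for r in records]
    x_bar, y_bar = mean(xs), mean(ys)
    var_x = sum((x - x_bar) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("need at least two distinct x values to fit a line")
    w = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / var_x
    b = y_bar - w * x_bar
    return {"w": w, "b": b, "n_train": len(records)}


def devsecops_package(model, target_platform):
    """DevSecOps stage: serialise the model together with deployment metadata."""
    return json.dumps({"model": model, "target": target_platform}, sort_keys=True)


def run_pipeline(raw_records, target_platform):
    """Run the three stages in order and return the deployable artifact."""
    records = dataops_prepare(raw_records)
    model = mlops_train(records)
    return devsecops_package(model, target_platform)
```

Keeping each stage as a separate function mirrors the division of responsibility in the text: any one stage can be replaced or retested without touching the other two.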
Getting an MLOps Project Started
For ML success, the right processes are just as important as the right tools.
To begin putting MLOps and the MLDP into action, your organization should:
- Identify an initial project by examining your current ML engineering and development process and tools, your available data, and what would make a compelling use case, based on clarity of application and estimated ROI. Explore the use of tools that provide MLOps, containerization, and monitoring for the overall MLDP pipeline.
- Select your MLDP configuration, including a hosting location for the MLOps pipeline and production location/system and standard baseline configurations that best meet your use case requirements.
- Customize your baseline configuration through sprint planning, updates to automated scripts, and test deployments. Document and publish updates to your MLDP repository.
- Deploy and test the model, integrating it with data pipelines and current AI/ML models and configuring monitoring metrics for rules and alerts.
- Elevate the MLDP to full use through monitoring performance, continuously evaluating ways to enhance pipeline capabilities, and linking pipeline models into the model repository for scaling and replication.
- Integrate ethics into the MLDP, including a framework for quantifying adherence to principles. Integrate these metrics into the automated model and link metrics to the required assurance level.
- Grow your ML expertise organically through cross-training software developers, data scientists, and ML engineers.
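The "monitoring metrics for rules and alerts" step in the list above could be configured along the following lines. This is a minimal sketch, not a reference to any particular monitoring product: the `Rule` class, the `evaluate_rules` function, and the metric names used in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A threshold rule for one monitored metric (hypothetical schema)."""
    metric: str
    threshold: float
    direction: str  # "min": alert when value < threshold; "max": alert when value > threshold


def evaluate_rules(metrics, rules):
    """Return one alert message for every rule the reported metrics violate."""
    alerts = []
    for rule in rules:
        if rule.metric not in metrics:
            # A missing metric is itself worth an alert: monitoring may be broken.
            alerts.append(f"{rule.metric}: no value reported")
            continue
        value = metrics[rule.metric]
        if rule.direction == "min" and value < rule.threshold:
            alerts.append(f"{rule.metric}={value} is below the minimum {rule.threshold}")
        elif rule.direction == "max" and value > rule.threshold:
            alerts.append(f"{rule.metric}={value} is above the maximum {rule.threshold}")
    return alerts
```

For example, a team might define `Rule("accuracy", 0.9, "min")` and `Rule("latency_ms", 200.0, "max")`, run `evaluate_rules` on each batch of reported metrics, and trigger retraining or a rollback whenever the returned list is not empty.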
By enabling organizations to manage models throughout the lifecycle, track and monitor performance, and implement security at every phase, MLOps and the MLDP drive ML from development to operational use. With a robust MLOps program, organizations can scale and standardize AI/ML to deliver trusted, real-time decisions that power product development, solve problems, and inform action.