AWS Announces Six New Amazon SageMaker Capabilities, Including the First Fully Integrated Development Environment (IDE) for Machine Learning
Amazon SageMaker Model Monitor Detects Concept Drift to Discover When the Performance of a Model Running in Production Begins to Deviate from the Original Trained Model
Amazon Web Services, Inc. (AWS), an Amazon.com company, announced six new Amazon SageMaker capabilities, including Amazon SageMaker Studio, the first fully integrated development environment for machine learning, which makes it easier for developers to build, debug, train, deploy, monitor, and operate custom machine learning models. Today’s announcements give developers powerful new tools like elastic notebooks, experiment management, automatic model creation, debugging and profiling, and model drift detection, and wrap them in the first fully integrated development environment (IDE) for machine learning, Amazon SageMaker Studio.
Amazon SageMaker is a fully managed service that removes the heavy lifting from each step of the machine learning process. Tens of thousands of customers use Amazon SageMaker to help accelerate their machine learning deployments, including ADP, AstraZeneca, Avis, Bayer, British Airways, Cerner, Convoy, Emirates NBD, Gallup, Georgia-Pacific, GoDaddy, Hearst, Intuit, LexisNexis, Los Angeles Clippers, NuData (a Mastercard Company), Panasonic Avionics, The Globe and Mail, and T-Mobile. Since launch, AWS has regularly added new capabilities to Amazon SageMaker, with more than 50 new capabilities delivered in the last year alone, including Amazon SageMaker Ground Truth to build highly accurate annotated training datasets, SageMaker RL to help developers use a powerful training technique called reinforcement learning, and SageMaker Neo, which gives developers the ability to train an algorithm once and deploy on any hardware. These capabilities have helped many more developers build custom machine learning models. But just as barriers to machine learning adoption have been removed by Amazon SageMaker, customers’ desire to use machine learning at scale has only increased.
Amazon SageMaker makes many of the building-block steps in developing great machine learning models much easier. But building truly great models that evolve successfully as a business grows often takes a lot of optimization between these building blocks and requires visibility into what is working, what is not, and why. These challenges are not unique to machine learning; the same is true of software development generally. Over the past few decades, however, many tools, such as IDEs that help with testing, debugging, deployment, monitoring, and profiling, have been built to address the challenges faced by software developers. Because machine learning is relatively immature, these same tools simply haven’t existed for it – until now.
Today’s announcements include significant capabilities that make it much easier for customers to build, train, explain, inspect, monitor, debug, and run custom machine learning models:
- Machine learning IDE: Amazon SageMaker Studio pulls together all of the components used for machine learning in a single place. Just as in an IDE, developers can view and organize their source code, dependencies, documentation, and other application assets (e.g. images used for mobile apps) in Amazon SageMaker Studio. Today, machine learning workflows have many components, many of which come with their own separate set of tools. The Amazon SageMaker Studio IDE provides a single interface for all of the Amazon SageMaker capabilities announced today and for the entire machine learning workflow. Amazon SageMaker Studio gives developers the ability to create project folders, organize notebooks and datasets, and discuss notebooks and results collaboratively. Amazon SageMaker Studio makes it simpler and faster to build, train, explain, inspect, monitor, debug, and run machine learning models from a single interface.
- Elastic notebooks: Amazon SageMaker Notebooks provides one-click Jupyter notebooks with elastic compute that can be spun up in seconds. Notebooks contain everything needed to run or recreate a machine learning workflow. Before today, to view or run a notebook, developers needed to spin up a compute instance in Amazon SageMaker to power the notebook. If they found out they needed more compute power, they had to spin up a new instance, transfer the notebook, and shut down the old instance. And, because the notebook was coupled to the compute instance, and the notebook typically existed on a developer’s workstation, there was no easy way to share notebooks and iterate collaboratively. Amazon SageMaker Notebooks delivers elastic Jupyter notebooks, allowing developers to easily dial up or down the amount of compute powering the notebook (including GPU acceleration), with the changes taking place automatically in the background without interrupting the developer’s work. Developers no longer need to lose time shutting down the old instance and recreating all their work in a new instance. This makes it much faster to get started building a model. Amazon SageMaker Notebooks will also enable one-click sharing of notebooks by automatically reproducing the specific environment and library dependencies. This will make it easier to build models collaboratively, since an engineer will be able to easily make their work available for other engineers to build on.
- Experiment management: Amazon SageMaker Experiments helps developers organize and track iterations to machine learning models. Machine learning typically entails several iterations aimed at isolating and measuring the incremental impact of changing specific inputs. Developers produce hundreds of artifacts such as models, training data, and parameter settings during these iterations. Today, they have to rely on cumbersome mechanisms like spreadsheets to track these experiments and manually sort through these artifacts to understand how they impact the experiments. Amazon SageMaker Experiments helps developers manage these iterations by automatically capturing the input parameters, configuration, and results, and storing them as ‘experiments’. Developers can browse active experiments, search for previous experiments by their characteristics, review previous experiments with their results, and compare experiment results visually. Amazon SageMaker Experiments also preserves the full lineage of the experiments, so if a model begins to deviate from its intended outcome, developers can go back in time and inspect its artifacts. Amazon SageMaker Experiments makes it much easier for developers to iterate and develop high-quality models more quickly.
- Debugging and profiling: Amazon SageMaker Debugger allows developers to debug and profile model training to improve accuracy, reduce training times, and facilitate a greater understanding of machine learning models. Today, the training process is largely opaque, training times can be long and hard to optimize, and the ‘black box’ effect makes it hard to interpret and explain models. With Amazon SageMaker Debugger, models trained in Amazon SageMaker automatically emit key metrics that are collected and can be reviewed in Amazon SageMaker Studio or via Amazon SageMaker Debugger’s API. These metrics provide real-time feedback on training accuracy and performance. When training problems are detected, Amazon SageMaker Debugger provides warnings and remediation advice. Amazon SageMaker Debugger also helps developers interpret how a model is working, representing an early step towards the explainability of neural networks.
- Automatic Model Building: Amazon SageMaker Autopilot provides the industry’s first automated machine learning capability that does not require developers to give up control of and visibility into their models. Today’s approaches to automated machine learning do an adequate job of creating an initial model, but they give developers no data on how the model was created or what’s in it. So, if the model is mediocre and developers want to evolve it, they’re out of luck. Also, today’s automatic machine learning services only give customers one simple model. Sometimes customers are willing to make trade-offs, such as sacrificing a little accuracy in a version of the model in exchange for a variant that makes lower latency predictions, but given that customers only have one model to choose from, there are no such options. Amazon SageMaker Autopilot automatically inspects raw data, applies feature processors, picks the best set of algorithms, trains multiple models, tunes them, tracks their performance, and then ranks the models based on performance – all with just a few clicks. The result is a recommendation for the best-performing model that customers can deploy, but at a fraction of the time and effort normally required to train it, and with full visibility into how the model was created and what’s in it. Amazon SageMaker Autopilot can be used by people who lack experience with machine learning to easily produce a model based on data alone, or it can be used by experienced developers to quickly develop a baseline model on which teams can further iterate. Amazon SageMaker Autopilot also gives developers a range of up to 50 different models that can be inspected in Amazon SageMaker Studio, so developers can choose the best model for their use case and weigh the options according to the factor they choose to optimize.
- Concept drift detection: Amazon SageMaker Model Monitor allows developers to detect and remediate concept drift. Today, one of the big factors that can affect the accuracy of models deployed in production is a change in the data used to generate predictions, so that it starts to differ from the data used to train the model (e.g. changing economic conditions driving new interest rates affecting home purchasing predictions, changing seasons with different temperature, humidity, and air pressure impacting confidence in predicted equipment maintenance schedules, etc.). If the data starts to differ, it can lead to something called concept drift, whereby the patterns the model uses to make predictions no longer apply. Amazon SageMaker Model Monitor automatically detects concept drift in deployed models. Amazon SageMaker Model Monitor creates a set of baseline statistics about a model during training and compares the data used to make predictions against the training baseline. Amazon SageMaker Model Monitor alerts developers when drift is detected and helps them visually identify the root cause. Developers can use Amazon SageMaker Model Monitor’s out-of-the-box features to detect drift right away, or they can write their own rules for Amazon SageMaker Model Monitor to monitor. Amazon SageMaker Model Monitor makes it easier for developers to adjust the training data or algorithm to accommodate concept drift.
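The baseline-and-compare idea behind drift detection can be illustrated with a minimal Python sketch. This is not Amazon SageMaker Model Monitor's API or its actual statistics. The function names (`build_baseline`, `detect_drift`), the feature names, the sample data, and the three-standard-deviation threshold are all hypothetical, chosen only to show the general technique of recording per-feature statistics at training time and flagging features whose production values have shifted.

```python
from statistics import mean, stdev


def build_baseline(training_rows):
    """Record per-feature mean and standard deviation from the training data."""
    baseline = {}
    for name in training_rows[0].keys():
        values = [row[name] for row in training_rows]
        baseline[name] = {"mean": mean(values), "std": stdev(values)}
    return baseline


def detect_drift(baseline, live_rows, threshold=3.0):
    """Return the features whose live mean lies more than `threshold`
    baseline standard deviations away from the training mean."""
    drifted = []
    for name, stats in baseline.items():
        live_mean = mean([row[name] for row in live_rows])
        # Fall back to a scale of 1.0 if the training feature had zero variance.
        scale = stats["std"] or 1.0
        if abs(live_mean - stats["mean"]) / scale > threshold:
            drifted.append(name)
    return drifted


# Hypothetical data: interest rates shift sharply in production,
# while humidity stays close to its training distribution.
training = [
    {"interest_rate": 3.0 + 0.1 * (i % 5), "humidity": 40.0 + (i % 7)}
    for i in range(100)
]
live = [
    {"interest_rate": 6.0 + 0.1 * (i % 5), "humidity": 41.0 + (i % 7)}
    for i in range(50)
]

baseline = build_baseline(training)
drifted_features = detect_drift(baseline, live)
print(drifted_features)
```

In this sketch only `interest_rate` is flagged, because its production mean sits far outside the range seen in training. A production system would likely compare full distributions and data-quality constraints rather than means alone.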
“As tens of thousands of customers have used Amazon SageMaker to remove barriers to building, training, and deploying custom machine learning models, they’ve also encountered new challenges from operating at scale, and they’ve continued to provide feedback to AWS on their next set of challenges,” said Swami Sivasubramanian, Vice President, Amazon Machine Learning, AWS. “Today, we are announcing a set of tools that make it much easier for developers to build, train, explain, inspect, monitor, debug, and run custom machine learning models. Many of these concepts have been known and used by software developers to build, test, and maintain software for many years; however, they were not available for developers to build machine learning models. Today, with these launches, we are bringing these concepts to machine learning developers for the very first time.”
Autodesk is a global leader in software for customers in the architecture, engineering/construction, product design, and manufacturing industries. Autodesk’s software offerings include AutoCAD (drafting software) and BIM 360 (cloud platform for project delivery and construction document management). “At Autodesk, we leverage machine learning to enhance our design and manufacturing solutions to enable greater degrees of creative freedom for our customers. Generative design technology can produce hundreds of optimized solutions that meet design criteria,” said Alexander Carlson, Machine Learning Engineer, Autodesk. “Using machine learning, we developed a new filter that identifies and groups outcomes with similar visual characteristics to make it easier to find the best options. This Visual Similarity filter will always be adapting to what it is observing, making it easier and more efficient to find that perfect design. Amazon SageMaker Debugger allows us to iterate on this model much more efficiently by helping close the feedback loop, saving valuable data scientist time, and cutting training hours by more than 75%.”
Change Healthcare is a leading independent healthcare technology company that provides data and analytics-driven solutions to improve clinical, financial, and patient engagement outcomes in the U.S. healthcare system. “At Change Healthcare, we are continuously working with our healthcare providers to remove inefficiencies from the processing of healthcare claims. We often receive claim forms from our healthcare providers which have unreadable labels and fixing these forms manually adds time and cost to the claim settlement process. We have developed a multi-layer deep learning model that superimposes labels from a good form into unreadable forms,” said Jayant Thomas, Senior Director, AI Engineering, Change Healthcare. “Amazon SageMaker Debugger helped us improve the accuracy of the model with rapid iterations which helped us achieve our release milestone. Additionally, SageMaker Debugger is helping us gain deeper insights on tensors, achieve resilient model training, assist in detecting inconsistencies in real time using rule hooks, and tune the model parameters for better accuracy.”
INVISTA is a world-leading integrated fiber, resin, and intermediates company. “The new services within Amazon SageMaker are reaping powerful benefits for us at INVISTA. With Amazon SageMaker Studio, we’re now able to co-locate data science tasks. Instead of having to manage many separate resources, our team can easily continue to work in a path with little friction. This allows us to save time managing infrastructure and repositories and helps us reduce the time to deploy algorithms and analytics projects into production,” said Tanner Gonzalez, Analytics and Cloud Leader, INVISTA. “Amazon SageMaker Experiments helps us with model tracking. Before, we would track and save model artifacts in various places, but we wouldn’t have visibility across experiments and we’d often lose information. With SageMaker Experiments, we now have an easy interface to manage experiments, get a broader scope of projects, and add new models, metrics, and performance in a structured way. All of this allows us to accelerate data science value for INVISTA.”
SyntheticGestalt is an applied machine learning company that develops models, software, and intelligent agents for research automation in the pharmaceutical and other life-sciences industries. “We train our drug discovery models and synthetic biology simulation models with Amazon SageMaker, and the new features help us systematically manage and evaluate our experiment results. In order to gain insight into the performance of experiments, our researchers must maintain consistent experiment settings and model results,” said Kotaro Kamiya, CTO, SyntheticGestalt Ltd. “With the latest launches within Amazon SageMaker, including features like Amazon SageMaker Studio and Amazon SageMaker Experiments, we can determine the best experiment settings 2x faster, which ultimately accelerates our ability to produce life-changing candidate molecules. SageMaker helps our researchers easily compare thousands of experiment settings; they are able to do with a single step what previously consumed hours of our researchers’ time. Whereas previously, we could only compare 100 experiment settings to one another, Amazon SageMaker Experiments takes away that constraint entirely, so we can focus on experiment design without limitations.”