Growing up, I was always fascinated by science-fiction films. From Blade Runner to The Terminator, these movies hauntingly illustrated futuristic worlds shaped by technological innovations that gave rise to advanced Machine Learning techniques and depicted astonishing examples of Artificial Intelligence (AI). But instead of doomsday scenarios with humanity cowering at the feet of our robot overlords, AI has emerged as one of the most significant forces behind the digital transformation of business. In fact, many believe AI has the potential not only to impact the corporate world but also to lead to groundbreaking applications that will have profound effects on every aspect of our daily lives.
From astronomy to zoology, AI is enabling people to redefine how information is collected, integrated, and analyzed, leading to more informed insights and better outcomes. These breathtaking strides in technology are being driven by advancements in Machine Learning, specifically what is referred to as deep learning.
Deep learning is part of the broader field of Machine Learning, which is concerned with giving computers the ability to learn without being explicitly programmed, and it has led to beneficial developments in many areas, including:
- Language Understanding – Chatbots are able to automate customer calls and even make appointments by understanding context and spoken language.
- Image/Video Understanding – Machines are able to identify the contents of images or video and flag potential threats.
- Audio Understanding – Machines are able to recognize spoken words, which improves their overall understanding of language.
Despite the incredible promise of AI, super smart folks like Stephen Hawking and Elon Musk still warn of the coming AI apocalypse. In fact, Elon Musk’s new company, Neuralink, aims to stop a Terminator-style attack by fusing man and AI through brain links.
But should we really be worried? Not yet, at least.
After all, Artificial Intelligence is, at its core, artificial. It will do its job based on what it is told, whether that information is accurate or not. And therein lies the biggest challenge with Artificial Intelligence today. Companies pursuing AI projects lack a strong foundation of clean, accurate and structured data: the type of ground truth machines need to learn from over time.
According to a recent research report by MIT Technology Review, insufficient data quality was the second biggest obstacle to employing AI, narrowly behind a shortage of internal talent. What’s more, according to Gartner, 85% of AI projects will “not deliver” for CIOs. The data challenge is at the core of the problem. “You can’t feed the algorithms if you don’t have data. Solid, clean data in large volumes, well-tagged and well organized is crucial,” said Michael Conlin, Chief Data Officer at the Department of Defense.
This makes me think back to one of my favorite sci-fi flicks, The Terminator. It’s famous for portraying a dystopian society with Artificially Intelligent robots hell-bent on the destruction of the human species, not to mention some catchy one-liners from future California governor, Arnold Schwarzenegger.
In the movie, the T-800 Model 101 Terminator, a highly advanced robot with living tissue over a metal endoskeleton, is programmed to find and kill our hero, Sarah Connor. But the machine does not know which Sarah Connor to target. The only data it has is her name and the city she lives in. Not knowing exactly who the main target is, The Terminator must scroll through the phone book (remember those?), dispatching all the Sarah Connors on the list. Since this is a 90-minute movie, The Terminator intercepts the intended Connor rather quickly and spends the rest of the movie blowing things up.
So, as advanced and intelligent as The Terminator is, it essentially must guess which Sarah Connor to target because it lacks important basic, foundational training data – the information used to train a Machine Learning model. It’s actually not until the many sequels that the machine can identify its main target because it has already been fed the proper training data by learning and adjusting through its first failed and corrected attempts. (And there’s that whole issue of time travel that just gives me a headache.)
So, AI, whether it’s a homicidal cyborg or customer chatbot, needs the right data to make intelligent decisions. That data needs to be:
- Accurate – Correctly defined in a consistent manner, in accordance with the expected data standards of a particular business model. (e.g. the right Sarah Connor’s full contact info)
- Complete – It should have no gaps between what was expected to be collected and what was actually collected. (e.g. Sarah Connor’s complete personal history)
- Unique – Data that can stand alone and is not found in multiple formats and locations within your database. (e.g. one single source of truth to guide the mission in its CPU)
- Timely – Current enough to still be useful, with a clear understanding of which data has gone out of date. (e.g. Sarah Connor’s current place of residence)
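To make the four criteria above a little more concrete, here is a minimal Python sketch of how they might be checked on a small set of records. It is an illustration only, not a production data-quality pipeline: the `Record` fields, the `known_cities` lookup, the `(name, city)` uniqueness key, and the one-year freshness threshold are all assumptions chosen for this example.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One hypothetical phone-book entry; every field is illustrative."""
    name: str
    city: str
    address: Optional[str]  # None means the field was never collected
    last_verified: date     # when the entry was last confirmed


def record_key(r: Record) -> tuple[str, str]:
    # Normalize case and whitespace so formatting variants count as duplicates.
    return (r.name.strip().lower(), r.city.strip().lower())


def is_accurate(r: Record, known_cities: set[str]) -> bool:
    # Accurate: the value conforms to an expected standard (here, a known city).
    return r.city in known_cities


def is_complete(r: Record) -> bool:
    # Complete: no required field is missing or blank.
    return all(v is not None and v.strip() != "" for v in (r.name, r.city, r.address))


def is_unique(r: Record, key_counts: Counter) -> bool:
    # Unique: exactly one record exists for this (name, city) key.
    return key_counts[record_key(r)] == 1


def is_timely(r: Record, today: date, max_age_days: int = 365) -> bool:
    # Timely: the record was verified recently enough (threshold is assumed).
    return (today - r.last_verified).days <= max_age_days


def usable_records(records: list[Record], today: date, known_cities: set[str]) -> list[Record]:
    # Keep only the records that pass all four checks.
    key_counts = Counter(record_key(r) for r in records)
    return [
        r for r in records
        if is_accurate(r, known_cities)
        and is_complete(r)
        and is_unique(r, key_counts)
        and is_timely(r, today)
    ]
```

Run against a phone book containing several entries that share the same name and city, `usable_records` would return none of them, because no entry passes the uniqueness check. That is roughly the position the Terminator finds itself in: plenty of data, but nothing that singles out the right target.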
If Skynet had given the Terminator the ground truth data on its intended target, it probably would’ve been much more successful. Of course, that would have meant there wouldn’t have been much of a movie to enjoy, and the terrifying scenario of Judgment Day would have been the end result.
There’s no question Artificial Intelligence has the potential to completely change the world as we know it. But there’s no need to worry about the machines taking over anytime soon. As we have seen, humans still play a decisive role in the development and training of Machine Learning models. And let’s not forget the data. No matter how sophisticated these Machine Learning models become, they will never live up to their full potential until the data is reliable enough for us to index, tag and label.