Detecting, Addressing and Debunking the Hidden AI Biases
Those of us in the technology sector are quick to celebrate advances in artificial intelligence (AI). From applications that can detect fraudulent transactions to those that can predict disease progression, AI is poised to optimize business automation processes and drive powerful research breakthroughs. So why is it that a 2021 survey by the Pew Research Center found that a greater share of Americans reported being “more concerned than excited” about AI’s increasing presence, and how can our community address this?
Aside from the proverbial fear of job loss (and I, Robot uprisings), public mistrust of AI stems from the field’s lack of ethical oversight. Professor Frank Rudzicz of the University of Toronto and the Vector Institute outlines the three biggest ethical concerns for AI today: First, what AI is actually used for may not be what we intended; second, access to AI is not always equal; and lastly, AI can very easily adopt and amplify unconscious human biases in our society.
On the last front, Rudzicz calls technologists to action: “We need to figure out ways to limit that effect, to make sure that the data that we provide to the AI is as free of bias as possible.” As AI becomes further integrated into high-stakes decisions that directly impact us all – particularly in more sensitive areas like financial services and healthcare – that call only becomes more urgent.
Because AI biases can creep into the machine learning pipeline at every stage and potentially produce extremely harmful outcomes, mitigation strategies need to cover a lot of ground. To complement them, we can turn to the technology itself: in other words, we can develop AI that is capable of detecting the biases we fail to catch, or even of preventing them altogether.
Serious as a heart attack
Here’s a hypothetical but all-too-likely scenario: A man and a woman, both in their late forties, have similar activity levels, nutritional habits, and family histories. They are also both about to have a heart attack in the next 72 hours.
They each use a consumer medical app on their smartphones linked to their smartwatches to monitor their health. Both the man and the woman check symptom boxes for chest pressure, fatigue, difficulty breathing, and excessive sweating. The app displays a message to the woman: You may be experiencing anxiety; talk to your doctor. The app flashes a warning to the man: You may be at high risk for a heart attack. If you are also experiencing any of the symptoms below, immediately call 911.
This is what bias in AI often looks like.
It may surprise you to learn that it wasn’t necessarily “incorrect” data that led to the incorrect prediction for the woman, but rather historically biased datasets. There is a deep-seated gender gap in cardiology research data that has long resulted in fundamental flaws in the care for women with heart disease. From the start, then, our scenario’s app was limited by the gender bias lurking in the datasets used to train its models.
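A gap like the one in this scenario can be made visible by comparing how often a model misses real cases in each group. The sketch below is only illustrative: the records are invented, and the function name and the record layout (group, actual outcome, app prediction) are assumptions, not part of any real medical app.

```python
from collections import defaultdict


def false_negative_rate_by_group(records):
    """Return {group: share of real cardiac events the app failed to flag}."""
    missed = defaultdict(int)
    positives = defaultdict(int)
    for group, had_event, app_flagged in records:
        if had_event:  # only real events count toward the miss rate
            positives[group] += 1
            if not app_flagged:
                missed[group] += 1
    return {group: missed[group] / positives[group] for group in positives}


# Hypothetical (group, had_heart_attack, app_flagged_risk) records.
records = [
    ("women", True, False), ("women", True, False),
    ("women", True, True), ("women", True, True),
    ("men", True, True), ("men", True, True),
    ("men", True, True), ("men", True, False),
    ("women", False, False), ("men", False, False),
]

rates = false_negative_rate_by_group(records)
for group, rate in rates.items():
    print(f"{group}: missed {rate:.0%} of real events")
```

In this made-up sample the app misses real events for women twice as often as for men. An overall accuracy figure would hide that difference, which is why the comparison is made per group.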
While historically biased training data is one of the best-known sources of AI bias, not all biases are rooted in the training datasets. In reality, there are many more entry points to consider, and most are not so easily recognized. “A Survey on Bias and Fairness in Machine Learning” defined as many as 19 different types of bias, introduced across three interaction loops (data to algorithm, algorithm to user, and user to data).
It is still early in the evolution of AI, but this is troubling news for business stakeholders and society alike. In fact, Gartner has predicted that through 2022, 85% of AI projects will deliver erroneous outcomes due to implicit biases.
We have the technology
Fortunately, many experts are sharing game plans that organizations can follow to reduce AI biases. Mitigation strategies include expanding data collection pools, introducing synthetic datasets, investing in third-party audits, and increasing the diversity of AI teams.
These are all solid steps on the path toward more ethical AI. But we can and should go further by leveraging the very technology that we are diagnosing with the problem to help us find new solutions.
In her 2019 MIT Technology Review piece “This is how AI bias really happens — and why it’s so hard to fix,” Karen Hao characterizes one of the more difficult instances of bias as “unknown unknowns”: “The introduction of bias isn’t always obvious… you may not realize the downstream impacts of your data and choices until much later. Once you do, it’s hard to retroactively identify where that bias came from and then figure out how to get rid of it.”
How can we fix an issue that we don’t realize exists? In the case of unknown unknowns, we have to move beyond the strategies outlined above.
And this is where AI can conceivably help. Hao points to the development of “algorithms that help detect and mitigate hidden biases within training data or that mitigate the biases learned by the model regardless of the data quality.” Researchers at Penn State and Columbia University pursuing this route have designed a bias-detecting AI tool that can identify, in both human decision makers and AI systems, hidden discrimination relating to “protected attributes.”
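One of the simplest automated checks in this spirit is a "flip test": change only the protected attribute and see whether the prediction changes. The sketch below is not the method used by the Penn State and Columbia researchers. The function `flip_test`, the toy model, and the field names are all hypothetical, and the bias in the toy model is planted deliberately so that there is something to find.

```python
def flip_test(predict, people, attribute, values):
    """Return the people whose prediction changes when only `attribute` changes."""
    flagged = []
    for person in people:
        # Score a copy of the record for each possible value of the attribute.
        outcomes = {predict({**person, attribute: value}) for value in values}
        if len(outcomes) > 1:
            flagged.append(person)
    return flagged


def toy_model(person):
    # Hypothetical model with a planted bias: men are warned at a lower
    # symptom count than women.
    threshold = 3 if person["sex"] == "male" else 4
    return person["symptom_count"] >= threshold


people = [
    {"sex": "female", "symptom_count": 3},
    {"sex": "male", "symptom_count": 5},
    {"sex": "female", "symptom_count": 1},
]

flagged = flip_test(toy_model, people, "sex", ["female", "male"])
print(flagged)
```

A check like this only catches direct use of the attribute. It will not catch bias that reaches the model through correlated proxy features, which is part of why more sophisticated detection tools are being researched.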
Self-supervised learning (SSL) may offer another solution, this one in the form of predictive pretraining. SSL is an approach to machine learning that uses massive amounts of unlabeled data to train models, sidestepping the labels that are often rife with biases. It may still be too early to fully determine SSL’s debiasing potential, but Yann LeCun and Ishan Misra predict in their blog post that it may turn out to be “one of the most promising ways to… approximate a form of common sense in AI systems.”
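To make "training without labels" concrete, here is a minimal sketch of one common self-supervised setup, masked prediction, in which the training targets are taken from the raw data itself. The function name, the `[MASK]` placeholder, and the 15% default rate are illustrative assumptions and are not drawn from the blog post by LeCun and Misra.

```python
import random

MASK = "[MASK]"


def make_masked_examples(tokens, mask_rate=0.15, seed=0):
    """Build (masked_tokens, position, target) training triples from raw text.

    No human annotation is involved: the target is simply the token that
    was hidden.
    """
    rng = random.Random(seed)
    examples = []
    for i, token in enumerate(tokens):
        if rng.random() < mask_rate:
            masked = tokens[:i] + [MASK] + tokens[i + 1:]
            examples.append((masked, i, token))
    return examples


tokens = "chest pressure and fatigue can signal a heart attack".split()

# mask_rate=1.0 hides each token in turn, giving one example per token.
examples = make_masked_examples(tokens, mask_rate=1.0)
for masked, position, target in examples[:2]:
    print(" ".join(masked), "->", target)
```

This shows only where the training signal comes from. It says nothing about whether a model trained this way ends up less biased, which remains an open question.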
It is encouraging that the road to ethical AI is paved with multiple bias mitigation strategies and these emerging technological approaches. As we learn more about how effective those new AI tools will be, I am reminded that their development also depends on the innovations occurring in software and hardware infrastructure. Industry organizations that value that symbiosis – such as MLCommons with its mission “to accelerate machine learning innovation and increase its positive impact on society” – will move us closer to AI that is deserving of public trust.