Is Generative AI’s Hallucination Problem Fixable?
What does the future look like for generative AI? If hallucinations persist, it’s a complicated question.
Conversations about the transformative possibilities of generative AI tools invariably come with a cautionary note about hallucinations, i.e., the tendency of AI tools to present fabricated information as fact. AI experts have differing opinions about how long it will take for the industry to get hallucinations under control, or whether it's even possible to eliminate them.
In any case, AI hallucinations will continue to pose problems for the foreseeable future. That means businesses must learn to spot and work around these hallucinations if they want to leverage the potential benefits of generative AI tools in their daily operations. That work begins with understanding why AI hallucinations occur and what employees can do to limit their frequency and reach.
The Origins and Implications of Generative AI Hallucinations
In May 2023, an attorney representing an injured airline passenger submitted a collection of cases as part of a legal brief.
The problem? Six of the cases didn’t exist.
During the research process, the lawyer used ChatGPT, which hallucinated the existence of certain cases.
How does an error like this happen?
While AI experts don’t fully understand what causes AI hallucinations, massive, unfocused sets of training data, coupled with the complexity of the generative process itself, are prime suspects.
Essentially, large language models (LLMs) like ChatGPT are trained on large sets of sometimes outdated data. Because most LLMs can only draw conclusions from patterns present in their training data, unfamiliar situations can pressure the tools into making false or misleading claims. The more gaps or inconsistencies in the training data, the higher the likelihood of a hallucination.
When certain LLMs pull from sources outside of their initial training data, it creates a new set of challenges. For example, consider the amount of disinformation, parody, and bias that exists on the internet. Even humans struggle to determine what’s real and what’s fake (in part thanks to AI tools themselves). LLMs must navigate this labyrinth of information to produce an accurate output, without the familiarity with internet culture and jargon that humans possess.
This complexity directly contributes to hallucinations, placing a clear limitation on generative AI’s usability in certain industries — as our previous example with the attorney illustrates. Other high-stakes industries like healthcare are experimenting with how generative AI can be used for administrative tasks. However, more substantive generative AI use cases remain out of reach until the industry can get a handle on the hallucination problem.
How to Work Around AI Hallucinations
While generative AI hallucinations may prove difficult to eradicate entirely, businesses can learn to minimize their frequency. But it requires a concerted effort and industry-wide knowledge sharing.
Here are three tactics you can use right now to mitigate the impact of generative AI hallucinations in your work.
- Keep humans in the driver’s seat
Think of generative AI like GPS. You wouldn’t blindly follow GPS instructions if they told you to drive off a cliff; likewise, you should never take generative AI’s outputs as gospel or use them as the sole basis for decision-making.
Instead, treat generative AI as a supplemental tool and encourage your employees to double-check any information these tools produce. Emphasize that you’re not introducing these tools to replace employees, but to make their lives easier. Cultivating an environment in which employees share generative AI best practices with one another and remain aware of new developments provides a foundation for your organization to stay one step ahead of hallucinations.
- Prioritize prompt refinement
With generative AI, the more specific your prompts, the better.
For example, suppose you want to learn about the history of cloud computing. You log into ChatGPT and type, “Tell me about the history of cloud computing.” The model produces a massive wall of text that is overwhelming to sift through. This shouldn’t be a surprise. General or vague prompts often produce vague answers — and this ambiguity is where hallucinations can sneak in undetected.
A better approach is to narrow your prompts and include relevant details. To round out our cloud computing example, here are a few refined prompts that should reduce the likelihood of hallucinations:
- What were the three defining events that contributed to the rise of cloud computing? Keep your answer to two paragraphs or shorter.
- You are a historian recounting the origins of cloud computing technology. In your account, exclude any mentions of modern cloud computing.
- Tell me about the history of cloud computing — specifically about the role DARPA played in the technology’s evolution.
This level of specificity and instruction will help you receive much more detailed, focused responses and make it easier to spot potential hallucinations when they occur.
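If your team queries models through an API rather than a chat window, prompt refinement can be made repeatable with a small helper that assembles role, task, scope, and length constraints into one specific prompt. The sketch below is illustrative only: the `build_prompt` function and its parameters are hypothetical, not part of any particular SDK.

```python
from typing import Optional

def build_prompt(task: str,
                 role: Optional[str] = None,
                 scope: Optional[str] = None,
                 length_limit: Optional[str] = None) -> str:
    """Compose a specific, constrained prompt from reusable parts.

    Narrow, detailed prompts leave less room for a model to
    hallucinate than broad, open-ended ones.
    """
    parts = []
    if role:
        parts.append(f"You are {role}.")                      # persona framing
    parts.append(task)                                        # the core question
    if scope:
        parts.append(f"Limit your answer to {scope}.")        # topical scope
    if length_limit:
        parts.append(f"Keep your answer to {length_limit}.")  # length cap
    return " ".join(parts)

# Reproduces the first refined prompt from the list above.
prompt = build_prompt(
    task=("What were the three defining events that contributed "
          "to the rise of cloud computing?"),
    length_limit="two paragraphs or shorter",
)
```

Centralizing prompt construction like this also makes it easier to review and iterate on the constraints your team relies on, rather than leaving each employee to improvise their own phrasing.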
- Consider a purpose-built LLM
While tools like ChatGPT and Bard have obvious value, your business could achieve stronger results by building a purpose-built LLM of your own.
A purpose-built LLM provides a narrow focus, using a smaller set of training data to deliver tailored responses. The biggest advantage of this model is the ability to incorporate your organization’s own internal data into a controlled dataset and solve problems unique to your business and customers. This level of control limits the likelihood of hallucinations — as long as you keep the scope of your prompts narrow.
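One way to picture the "controlled dataset" idea is a system that only ever answers from your own vetted data and refuses to guess beyond it. The toy sketch below is purely illustrative (the FAQ entries and keyword-overlap scoring are invented for this example; a real purpose-built system would use far more sophisticated retrieval and generation), but it shows the key behavior: when nothing in the controlled data matches, the system declines rather than fabricates.

```python
# Toy example: answer only from a small, vetted internal dataset.
CONTROLLED_DATASET = {
    "What is our refund window?":
        "Refunds are accepted within 30 days of purchase.",
    "Which regions do we ship to?":
        "We currently ship to the US, Canada, and the EU.",
}

def answer(question: str, min_overlap: int = 2) -> str:
    """Return the best-matching vetted answer, or decline to answer."""
    q_words = set(question.lower().split())
    best_answer, best_score = None, 0
    for known_q, known_a in CONTROLLED_DATASET.items():
        # Naive relevance score: shared words between the questions.
        score = len(q_words & set(known_q.lower().split()))
        if score > best_score:
            best_answer, best_score = known_a, score
    # Refuse to answer when nothing in the vetted data matches well,
    # instead of fabricating a response.
    if best_score < min_overlap:
        return "I don't know based on the available data."
    return best_answer
```

The design choice worth noting is the refusal branch: constraining outputs to a controlled dataset trades coverage for reliability, which is exactly the bargain a purpose-built LLM makes.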
Ultimately, I think we’ll see these more personalized applications of generative AI grow in popularity among businesses. They may not attract the media attention of a ChatGPT, but these models have more practical applications for organizations, without the heightened risk of hallucinations that comes with larger tools.
The Unclear Future of Generative AI Hallucinations
There’s no way around it: Generative AI hallucinations will continue to be a problem, especially for the largest, most ambitious LLM projects.
Though we expect the hallucination problem to course-correct in the years ahead, your organization can’t wait idly for that day to arrive. To reap the benefits of generative AI now, you need to understand how to prevent these hallucinations and flag them when they occur, whether they pop up in ChatGPT or your own purpose-built LLM.