An Overview of How Prompt Tuning Adapts Models to Perform Downstream Tasks
Foundation models have been instrumental in transforming the landscape of enterprise AI, and their rapid rise is laying the groundwork for the next wave. Foundation models are large AI models that are pre-trained, in most cases through self-supervised learning, on enormous amounts of unlabeled data. Because of this, these models can be customized for tasks ranging from fraud detection in financial documents to analysis of legal contracts.
Fine-tuning was long considered the standard way to redeploy a pre-trained model for specialized tasks: you gathered and labeled examples of the target task and fine-tuned the model rather than training a new one from scratch. But now a simpler, viable, and more energy-efficient alternative has emerged – prompt tuning.
What is Prompt Tuning?
Let’s start from the basics – what is prompt tuning? Prompt tuning is the process of feeding front-end prompts to an AI model to cue it toward a specific task. These prompts can take different forms – extra words introduced by a human, or AI-generated numbers injected into the model’s embedding layer. Prompt tuning is used to guide a model toward a particular prediction. With prompt tuning, organizations with limited data can tailor a model to a specialized task without updating the model’s billions of weights.
Did you know that redeploying an AI model without any retraining can cut computing and energy costs by at least 1,000 times, potentially saving thousands of dollars?
IBM’s David Cox, head of Exploratory AI Research and co-director of the MIT-IBM Watson AI Lab, explained that with the help of prompt tuning you can build a powerful model suited to specific needs. It also lets you explore and experiment faster.
The journey of prompt tuning began with large language models and gradually expanded to other foundation models, such as transformers that handle sequential data types including audio and video. Prompts can take many forms: streams of speech, a still image or video, or snippets of text.
Prompts for Specialized Tasks
Before prompt tuning became a reality, there was another term – prompt engineering – which meant prompts designed by hand.
Consider a language model used for translation. You feed it details about the target task; you can say – “Translate English to French.” When you then prompt “cheese,” the model delivers its prediction – “fromage.” This manual prompt primes the model to retrieve related French words from its memory banks. If the task is difficult enough, dozens of prompts might be needed.
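As a rough illustration, a hand-crafted (“hard”) prompt of this kind is nothing more than text prepended to the model’s input – no weights change. The minimal sketch below assumes the Hugging Face transformers library and uses “gpt2” purely as a placeholder model name; a much larger model would be needed to reliably complete the translation.

```python
# A minimal sketch of prompt engineering: the task is specified entirely in
# text, and no model weights are updated. "gpt2" is only a placeholder.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")  # any causal LM works

# Hand-crafted ("hard") prompt: a task description plus a worked example.
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "cheese => "
)

output = generator(prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])  # a capable model should continue with "fromage"
```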
- Prompt engineering took off when OpenAI’s ambitious GPT (Generative Pretrained Transformer), a language model almost 10 times bigger than any of its predecessors, was released.
- OpenAI researchers revealed that GPT’s successor, GPT-3, at 175 billion parameters, could perform specialized tasks with only a few words introduced at inference time. With no retraining involved, GPT-3 performed about as well as a model fine-tuned on labeled data.
The Emergence of Prefix Tuning
Soon, hand-crafted prompts gave way to superior AI-designed prompts consisting of strings of numbers. A year later, Google researchers formally introduced these AI-designed “soft” prompts, which performed far better than the human-engineered “hard” prompts.
While prompt tuning was still under review, Stanford researchers announced prefix tuning, another automated prompt-design method, which lets the model learn tasks one after another. Prefix tuning combines soft prompts with prompts fed into the layers of the deep learning model for added flexibility. Though prompt tuning is considered the more efficient of the two, both techniques let you freeze the model and do away with expensive retraining.
AI-Designed Soft Prompts – What Works and What Doesn’t
AI-designed soft prompts are unrecognizable to the human eye. Each soft prompt is a string of numbers – an embedding – that distills knowledge from the larger model. Whether high-level or task-specific, the prompt acts as a substitute for additional training data. Researchers recently estimated that a good language-classifier prompt is worth hundreds to thousands of extra data points.
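To make the idea concrete, here is a minimal sketch of how a soft prompt can be trained while the underlying model stays frozen. It assumes PyTorch and the Hugging Face transformers library, with “gpt2” as a stand-in for a much larger foundation model; the number of prompt tokens and the learning rate are illustrative choices, not values from the research cited above.

```python
# A minimal sketch of prompt tuning: the pre-trained model is frozen and only
# a small block of "soft prompt" embeddings is learned. "gpt2" is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
for p in model.parameters():
    p.requires_grad = False  # the model's own weights are never updated

num_prompt_tokens = 20
embed_dim = model.get_input_embeddings().embedding_dim
soft_prompt = torch.nn.Parameter(torch.randn(num_prompt_tokens, embed_dim) * 0.01)
optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)  # only ~15K trainable values

def training_step(text: str) -> torch.Tensor:
    """One gradient step on the soft prompt for a single training example."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    token_embeds = model.get_input_embeddings()(ids)              # (1, T, D)
    inputs = torch.cat([soft_prompt.unsqueeze(0), token_embeds], dim=1)
    # Ignore the loss on the soft-prompt positions (-100 is the ignore index).
    labels = torch.cat([torch.full((1, num_prompt_tokens), -100), ids], dim=1)
    loss = model(inputs_embeds=inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss
```

Because only the prompt embeddings carry gradients, swapping in a different task means swapping in a different small tensor rather than retraining billions of weights.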
One disadvantage of prompt tuning is its lack of interpretability. Although the AI can discover prompts that are optimized for a task, it cannot explain why it chose those specific embeddings.
“You’re learning the prompts, but there’s very little visibility into how the model is helping,” said Panda. “It’s still a mystery.”
New Apps for Prompt-Tuning
From drug and materials discovery to car manuals, foundation models are exploring new avenues and finding new enterprise applications. Prompt tuning, for its part, is keeping pace and evolving with them.
Another feature of foundation models is their ability to multitask. Sometimes foundation models need to pivot quickly, such as identifying negative comments and answering customer questions. Instead of designing a unique prompt for each task, researchers are finding ways to create universal prompts that can be recycled across tasks.
- In an upcoming paper at the International Conference on Learning Representations (ICLR), Panda and his team presented their Multi-task Prompt Tuning (MPT) method, which not only outperformed other prompt-tuning methods but also did better than models fine-tuned on task-specific data.
- Panda explained that MPT saves money: instead of spending thousands of dollars to retrain a 2-billion-parameter model for a specific task, you can customize it for less than $100. A simplified sketch of the prompt-sharing idea follows this list.
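The sketch below illustrates the general idea of recycling one soft prompt across tasks by pairing it with a tiny, low-rank, task-specific adjustment. It is a simplified illustration written in PyTorch, not the exact MPT algorithm; the dimensions, rank, and multiplicative update are assumptions chosen for readability.

```python
# A simplified illustration of sharing one soft prompt across tasks, loosely
# inspired by multi-task prompt tuning. This is NOT the exact MPT algorithm.
import torch

num_tokens, dim, rank = 20, 768, 4

# One prompt learned across all source tasks...
shared_prompt = torch.nn.Parameter(torch.randn(num_tokens, dim) * 0.01)

class TaskPrompt(torch.nn.Module):
    """Cheap per-task adaptation: a low-rank multiplicative mask over the shared prompt."""
    def __init__(self):
        super().__init__()
        self.u = torch.nn.Parameter(torch.randn(num_tokens, rank) * 0.01)
        self.v = torch.nn.Parameter(torch.randn(rank, dim) * 0.01)

    def forward(self) -> torch.Tensor:
        # Element-wise modulation keeps the number of new parameters per task tiny.
        return shared_prompt * (1 + self.u @ self.v)

sentiment_prompt = TaskPrompt()   # e.g. "identify negative comments"
qa_prompt = TaskPrompt()          # e.g. "answer customer questions"
# Each task trains only its small TaskPrompt (and optionally the shared prompt);
# the foundation model itself stays frozen, as in the single-task sketch above.
```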
Another work in progress is the ability to find prompts on the fly as an AI model continually learns new tasks and concepts. When dealing with new knowledge, it is natural to assume the model would require an update on the new data. In some cases, though, catastrophic forgetting takes effect – the term refers to old knowledge being overwritten as new knowledge is learned.
Methods to Fight Biases
Prompt tuning also shines as a tool to mitigate algorithmic bias. Because most AI models are trained on real-world data, they absorb society’s biases, which can lead to decisions that perpetuate inequities.
- Recently, IBM researchers presented a pair of papers at the 2022 NeurIPS conference to combat race and gender bias with the help of AI-designed prompts in large language and vision models.
- One of the methods, FairIJ, identifies the most biased data points in the model’s training set and lets the model set them aside via prompts. In a salary-prediction task, a model tuned with FairIJ delivered more accurate, less biased results than several top bias-mitigation methods.
- In another method, FairReprogram, the researchers give the equivalent of gender-sensitivity training, via prompts, to an AI trained on beauty magazines.
- In another example, a classifier had incorrectly learned to associate only women with blonde hair with the label “woman.” To correct this, IBM researchers added an AI-designed border of black pixels to a photo of a woman with brown hair. They found that the pixels expanded the model’s notion of women to include those with brown hair. A minimal sketch of this trainable-border idea follows this list.
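The sketch below illustrates the general technique behind that trainable pixel border: the classifier stays frozen, and only a thin frame of pixels around each image is optimized. It is written in PyTorch as an illustration of the idea, not IBM’s FairReprogram implementation; the image size, border width, and assumption that pixel values lie in [0, 1] are placeholders.

```python
# A minimal sketch of a learnable pixel-border "visual prompt" for a frozen
# image classifier. Illustrative only; not the FairReprogram implementation.
import torch

class BorderPrompt(torch.nn.Module):
    def __init__(self, image_size: int = 224, border: int = 16):
        super().__init__()
        # The trainable perturbation covers the whole image...
        self.delta = torch.nn.Parameter(torch.zeros(3, image_size, image_size))
        # ...but a fixed mask restricts it to a thin frame around the edges.
        mask = torch.zeros(1, image_size, image_size)
        mask[:, :border, :] = 1.0    # top strip
        mask[:, -border:, :] = 1.0   # bottom strip
        mask[:, :, :border] = 1.0    # left strip
        mask[:, :, -border:] = 1.0   # right strip
        self.register_buffer("mask", mask)

    def forward(self, images: torch.Tensor) -> torch.Tensor:
        # Only the border pixels are perturbed; the image content is untouched.
        # Assumes pixel values are normalized to [0, 1].
        return (images + self.mask * self.delta).clamp(0.0, 1.0)

# Usage: the classifier's weights never change; only `prompt.delta` is optimized
# against a fairness-aware objective on the prompted images, e.g.
#   logits = frozen_classifier(prompt(images))
```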
Final thoughts
Prompt tuning is a complete package: it not only dramatically reduces the cost of tailoring large models to new applications but also makes it possible to reorient or correct a model’s behavior. With prompt tuning, organizations can adapt their models to specialized tasks faster and more sustainably.