Meta’s LLaMA Arrives, Cleverly Reminding the AI Community of Its Presence
This year can be summed up as the reign of large language models, courtesy of tech behemoths like Google, Microsoft, and OpenAI. Meta, meanwhile, has kept its focus largely on advancing science, and it has contributed to that cause consistently.
Taking yet another step toward strengthening the open science community and making it accessible to everyone, Meta publicly released LLaMA (Large Language Model Meta AI). The Facebook parent described it in its blog post as ‘a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI’.
How LLaMA Works and Why It Is Feasible to Train
Like any other large language model, LLaMA takes a sequence of words as input and predicts the next word, recursively generating text. As Meta explains, ‘To train our model, we chose text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets.’
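To make that next-word mechanism concrete, here is a minimal sketch of the autoregressive loop described above. It assumes the Hugging Face transformers library, and the checkpoint name used below is a placeholder: access to the actual LLaMA weights is gated behind Meta’s research application process.

```python
# A minimal sketch of autoregressive generation: feed the model a token
# sequence, pick the most likely next token, append it, and repeat.
# Assumes Hugging Face `transformers`; the checkpoint name is a placeholder,
# since LLaMA weights are only available through Meta's access process.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "huggyllama/llama-7b"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

prompt = "Large language models are"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(20):  # generate 20 new tokens, one per step
        logits = model(input_ids).logits           # (batch, seq_len, vocab_size)
        next_id = logits[:, -1, :].argmax(dim=-1)  # greedy pick of the next token
        input_ids = torch.cat([input_ids, next_id.unsqueeze(-1)], dim=-1)

print(tokenizer.decode(input_ids[0], skip_special_tokens=True))
```

In practice, sampling strategies such as temperature or nucleus sampling usually replace the greedy argmax, but the recursive structure is the same.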
The biggest advantage of models like LLaMA is that they are smaller yet strong performers, which empowers researchers and others in the community, especially those without access to large-scale infrastructure, to study these models. Meta’s earlier LLM-based chatbots, BlenderBot 3 and Galactica, were soon taken down because they were producing incorrect results.
Smaller foundation models like LLaMA are also feasible to train and experiment with because they require ‘far less computing power and resources to test new approaches, validate others’ work, and explore new use cases’.
Typically, foundation models train on a large set of unlabeled data, which makes it easier to fine-tune them for different tasks. Meta emphasizes that it is making LLaMA available in several sizes (7B, 13B, 33B, and 65B parameters). The company has also shared a LLaMA model card detailing how the model was built, in keeping with its Responsible AI practices.
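As an illustration of what that fine-tuning step looks like, here is a minimal sketch of adapting a pretrained causal language model to task-specific text. The checkpoint name and the two-example dataset are placeholders, and a real run over a 7B-parameter model would need accelerators, and typically parameter-efficient methods, rather than this full-parameter loop.

```python
# A minimal sketch of fine-tuning a pretrained causal LM on task text.
# Assumes Hugging Face `transformers`; the checkpoint name is a placeholder,
# and the two in-memory examples stand in for a real task dataset.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "huggyllama/llama-7b"  # placeholder checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

# Toy task data: in practice this would be thousands of examples.
examples = [
    "Review: great battery life. Sentiment: positive",
    "Review: screen cracked within a week. Sentiment: negative",
]
batch = tokenizer(examples, return_tensors="pt", padding=True)
labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100  # exclude padding from the loss

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
for step in range(3):  # a real run iterates over many batches and epochs
    loss = model(**batch, labels=labels).loss  # standard next-token LM loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```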
With LLaMA’s release, many are gravitating toward the most obvious question: is LLaMA similar to ChatGPT?
The answer is a clear no. Meta’s LLaMA is not even remotely close to OpenAI’s ChatGPT or Microsoft’s Bing. It does not converse with humans the way those chatbots do. It is merely a research tool, which, according to Meta, is aimed at “democratizing access in this important, fast-changing field.”
Meta released LLaMA under a non-commercial license focused on research use cases and has granted access to groups like industry labs and NGOs. The company strongly believes that the AI community, comprising academic researchers, policymakers, civil society, and industry, should come together to develop clear guidelines around responsible AI, and around large language models in particular.
The Shortcomings of LLMs
In the past few months, large language models, particularly natural language processing (NLP) systems with billions of parameters, have unveiled a whole new world of capabilities: generating creative text, solving mathematical theorems, predicting protein structures, answering reading comprehension questions, and more. Their contribution to simplifying the lives of billions of people and brands is remarkable.
But despite these innovations and advancements, full research access remains limited because of the restricted availability of the resources required to train and run such large models. This limitation has hampered researchers’ ability to understand ‘how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues, such as bias, toxicity, and the potential for generating misinformation.’
Final Thoughts
The public release of LLaMA is a notable advancement in the development of AI language models. Meta’s commitment to the open science community, and to letting researchers study the model under a non-commercial license, should help guard against its misuse.