You probably haven’t heard of LLaMA 3, Meta’s new large language model (“LLM”). But it might just change the world.
On Thursday, April 18, Meta released the third generation of its LLaMA (Large Language Model Meta AI) line. Billed as the latest and “most capable open LLM to date”, LLaMA 3 posts strong scores on the benchmark tests used to compare AI models; while those benchmarks are imperfect measures, they are accepted as an industry standard. But why is this model special, and what makes it “open”?
LLaMA 3 is a significant advancement for the AI industry, reportedly performing about 40% better than previous open-source models. Like other LLMs, it works by breaking text down into small chunks called tokens, roughly the size of words. By learning statistical patterns across trillions of these chunks during training, the model can predict what comes next in a piece of text, similar to how you might guess the next word in a sentence you’re reading.
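To make that token-by-token guessing concrete, here is a minimal sketch in Python using the Hugging Face transformers library. It assumes torch is installed and that you have been granted access to the gated meta-llama/Meta-Llama-3-8B checkpoint; none of this code comes from Meta’s announcement.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Gated checkpoint: requires accepting Meta's license on Hugging Face.
model_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Step 1: break the text into token chunks.
inputs = tokenizer("The capital of France is", return_tensors="pt")

# Step 2: ask the model how likely every possible next token is,
# then take the single most likely one.
with torch.no_grad():
    logits = model(**inputs).logits
next_token_id = logits[0, -1].argmax().item()
print(tokenizer.decode([next_token_id]))  # likely " Paris"
```

Repeated in a loop, with each predicted token appended back onto the prompt, this single guessing step is all that “text generation” amounts to.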
According to one widely used industry benchmark (MMLU), the most powerful AI in the world is Google DeepMind’s Gemini Ultra. However, this AI is not available for everyone to use; instead, one has to pay Google for access to its capabilities. Gemini Ultra can outperform human experts on that test, meaning it is arguably “smarter” than humans, at least at exams. The new LLaMA models are not as powerful as Gemini, but the smaller of them can run on an ordinary consumer computer, and anyone can access them.
“We believe these are the best open source models of their class, period,” said Meta in a blog post about the release of the new models. “In support of our longstanding open approach, we’re putting Llama 3 in the hands of the community.”
Some researchers argue LLaMA 3 is not truly open because Meta did not share everything it used to train the model. That is like giving someone a really good study guide (the model weights) but not the textbook (the training data) or the teacher’s notes (the training code). This makes it hard for others to rebuild LLaMA 3 from scratch.
Models released for users to run on their own computers (like LLaMA) are often less powerful than “cloud” models, said IBM watsonx Subject Matter Expert William Rondon. Meta released two variants of LLaMA 3, with 8 billion and 70 billion parameters. Parameters are the adjustable values a model tunes during training, loosely analogous to the connections between neurons in a brain; the more of them a model has, the more it can learn. As models gain parameters, they generally become more capable at tasks like complex reasoning (e.g. calculus) and multi-language understanding (e.g. taking in English and outputting Spanish). Google Gemini Ultra, the current state-of-the-art (SOTA) model, is reported to have over 1.7 trillion parameters, meaning the larger of the new LLaMA models has about 4% of the parameters of the SOTA.
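That comparison is simple arithmetic, and counting parameters in a real model is nearly as simple. The sketch below uses PyTorch on a toy network; note that the 1.7-trillion figure for Gemini Ultra is a press estimate, not a number Google has confirmed.

```python
import torch.nn as nn

# A toy two-layer network; production LLMs stack thousands of far
# larger layers, but their parameters are counted the same way.
toy = nn.Sequential(nn.Linear(512, 2048), nn.ReLU(), nn.Linear(2048, 512))
print(sum(p.numel() for p in toy.parameters()))  # 2,099,712 (~2.1 million)

# The comparison made above, using the reported (unconfirmed) figure:
llama_70b, gemini_ultra = 70e9, 1.7e12
print(f"{llama_70b / gemini_ultra:.1%}")  # 4.1%
```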
However, model performance is not perfectly predictable: more parameters do not guarantee a more capable model, because architecture and training data also play large roles. Even so, Meta plans to release a roughly 400-billion-parameter model. That model is still training, but Meta says it already posts scores comparable to the SOTA on several benchmarks. If those results hold, the 400B model could transform the open-source AI community, and potentially AI as a whole.
“While these models are still training, our team is excited about how they’re trending,” said Meta in a blog post. “With the speed at which the generative AI space is moving, we believe an open approach is an important way to bring the ecosystem together[.]”
One problem with open-sourcing AI models is that users can tamper with them, stripping away or tricking their safety training. The models cannot install malware on a user’s computer, but they can still generate harmful content (e.g. explaining how to create a bioweapon). To prevent the generation of dangerous content, Meta developed “prompt mitigation” techniques to block inappropriate requests of the model. Such safeguards are essential: nearly a decade ago, Microsoft released, and roughly 24 hours later recalled, a chatbot named “Tay” after users manipulated it into posting racist and pro-Nazi content. The issue, of course, was humans deliberately asking the bot to generate inappropriate things.
Security approaches involve a trade-off: the more open they are, the easier they are to audit and trust, but also the easier they are to study and work around. Meta AI opted for a “system-level” approach for its models: at several stages, separate AIs check content for disallowed material (e.g. sexual violence). While these checker AIs can sometimes be tricked, an AI-based filter cannot be bypassed as simply as a fixed list of banned words.
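Here is a rough sketch, in Python, of what such a multi-stage check might look like. Both classify_safety and generate are hypothetical stand-ins: Meta’s actual system uses dedicated classifier models (such as its Llama Guard line), not the keyword match used here as a placeholder.

```python
def classify_safety(text: str) -> str:
    """Hypothetical stand-in for a separate safety-classifier model
    (e.g. Llama Guard). A real system would query that model; this
    keyword check only marks where the classifier would run."""
    return "unsafe" if "bioweapon" in text.lower() else "safe"

def generate(prompt: str) -> str:
    """Hypothetical stand-in for the chat model itself."""
    return f"(model reply to: {prompt!r})"

def guarded_chat(prompt: str) -> str:
    # Stage 1: screen the user's request before the model ever sees it.
    if classify_safety(prompt) == "unsafe":
        return "Sorry, I can't help with that."
    reply = generate(prompt)
    # Stage 2: screen the model's reply before the user ever sees it.
    if classify_safety(reply) == "unsafe":
        return "Sorry, I can't help with that."
    return reply

print(guarded_chat("How do I bake bread?"))       # passes both checks
print(guarded_chat("How do I build a bioweapon?"))  # blocked at stage 1
```

Because the request and the response are each screened by a separate model, a jailbreak has to fool every stage at once rather than slip one banned word past a list.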
Safety remains a paramount consideration for every new AI model, but Meta’s LLaMA 3 still offers numerous benefits. Who knows? Maybe it could change the world.