But what really stands out to me is the extent to which Meta is throwing its doors open. It will allow the wider AI community to download the model and tweak it. This could help make it safer and more efficient. And crucially, it could demonstrate the benefits of transparency over secrecy when it comes to the inner workings of AI models. This could not be more timely, or more important.
Tech companies are rushing to release their AI models into the wild, and we’re seeing generative AI embedded in more and more products. But the most powerful models out there, such as OpenAI’s GPT-4, are tightly guarded by their creators. Developers and researchers pay to get limited access to such models through a website and don’t know the details of their inner workings.
This opacity could lead to problems down the line, as is highlighted in a new, non-peer-reviewed paper that caused some buzz last week. Researchers at Stanford University and UC Berkeley found that GPT-3.5 and GPT-4 performed worse at solving math problems, answering sensitive questions, generating code, and doing visual reasoning than they had a couple of months earlier.
These models’ lack of transparency makes it hard to say exactly why that might be, but regardless, the results should be taken with a pinch of salt, Princeton computer science professor Arvind Narayanan writes in his assessment. They are more likely caused by “quirks of the authors’ evaluation” than evidence that OpenAI made the models worse. He thinks the researchers failed to take into account that OpenAI has fine-tuned the models to perform better, and that this fine-tuning has unintentionally caused some prompting techniques to stop working as they did in the past.
This has some serious implications. Companies that have built and optimized their products to work with a certain iteration of OpenAI’s models could “100%” see them suddenly glitch and break, says Sasha Luccioni, an AI researcher at startup Hugging Face. When OpenAI fine-tunes its models this way, products that have been built using very specific prompts, for example, might stop working in the way they did before. Closed models lack accountability, she adds. “If you have a product and you change something in the product, you’re supposed to tell your customers.”
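One common defense against this kind of silent drift is to pin requests to a dated model snapshot rather than a floating alias. The sketch below is illustrative, not a prescription: the model names follow OpenAI's public naming convention (an alias like "gpt-4" versus a dated checkpoint like "gpt-4-0613"), and the request-building function is a hypothetical helper, not part of any official SDK.

```python
# Illustrative sketch: pin an exact, dated model snapshot so that a
# provider's silent fine-tunes don't change behavior under your product.

# A floating alias resolves to whatever the provider currently serves,
# which can change without notice.
FLOATING_MODEL = "gpt-4"

# A dated snapshot names a fixed checkpoint, so carefully tuned prompts
# keep behaving the way they did when you shipped them.
PINNED_MODEL = "gpt-4-0613"


def build_request(prompt: str, model: str = PINNED_MODEL) -> dict:
    """Assemble a chat-style API payload (a hypothetical helper).

    Defaulting to a pinned snapshot makes the dependency on a specific
    model version explicit and auditable in one place.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0,  # reduce randomness so regression tests are meaningful
    }
```

Pinning doesn't solve the accountability problem Luccioni describes, since snapshots are eventually deprecated, but it at least turns a silent change into an explicit migration step.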
An open model like LLaMA 2 will at least make it clear how the company has designed the model and what training techniques it has used. Unlike OpenAI, Meta has shared the entire recipe for LLaMA 2, including details on how it was trained, which hardware was used, how the data was annotated, and which techniques were used to mitigate harm. People doing research and building products on top of the model know exactly what they are working with, says Luccioni.
“Once you have access to the model, you can do all sorts of experiments to make sure that you get better performance or you get less bias, or whatever it is you’re looking for,” she says.
Ultimately, the open vs. closed debate around AI boils down to who calls the shots. With open models, users have more power and control. With closed models, you’re at the mercy of their creator.
Having a big company like Meta release such an open, transparent AI model feels like a potential turning point in the generative AI gold rush.