LLaMA has rapidly risen to prominence in the open-source world in the past month.
Meta’s AI research team has built a solid reputation for releasing its models as open source, the most recent of which is LLaMA, with the model’s weights made available to academics and researchers on request. One of those parties, however, leaked the weights on GitHub, giving everyone free access to Meta’s first GPT-level LLM.
Since then, the developer community has had a field day with this model, optimising it to work on the lowest-powered devices, enhancing the model’s capabilities, and even leveraging it to generate some new use cases for LLMs. The largest multiplier for AI research is the open-source community, and developers are the driving force behind it.
Enhancing the model
When LLaMA was first released, budding LLM fans found that even the 7-billion-parameter version of the model required more than 16GB of VRAM. They soon discovered techniques to shrink the model’s memory footprint, though. The first optimisation step came from a community project, llama.cpp, which rewrote the model’s inference code in C++.
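A large part of the savings came from quantising the weights to lower precision (llama.cpp popularised 4-bit quantisation). Some back-of-the-envelope arithmetic shows why that matters; the function below is an illustrative sketch, not part of llama.cpp itself:

```python
def model_memory_gib(n_params: float, bits_per_weight: int) -> float:
    """Estimate the memory needed just to hold a model's weights.

    This is a rough lower bound: real inference also needs memory for
    activations and the KV cache, so actual usage is somewhat higher.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1024**3  # bytes -> GiB

# LLaMA-7B at two common precisions:
fp16 = model_memory_gib(7e9, 16)  # 16-bit floats, as originally released
q4 = model_memory_gib(7e9, 4)     # 4-bit quantised weights

print(f"fp16: {fp16:.1f} GiB, 4-bit: {q4:.1f} GiB")
```

At 16-bit precision the weights alone take roughly 13 GiB, consistent with the >16GB VRAM figure once activations are included; at 4 bits they fit in under 4 GiB, which is what put the model within reach of ordinary laptops.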
Specific use-cases
Programmers and developers began to recognise the applications of this LLM once the model was open-sourced and researchers started building on Alpaca, Stanford’s instruction-tuned version of LLaMA. Although it got off to a sluggish start with a developer using Alpaca to build a Homer Simpson bot, the concept quickly found a wide range of practical uses.
Any member of the community could fine-tune the model on their own text thanks to a straightforward WebUI made by user “LXE” on GitHub. Similarly, user “Sahil280114” developed CodeAlpaca, an Alpaca model refined for code generation. Thanks to LLaMA’s open-source nature, LlamaIndex, a project to connect LLMs with external data, also switched from GPT to LLaMA.
To lower the barrier to entry further, Dalai was introduced as a simple way to get both Alpaca and LLaMA running on any platform with a single command. Alpaca in turn served as the foundation for a new model, GPT4All, which extended LLaMA’s power even more by training on almost 800,000 GPT-3.5 generations. The use cases never stopped coming.
Colossal-AI developed a ChatGPT substitute by training LLaMA with reinforcement learning from human feedback (RLHF). The community then built LlamaHub to keep track of all the different ways people can interact with the model. The best part is that all of this happened within a month of the model’s publication, demonstrating the open-source community’s real strength.
Open source gets the job done
The community not only improved the model Meta published, but also produced a wide range of use cases, all from a single LLM. Even though LLMs are all the rage at the moment, they are just one model type in the vast field of AI. Stable Diffusion, like LLaMA, attracted users, a community, and a number of offshoots.
While the rise of Stable Diffusion in the open-source community deserves its own article, suffice it to say that the model is the preferred choice for developers when it comes to image generation. The Stable Diffusion GitHub page currently has over 7,600 forks, which speaks volumes about the influence this diffusion approach has had on the open-source world.
As models grow larger, the cost and difficulty of training them increases, concentrating the power of LLMs in major technology firms like OpenAI, Microsoft, Google, and Meta. Open-sourced models give communities more power to build businesses on top of these potent models, eventually laying the foundation for a free AI society.