Stanford’s Alpaca is a Very Different Animal

These days, it is not unusual for something fascinating to come out of the AI world every week. Last week, though, was something akin to an explosion: there were so many announcements and breakthroughs that it was hard to keep track of them all. If you haven't already seen them, here are some of the major ones:

  1. The much-awaited GPT-4 was released by OpenAI. In addition to improvements over GPT-3.5 in areas such as reasoning and conciseness, it was revealed to have multimodal capabilities — though these have not yet been made available to the public. Microsoft also confirmed that an early version of GPT-4, fine-tuned for search, is already being used by Bing.
  2. Google announced the incorporation of generative AI into Google Workspaces.
  3. Google also unveiled the PaLM API and MakerSuite, allowing developers to build applications with generative AI capabilities. PaLM is a multimodal model capable of creating images, video, and audio from prompts, as well as the GPT-style text generation we're all familiar with.
  4. Microsoft also revealed plans to integrate generative AI capabilities into the Microsoft 365 product suite. Soon, users will be able to prompt MS Office to perform tasks such as creating PowerPoint presentations, autogenerating email replies, and conducting data analysis.
  5. Google-backed Anthropic launched Claude, a generative AI chatbot assistant they claim produces more reliable output than its competitors.
  6. Midjourney, a popular text-to-image app, released version 5. If you’re into AI art (yes, it’s a thing), you were probably eagerly awaiting this. If not, check out what people are creating with it here — it’s pretty cool.

But the development that caught my attention the most was Stanford’s Alpaca model, which was released a day before GPT-4 and likely didn’t receive the attention it deserved due to this and all the other announcements.

What is Alpaca?

Alpaca is essentially an instruction-following language model that can run on a sufficiently powerful laptop and produce output almost as good as GPT-3.5!

Imagine being able to run a ChatGPT-like model on your laptop (you'd need a high-end one, but still) that can answer questions about your personal data. That is really exceptional!
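To give a feel for how an Alpaca-style model is queried once it is running locally, here is a minimal sketch of the prompt format published in the Stanford Alpaca repository. The wording of the templates follows their released code, but treat the exact phrasing as illustrative rather than authoritative:

```python
# Prompt templates in the style of the Stanford Alpaca repository.
# The exact wording below is based on their released code; treat it
# as illustrative.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str, input_text: str = "") -> str:
    """Format an instruction (and optional context) the way an
    Alpaca-style model expects to see it."""
    if input_text:
        return PROMPT_WITH_INPUT.format(instruction=instruction, input=input_text)
    return PROMPT_NO_INPUT.format(instruction=instruction)
```

The formatted string is what gets fed to the model; the model's completion after `### Response:` is the answer. For example, `build_prompt("Summarize this document.", document_text)` would produce a prompt with both an instruction and an input section.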

To really understand what Alpaca is and how it was created, we should first take a look at its foundation: Meta's LLaMA model.

Starting with a small but powerful foundation

Meta released its LLaMA models last month with the intent of helping researchers who don't have access to the large amounts of infrastructure required to train Large Language Models (LLMs) these days. LLaMA is a foundational model that comes in four sizes (7B, 13B, 33B, and 65B parameters) and can be customized for different purposes, such as predicting protein structures, solving math problems, or generating creative text. According to Meta, many of these models, especially the largest 65B-parameter one, outperform GPT-3.5, which is much bigger at 175B parameters.

Foundational models are a category of machine learning models that serve as the basis for building a wide range of applications across various domains. These models, often large-scale and powered by deep learning techniques, are trained on massive amounts of diverse data to develop a broad understanding of language, context, and knowledge.

In some ways, they resemble the “primordial soup,” which, in the context of the origin of life, refers to the mixture of organic compounds that gave rise to the first living organisms on Earth through chemical reactions and natural processes. Similar to primordial soup, they provide a versatile base from which various AI applications and solutions can emerge. These models are trained on vast amounts of diverse data, allowing them to acquire a wide range of knowledge and language understanding. Although they may not be very useful directly, they can be adapted or fine-tuned for specific tasks or domains, giving rise to numerous AI applications.

Figure 1 (Courtesy: On the Opportunities and Risks of Foundation Models)

The most prominent foundational models of them all, GPT-3 and its successors, are not open-sourced. They are also large and require a significant amount of compute infrastructure to run. This is where LLaMA shines: by creating foundational models of different sizes that others can fine-tune for various tasks, Meta has made it much easier for researchers to make rapid advancements in their fields.

Using GPT-3.5 as the Trainer

However, it is not the smaller size itself that is interesting about Alpaca; rather, it’s the way it was trained and how quickly it was done.

The key challenge is to make the foundational model (LLaMA) capable of following human instructions. For this, the researchers leveraged the Self-Instruct framework, which helps language models improve their ability to follow natural-language instructions. They started with 175 human-written instruction samples, which were fed into GPT-3.5 (text-davinci-003) to generate a larger set of 52K samples. This set was then used to fine-tune the foundational model through supervised learning. Comparing the outputs, the researchers found that, surprisingly, the two models perform very similarly.

This approach significantly reduces the manual effort required to create a useful, instruction-following model from a foundational model using a form of knowledge distillation.
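The data-generation step described above can be sketched as a simple bootstrap loop. This is a toy illustration, not the actual pipeline: `teacher_generate` is a hypothetical stand-in for a call to text-davinci-003, the seed tasks are invented examples rather than the real 175, and the real Self-Instruct pipeline adds careful prompting, deduplication, and quality filtering that are only hinted at here:

```python
import random

# Hypothetical seed tasks; the real pipeline started from 175 human-written ones.
SEED_TASKS = [
    {"instruction": "Translate 'hello' to French.", "output": "bonjour"},
    {"instruction": "Name a prime number below 10.", "output": "7"},
]

def teacher_generate(few_shot_examples):
    """Stand-in for GPT-3.5 (text-davinci-003): given a few tasks as
    in-context examples, return a new instruction/output pair. Here we
    just fabricate a variation of one example, for illustration only."""
    base = random.choice(few_shot_examples)
    return {
        "instruction": f"(variation) {base['instruction']}",
        "output": base["output"],
    }

def self_instruct(seed_tasks, target_size, examples_per_prompt=2):
    """Bootstrap a larger instruction dataset from a small seed set,
    loosely mirroring the Self-Instruct expansion step."""
    pool = list(seed_tasks)
    while len(pool) < target_size:
        few_shot = random.sample(pool, min(examples_per_prompt, len(pool)))
        candidate = teacher_generate(few_shot)
        # The real pipeline filters near-duplicates and low-quality outputs;
        # here we only keep instructions we have not seen before.
        if candidate["instruction"] not in {t["instruction"] for t in pool}:
            pool.append(candidate)
        else:
            break  # our toy teacher has run out of novel variations
    return pool

dataset = self_instruct(SEED_TASKS, target_size=6)
# `dataset` would then serve as the supervised fine-tuning set for LLaMA.
```

In the real pipeline, the loop above is where most of the manual labeling effort disappears: humans write only the seeds, and the teacher model does the rest.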

Figure 2 (Courtesy: Alpaca: A Strong, Replicable Instruction-Following Model)

Think of a foundational model (LLaMA) as a novice chef who has a vast knowledge of ingredients and basic cooking techniques but needs guidance to create specialized dishes. The instruction following model is like the chef learning to follow specific recipes, refining their skills to become a more accomplished cook.

In this analogy, LLaMA represents the novice chef with a solid foundation, while GPT-3.5 serves as the master chef who has honed their craft over time and can create exquisite dishes with precision. By tapping into GPT-3.5’s expertise, the developers of Alpaca were able to guide the novice LLaMA model, helping it learn more advanced techniques and methods.

As the novice chef (LLaMA) receives guidance from the master chef (GPT-3.5), they gradually become more skilled, refining their abilities and becoming adept at creating high-quality dishes. This is analogous to Alpaca, which, after being trained using GPT-3.5’s knowledge, can perform tasks with a level of proficiency similar to that of the larger model but with a fraction of the cost and resource requirements.

Smaller, Cheaper, and Almost as Good as GPT-3.5

The creation of text-davinci-003, essentially an instruction-following model built from GPT foundational models, took significantly more effort and infrastructure, and many months of training. In contrast, Alpaca was trained in a matter of days, by a handful of people, at significantly lower cost, and achieves almost the same performance.

It remains to be seen what kind of performance gains a model created from the largest of the LLaMA models using a similar approach can achieve.

Note that Alpaca is intended only for academic research, and any commercial use is prohibited. This is primarily because LLaMA has a non-commercial license, and because text-davinci-003's terms of use prohibit developing models that compete with OpenAI. But it is also because adequate safety measures are not yet in place.

What are the implications?

If you look closely, you can speculate that a sort of “acceleration of knowledge transfer” is going on. Prior to the advent of high-quality instruction-following models like GPT-3.5, it was not possible to generate large instruction training sets from very small seed sets in a short span of time. Also, arguably, a 52K training set might not be sufficient to produce a high-quality instruction-following model unless the underlying foundational model is already as capable as LLaMA.

Another way to think about this is a kind of “impedance matching” akin to electrical circuits, with GPT-3.5 helping researchers match the impedance of the foundational model to produce the desired outcome. As AI models become more advanced, they can facilitate faster knowledge transfer from humans to machines, enabling them to solve increasingly complex problems. This impedance matching process allows researchers to fine-tune the foundational models, optimizing their performance and ensuring that they align with specific tasks or domains.

Therefore, one can argue that as models become better, it is becoming significantly easier to create even better new models, or in other words, easier to transfer “knowledge and intelligence” from humans to machines and use them to solve complex problems.

However, the development and adoption of models like Alpaca present both opportunities and risks. They raise important questions about the future of AI, particularly in terms of knowledge sharing and competition. As smaller companies gain access to state-of-the-art capabilities through APIs and more efficient training techniques, larger organizations may find it increasingly difficult to maintain a competitive edge.

Of course, this also raises questions about the accelerating rate of progress in AI systems and the impact on society at large, a highly complex topic beyond the scope of this blog post.

How all this will play out remains to be seen.


References

  1. Alpaca: A Strong, Replicable Instruction-Following Model
  2. Introducing LLaMA: A foundational, 65-billion-parameter large language model
  3. On the Opportunities and Risks of Foundation Models
  4. How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources
  5. GPT-4 Technical Report

