<aside> <img src="/icons/help-alternate_gray.svg" alt="/icons/help-alternate_gray.svg" width="40px" /> About This is a rough draft of a guide for non-experts - leaders, policy makers, educators, or anyone else - who are trying to make decisions about the use of AI based on how Large Language Models are trained. I tried to be as descriptive as possible without too much technical language. Feel free to contact me with any questions or comments: [email protected].

</aside>

TL;DR

(Claude generated) AI model "training" involves five distinct layers:

  1. Pre-training: Creates the foundational model (months, expensive)
  2. Fine-tuning: Adapts the model for specific tasks (days, moderate cost)
  3. Preference optimization: Aligns model outputs with human values (weeks, moderate cost)
  4. Prompting: Guides model behavior at runtime (seconds, cheap)
  5. RAG (Retrieval-Augmented Generation): Incorporates external knowledge on-the-fly (minutes, low cost)

Only the first three actually change the model. The last two are runtime modifications accessible to most users.
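
To make that distinction concrete, here is a toy sketch in Python. None of these functions are real APIs - the names and data are invented purely to show which layers rewrite the model and which only change what gets fed into it:

```python
# Toy sketch only: every function and value here is invented to show
# which layers change the model's weights and which leave them frozen.

def pretrain(corpus):                 # 1. months, expensive
    return {"weights": "base", "source": corpus}

def finetune(weights, task_data):     # 2. days, moderate cost
    return {**weights, "task": task_data}

def align(weights, preferences):      # 3. weeks, moderate cost
    return {**weights, "aligned": preferences}

def generate(model, text):            # runtime: the model is read, never changed
    return f"answer from {model['weights']} model given: {text!r}"

model = pretrain("a huge slice of the internet")
model = finetune(model, "customer support tickets")    # model changed
model = align(model, "human preference ratings")       # model changed

# 4. Prompting and 5. RAG happen at runtime; `model` stays exactly as-is.
print(generate(model, "What is our refund policy?"))
print(generate(model, "What is our refund policy? [plus retrieved documents]"))
```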

Why it matters: Understanding these layers is crucial for making informed decisions about AI implementation, resource allocation, and ethical considerations. It helps clarify what's possible at different stages of AI development and use.

Who should pay attention: If you're involved in decision-making about AI, whether as a policymaker, business leader, developer, educator, or concerned citizen, this knowledge will help you navigate the AI landscape more effectively.

"Training a model", I don't think those words mean what you think they mean

Do you know what "model training" means when you hear people say things like "the AI companies are training models on the user data", "the model is trained not to respond to illegal queries" or "I trained the model on the company documents"? Don't worry, there's a good chance they don't either. Model training can mean a lot of different things in different contexts.

Meaning of "training" before Large Language Models

Before Large Language Models, people would say things like "we trained a model to predict what movies you will like based on what you watched" or "the model learns what the customer likes as they shop". Or "the model is trained to predict how likely you are to buy a house". This all sounds very fancy but in fact, it was no more sophisticated than a slightly more complicated line going through a bunch of data points on a chart in Excel. You have a function that plots that line; you feed it one number and it spits out another.
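
The movie example boils down to something like this - a minimal sketch with invented numbers, using numpy's standard line-fitting routine:

```python
import numpy as np

# A toy version of pre-LLM "training": fit a line through data points.
# The hours and ratings below are invented for illustration.
hours_watched = np.array([1, 2, 3, 4, 5])
movie_rating = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# "Training" here just means finding the slope and intercept of the best line.
slope, intercept = np.polyfit(hours_watched, movie_rating, deg=1)

# "Prediction": feed it one number, it spits out another.
print(slope * 6 + intercept)  # predicted rating after 6 hours watched
```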

As you can imagine, it's easy enough to put more data into an Excel spreadsheet and improve the function to fit a better line to it as we go. In those days of data science, all of those sentences sort of meant this one thing: fitting a better line to a bunch of data points so that when we had a new input on the X axis we'd get a better prediction of the corresponding point on the Y axis. The reason we needed data science and machine learning was that we needed more axes, and deciding what the units on those axes were was a big deal. Back then it was called feature engineering.
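
And "more axes" looked roughly like this - again a toy sketch with invented features and numbers. Each hand-picked column is one axis, and choosing those columns was the feature engineering:

```python
import numpy as np

# Each column is a hand-picked feature: income (k$), age, listing visits.
# Deciding on these columns - and their units - was "feature engineering".
X = np.array([
    [60, 34, 2],
    [85, 41, 7],
    [45, 29, 1],
    [95, 38, 9],
])
y = np.array([0.2, 0.7, 0.1, 0.9])  # invented "likelihood of buying a house"

# Still just line fitting, only now the "line" lives in more dimensions.
X1 = np.hstack([X, np.ones((len(X), 1))])  # extra column for the intercept
coeffs, *_ = np.linalg.lstsq(X1, y, rcond=None)

new_customer = np.array([70, 36, 4, 1])  # trailing 1 matches the intercept column
print(new_customer @ coeffs)             # predicted likelihood
```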

How LLMs changed the meaning of "training"

NONE of that is true of Large Language Models (for the most part). People still occasionally talk about functions and curves in this context - such as the loss function or the activation function - but these play a very different role than the functions of old that plotted lines through a bunch of data points.
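
For the curious, here is roughly what those two kinds of functions look like, in a deliberately simplified sketch (real implementations are more involved). Note that neither one plots a line through your data:

```python
import numpy as np

def relu(x):
    # An activation function: shapes signals flowing inside the network.
    return np.maximum(0, x)

def cross_entropy(predicted_probs, true_index):
    # A loss function: scores how wrong one prediction was during training.
    return -np.log(predicted_probs[true_index])

print(relu(np.array([-2.0, 0.5, 3.0])))             # [0.  0.5 3. ]
print(cross_entropy(np.array([0.1, 0.7, 0.2]), 1))  # ~0.36
```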

There is a very easy technical way to talk about training and learning in Large Language Models, but in effect none of it is like the learning and training of the old models. This is a bold statement because, in fact, there is a lot of continuity in the mathematics and logic behind these models - it is easy to point to papers going back to the 1950s that contain the same ideas. But for a non-expert, it is much easier to jettison the old ideas about machine learning and model training and start over.

So what does model training mean? There are two ways to answer this question.