Motivation

This outline emerged after listening to a podcast discussion between the economist Noah Smith and the podcast host Dwarkesh Patel in which neither could quite articulate what makes reasoning models perform better, despite the fact that both regularly engage with experts on AI. I think that's because they were leaning too far into the "reasoning" metaphor and not thinking hard enough about how the models actually work.

Metaphor: From reasoning to brainstorming

Limited utility of knowing how LLMs actually work

Under the hood, a language model has only one objective: add a new token given all the previous tokens in the context window. We perceive this as the generation of text, but the model doesn't. It is just adding one token after another. And it does this based on the geometric representations of the tokens in a massively multidimensional latent space. This latent space is a sort of distributional mirror of the training data, which in turn mirrors the semantic space of human collective cognition. As with all mirrors, there are many imperfections and paradoxical regresses but, for most purposes, it is more than sufficient to produce the sort of results we are used to seeing from Large Language Models.
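To make the shape of that loop concrete, here is a minimal Python sketch. The TOY_MODEL table and the next_token_distribution() and generate() functions are stand-ins invented purely for illustration; in a real LLM the distribution over the next token comes from a forward pass through billions of parameters conditioned on the whole context window, not a hand-written bigram table.

```python
import random

# Hypothetical toy stand-in for the trained network: last token -> next-token probabilities.
# A real model conditions on the entire context window via its latent-space geometry.
TOY_MODEL = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "<eos>": 0.3},
    "dog": {"sat": 0.7, "<eos>": 0.3},
    "sat": {"<eos>": 1.0},
}

def next_token_distribution(tokens):
    """Stand-in for the model's single objective: score every candidate next token."""
    return TOY_MODEL[tokens[-1]]

def generate(prompt_tokens, max_new_tokens=20):
    tokens = list(prompt_tokens)                     # the context window
    for _ in range(max_new_tokens):
        probs = next_token_distribution(tokens)      # one objective, applied repeatedly
        candidates, weights = zip(*probs.items())
        token = random.choices(candidates, weights=weights)[0]
        if token == "<eos>":
            break
        tokens.append(token)                         # "text generation" is just this accumulation
    return tokens

print(" ".join(generate(["<start>"])[1:]))           # e.g. "the cat sat"
```

Everything a chat model produces, including the reasoning tokens discussed below, comes out of this same loop.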

But it is almost impossible to conceptualise the journey from the actual workings of the model to the output we experience when we interact with Large Language Models. For the purposes of understanding reasoning, it is much more fruitful to imagine that the model is composing the text based on what it “sees” (the prompt and any documents uploaded or found with search) and what it “knows” (knowledge and linguistic skill acquired during training).

Imagine LLMs have human capabilities and limitations

To help us understand reasoning, let's imagine that LLMs actually have human-like knowledge and can construct answers based on that knowledge in response to a prompt in a similar way to humans. Once we start thinking about language models as more akin to humans and observe their behavior, we will discover that they also have many limitations that are very similar to ours.

One such limitation relevant to reasoning models is the fact that language models may not always be able to access the knowledge they have somewhere in their "model brain" when it is most needed. This happens to people all the time - we forget relevant information just when we need it most. When we're primed by some context, it's right there, but when we're not, we have to make an effort to activate it.

We make mistakes and often castigate ourselves: “I knew that, I can’t believe I didn’t use this knowledge.” This happens to Large Language Models as well. It is often given as an example of the models being 'stupid' and fundamentally unreliable, but it is just them being more like humans. We, too, can make mistakes despite having all the knowledge and skills at our disposal to avoid them.

Brainstorming and other methods as ways of overcoming limitations

Humans have developed many ways to activate the relevant knowledge when it is needed. There are even books describing these methods and sometimes whole industries that facilitate their use. We make checklists, draw mind maps, sit in front of mood boards, and brainstorm. What these activities do for us is make it more likely that we reach into the right part of our brain at the right time. We also sometimes prime our brains in undesirable ways (such as in the context of prejudice) and may want to make sure that we don’t just rely on the “conditioning” of our brains.

This sort of priming and conditioning is exactly what the reasoning model does. Riley Goodside once said, “the models are not talking, they are free-style rapping”. That is still true, but the reasoning process is there to prime the models to ‘free-style rap’ in a more desirable direction. They basically generate a combination of a mood board and a checklist for themselves to use when generating the final response.

Users experience the model outputting a long chain of reasoning tokens that may be hidden under a “Thinking” or “Reasoning” link in a chatbot, or simply enclosed in <reasoning></reasoning> tags.
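As a rough sketch of the mechanics, assuming a hypothetical generate(prompt) function that returns the model's raw text and the <reasoning> tag convention mentioned above (real chatbots use their own tags and APIs), the brainstorm and the answer come from one continuous generation, and the interface merely decides what to show:

```python
import re

def answer_with_reasoning(prompt, generate):
    # One continuous generation: the model first writes its own "brainstorm"
    # between <reasoning> tags, then the answer conditioned on that brainstorm,
    # which is still sitting in the context window.
    full_output = generate(prompt)

    # What the user sees by default: the brainstorm is stripped out (or folded
    # behind a "Thinking" link), but it has already done its priming work.
    visible = re.sub(r"<reasoning>.*?</reasoning>", "", full_output, flags=re.DOTALL).strip()
    brainstorm = re.findall(r"<reasoning>(.*?)</reasoning>", full_output, flags=re.DOTALL)
    return visible, brainstorm
```

The point of the sketch is that nothing privileged happens to the reasoning tokens; they are ordinary output that stays in the context window and primes what comes next.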

Many people complain that it’s not “real” reasoning. And they are right in the sense that it is not happening “inside” the model in the way we imagine human reasoning is happening “inside our brain”. But if we think of all those reasoning tokens as just a brainstorming session the model is having with itself, we can conceptualise their importance much more accurately and with less controversy.

We can imagine a student going through a mental checklist before they start answering an exam question, or a mathematician standing in front of a whiteboard, staring at a long list of equations to prime their brain for the essential insight that then just sort of happens. We may imagine a group bouncing ideas off each other in a brainstorming session to make sure they consider as many possibilities as they can before making a decision. Or we can even imagine the checklist a team of experts runs through before a complicated action (such as surgery or landing a plane) just to make sure they don’t skip any important steps.

There's nothing particularly special about the reasoning tokens, just like there’s nothing particularly special about all the words that make up our brainstorming sessions or the checklists we use to follow steps. They're just there to provide context for what we're after. But the underlying "cognitive" mechanism is exactly the same.

Some consequences of brainstorming

This also explains why the results improve even when some part of the reasoning chain is wrong or irrelevant. Just as in brainstorming sessions, there are no bad ideas. The purpose of brainstorming is not to come up with the right idea but to set up an environment in which the right idea is more likely to emerge.