I'm working on a Unified Theory of Cognition that would cover both language models and human minds. It explores aspects of cognition that we can see in a different light now that Large Language Models have appeared. This is not meant as a theory of how language models work on a technical level but as a theory of how humans work and how this new understanding can help us better grasp both human cognition and the strengths and limitations of language models.

The core insights can be summarised in 12 slogans which together cover the main outlines of the theory.

  1. Schemas and propositions: Humans process meaning with schemas but communicate it through propositions. As a result, much of the debate about cognition is actually about propositions but is conducted as if they were the schemas they instantiate.
  2. Heuristics and biases: Bias is the weight in our cognitive system that predisposes us to express a schema as one kind of proposition rather than another. A heuristic is the process of reweighting a bias towards a different kind of proposition. It is definitely NOT the removal of a bias, just its change.
  3. Tasks and orientation: For a variety of reasons we don't have good access to the level at which schemas and biases operate and we can only infer what schemas might look like from the properties of propositions. Unfortunately, we can only do that indirectly through the same process we are trying to investigate. When we are faced with a cognitive task, our entire system of cognition orients itself towards solving the task as it presents the task to us through schemas, not the task as expressed in propositions. For example, when we are tasked with giving examples of reasoning, the cognition required for that is oriented towards the task of giving examples, which is not the same as the task of self-reflection. We cannot therefore rely on our subjective experience of cognition without a significant amount of effort and triangulation. This does not mean we cannot employ introspection, only that it has limits.
  4. Symbols and heroes: One of the most seductive things our cognition does when it orients itself inward is to create propositions about symbols. Symbols then stand for a complex inventory of schemas and propositions but by their gravitational pull, they mislead us into thinking of them as the 'great men of semantics'. This can be very useful when we have processing units that work well with symbols (logical systems or computers which are logic processors) but it is entirely parasitic on the schemas that we then use to process the logical propositions. In a very real sense, symbols are not real.
  5. Icons and embodiment: In semiotics, symbols are contrasted with icons, which get their meaning not through an arbitrary association between form and reference but rather from some embodied similarity with the thing they are referring to. Examples include a photo of a person, a sound, or the order of things. There is an iconic component to most symbols. Even the logical operator 'and' can be used iconically: "we watched a movie and went home" means something different from "we went home and watched a movie". In fact, icons and symbols are both types of propositions that express schemas.
  6. Index and contents: Indices are the third member of the semiotic triad of symbol, icon and index, and they get their meaning through association. Smoke indicates fire, barking indicates dogs, etc. In a real sense, all symbols and icons are actually indices. They point to a richer world of schemas that have the potential to express a number of propositions. Just like an item in the index of a book, they reference a whole universe of schemas and potential propositions. Listing all the propositions that are possible to make out of the schemas pointed at by an index is impossible.
  7. Meanings and dictionaries: One of the ways in which the tasks and orientation problem shows up is by us falsely associating dictionaries with meanings of the words listed in them. Dictionaries are just another form of text (propositions) made with the words. They depend entirely on schemas that they cannot describe. In fact, it is impossible to use a dictionary if one does not already have command of most of the language.
  8. Wisdom and morphology: One thing that is easily overlooked in the focus on propositions is the fact that the words used to express them have forms. This means that whatever process is being used to generate complex semantic wholes happens alongside an even more 'computationally' complex process of transforming the forms of words. The complexity of this varies greatly among languages and seems to have no impact on the ability to express complex propositions. That should matter. We must be able to explain the human ability to process incredibly convoluted formal changes to words as much as the ability to express propositions about the nature of the universe. No theory of cognition is complete without this.
  9. Text and contexts: All of our thinking about reasoning starts with simple propositions that can be expressed as clauses in natural language and lemmas in logic. Almost the entire linguistic and psychological enterprise has been focused on discovering the rules that can be employed to (re)construct our full reasoning ability out of these propositions. But we have learned that we cannot construct a complete linguistic and cognitive facility out of combining simple propositions. This is both an empirical fact and a truth that follows from the nature of these phenomena. The direction we must follow is not to think of texts as simply bigger and more complicated clauses but rather to think of clauses as smaller but no less complicated texts. A single word uttered or written has the same complexity as a series of novels.
  10. Tools and reasons: So far, we have been committing the fallacy of thinking of reason (cognition and language) as a discrete entity encoded entirely in the neural substrate of an individual human. But this is not how most of the tasks humans orient themselves towards are completed. Reason or cognition as an inventory of schemas and associated biases that help convert them into propositions does not rely solely on this inventory. It uses tools - both internal and external. The internal tools are attention, working memory and long-term memory (in all its guises), and the external tools are all the human artefacts that supplement these internal tools. These include bodies, objects we manipulate, notebooks, mechanical devices, abstract formalisms and other humans and their groupings. Any theory of reason that tries to abstract away from the tools of reason will be fatally incomplete. It is easy to confuse the matter through seemingly mystical statements like 'memory is not an individual'. This is a useful metaphor but not helpful if taken literally. Memory as a tool of reason lives fully inside the individual (in whatever way it is encoded) but that individual memory is only one of the tools of reason, supplemented by artefacts that socialise individual memories.
  11. Reasons and judgements: We think of the three Kantian critiques in the wrong order. We start with pure reason and treat practical reason and judgement as an afterthought. But in fact, judgement in context is the foundation of reason whereas pure reason is only an abstraction used to orient ourselves towards a particular set of tasks that are fairly complex and difficult and therefore assume undue salience in our thinking about reason. Epistemology seems to be a study of reason but in fact it is as much a study of ethics and aesthetics. In fact, the ethical and aesthetic judgements pose a much more significant puzzle than logical ones. And the core dilemmas and paradoxes of logical reason stem from its confrontation with practical aesthetic and ethical judgements.
  12. Mutants and selections: The fundamental principle is that there is no selection without mutation. Survival of the fittest is in fact the continued reproduction of the mutants. So far, we've thought of cognition as a unitary property of individuals. But individuals vary widely across all of the characteristics mentioned above. Sometimes they do so widely and wildly. We must therefore also account for the fact that all the achievements of reason are the result of a convergence of mutants rather than an imperfect reproduction. We cannot think of reason purely through the ends reason is put in the service of, because those ends are achieved through a staggering variety of means and we must therefore also always catalog the mutants when we try to say anything about the convergence of their efforts.

What makes this a 'unified' theory of cognition that is designed to encompass both Large Language Models and humans? For each of these 12 slogans, we can tell a story of both AI and human successes and failures. This theory does not try to be a technical recipe for constructing a better synthetic intelligence or modelling human intelligence more accurately. Instead, it tries to outline a more complete and realistic range of phenomena that both synthetic and natural cognition must account for. This is very much a theory compatible with the bitter lesson and expands it beyond artificial intelligence - constructing a model of cognition based on how we think we think is doomed to failure.