<aside> <img src="/icons/help-alternate_gray.svg" alt="/icons/help-alternate_gray.svg" width="40px" />

This is a draft by Dominik Lukeš in response to the question of whether LLMs can “understand”. It is more about how to think about meaning (and semantics) than how Large Language Models work. There are no references but it is based on an approach to language practiced by Construction Grammar and Cognitive Semantics. Comments welcome via @techczech or [email protected].

</aside>

Summary in two slides

This guide started life as these two slides in my presentations on how AI works. They still summarise it fairly accurately, but there is a lot of nuance and background that this guide tries to develop.

Human semantics

SCR-20240502-ikvw.png

LLM Semantics

SCR-20240521-ucyu.png

Claude.ai generated TL;DR

<aside> <img src="/icons/robot_yellow.svg" alt="/icons/robot_yellow.svg" width="40px" />

</aside>

What do we mean by meaning: Three kinds of semantics

<aside> <img src="/icons/robot_gray.svg" alt="/icons/robot_gray.svg" width="40px" />

Notion AI generated TL;DR: Semantics (the study of meaning) can be viewed from three perspectives:

  1. Representational semantics: Straightforward meanings that can be pointed to or easily defined
  2. Relational semantics: Meanings derived from context and relationships between words
  3. Logical semantics: Concerned with truth conditions of statements rather than word meanings

Understanding these distinctions is crucial for grasping the complexities of language and meaning, especially in discussions about AI language models.

</aside>

Semantics is a fancy word for meaning and, confusingly, also the word for the study of meaning. And just like most of the famous terms associated with language, “meaning” does not have a clear, unambiguous definition. I will not try to give one here because it would be futile. But I’ll try to contrast three ways of looking at meaning and show that much of the debate about Large Language Models and semantics comes from confusing these three distinct perspectives.

Representational semantics

Representational semantics is how most people intuitively use the word “meaning”, as in “what does this word mean?” In this view, meaning is the sort of thing that shows up on the right-hand side of a dictionary entry. This kind of meaning is something you can point to, show a picture of, or give a straightforward description of. It’s the kind of thing where, when you say you see it, you are either right or wrong.

The first meanings of words we acquire as children are representational. We point at things and people and say the word: “mama”, “truck”, “apple”. We may even describe feelings (“sad”, “happy”, “hungry”) or actions (“run”, “play”, “cry”). Later we expand this to include words like “government” or “desolate”, or even abstractions like “love”. They are harder to point to but are still fairly representational.

One property of “representational” semantics is that it is easy to translate. We can say that “dog” is “chien” in French and “Hund” in German. We can even say this of words like “in”, which is “v” in Czech.

Another property of representational semantics is that we can usually tell a very straightforward story of how we learned the representational meanings of words.

Relational semantics

But representational semantics soon runs into problems. A simple glance at the right side of any dictionary entry reveals that pretty much no word has just one “meaning”. They all have multiple related or completely unrelated meanings. So “cat” could be a “domestic feline of genus something or other” or a “jazz musician”, but the latter only when used by some people. And all of a sudden, saying that “cat” is “chat” in French does not seem as straightforward.

But there are more difficulties. Many words do not have a straightforward representation: words like “the” or “in”, or even “get”. We cannot learn them by pointing at something or giving a single example. We can only learn them from context. And many of them we could not even begin to define. What is the meaning of “the”? Why do we use it in a sentence like “I saw a book with all the pages torn out” or “Ah, you are the Mr Smith!”?

And asking “how do you say ‘the’ in German?” is a lot less informative than asking how you say “dog”, as is asking how you say it in Russian, which does not even have a definite article. This can be very confusing to people. I once had a student ask me, “How do you say ‘have’ in Czech? As in ‘I have arrived’.” Here ‘have’ marks a tense rather than naming anything you could point to. But that’s our representational instincts leaking into relational semantics.

When we learn something by pointing at it, we’re learning a sort of name for a type of thing or action or property. But when we learn things from context, we learn a complex set of relationships. From these, subtle ways of looking at the world often emerge. And this is one of the reasons becoming proficient at another language is so damn hard.

Let’s take the humble duo “in” and “on”. They have a very clear distinction in meaning: “inside of” vs “on top of”. And they are also very easy to translate into Czech, as “v” and “na” respectively. But what about “a bird in a tree”, “a crowd in the streets”, or “a person working in the garden”? Czech uses “na”, that is “on”, for all of these. English treats trees, streets, and gardens as a sort of container; Czech treats them as just a plain surface.