<aside>
💡
Draft from 5 July 2025.
</aside>
Introduction
Much of the discussion of AI and AI literacy remains at the ChatGPT launch level. The frontier has advanced dramatically since then. These are my current talking points about what the models can actually do today and how to explore the real capabilities.
Part 1: Understanding AI Capabilities
1. How AI Works - Language Models
Some basic intuitions about what is happening when you use ChatGPT or similar tools.
Note: language-model-based AI is not the same as data-science AI
1.1 AI Capabilities = Inference + Orchestration
The capabilities of any AI system are a combination of:
- Inference: the 'raw intelligence' of the language model, which produces nothing but a stream of tokens
- Orchestration: what the surrounding software (such as ChatGPT) does with those tokens once it captures them
Example: one important model capability is recognising that it needs a tool and outputting a few tokens that say, in effect, 'please take the following code and run it'. The orchestration layer watches for those tokens and triggers a virtual machine that runs the code (for instance, a calculation). This is how Advanced Data Analysis works.
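To make the split concrete, here is a minimal sketch of the loop described in the example. Everything in it is an assumption made for illustration: `fake_model` stands in for real inference, the `<run_code>` marker is a made-up format (real products use their own), and `run_tool` is a tiny arithmetic evaluator standing in for a sandboxed virtual machine.

```python
import ast
import operator

# Hypothetical marker the model emits when it wants code run.
TOOL_OPEN = "<run_code>"
TOOL_CLOSE = "</run_code>"


def fake_model(context: str) -> str:
    """Stand-in for inference: text out, based only on the context passed in."""
    if "TOOL RESULT:" in context:
        result = context.rsplit("TOOL RESULT:", 1)[1].strip()
        return f"The answer is {result}."
    # No result yet, so the 'model' asks for a tool instead of guessing.
    return f"{TOOL_OPEN}1234 * 5678{TOOL_CLOSE}"


_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def run_tool(expr: str):
    """Stand-in for the sandbox: evaluates plain arithmetic only."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("unsupported expression")
    return ev(ast.parse(expr, mode="eval").body)


def orchestrate(user_message: str) -> str:
    """Orchestration: watch the token stream for the marker, run the tool, ask again."""
    context = f"USER: {user_message}"
    output = fake_model(context)
    if output.startswith(TOOL_OPEN) and output.endswith(TOOL_CLOSE):
        code = output[len(TOOL_OPEN):-len(TOOL_CLOSE)]
        context += f"\nTOOL RESULT: {run_tool(code)}"
        output = fake_model(context)  # second inference pass, now with the result
    return output


print(orchestrate("What is 1234 * 5678?"))  # The answer is 7006652.
```

Note that the 'model' never runs anything itself. It only emits text, and the calculation happens entirely in the orchestration code.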
1.2 Context Completion = How Models (raw intelligence) work
- the raw intelligence of the model generates tokens based only on what is presented to it plus its 'world knowledge / skill', and nothing else (no previous chats, no live updates). What is presented to it is called the 'context', and context management is a key problem to be aware of
- in practice, not everything that is in your own context (your mind) is necessarily presented to the language model when it generates the final response
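A rough sketch of why something you said earlier may never reach the model: the orchestration software assembles the context under a size limit. The function names, the word-count tokeniser, and the keep-the-most-recent policy below are all assumptions chosen for illustration, not how any particular product does it.

```python
def count_tokens(text: str) -> int:
    # Crude assumption: one token per whitespace-separated word.
    return len(text.split())


def build_context(system: str, history: list[str], new_message: str, budget: int) -> list[str]:
    """Always keep the system prompt and the new message, then add
    as many of the most recent turns as still fit in the budget."""
    used = count_tokens(system) + count_tokens(new_message)
    kept = []
    for turn in reversed(history):  # newest turn first
        cost = count_tokens(turn)
        if used + cost > budget:
            break  # older turns are silently dropped
        kept.append(turn)
        used += cost
    return [system, *reversed(kept), new_message]


history = ["my name is Ada", "I live in Prague", "I like tea"]
context = build_context("Be helpful", history, "What is my name?", budget=14)
print(context)
# ['Be helpful', 'I live in Prague', 'I like tea', 'What is my name?']
```

In this toy run the turn containing the name falls outside the budget, so the model would be asked "What is my name?" without ever seeing the answer.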
2. Frontier of Language Model Capabilities
What the models can do today:
- the basic capability is similar to what humans can do with text (understanding and generation) but not perfectly equivalent: the 'jagged frontier'
- models are best at languages, including computer languages
- multimodality: models can take in text, images, audio and video as context