What an AI model actually is

You type a question into ChatGPT. A second later, an answer arrives — fluent, on-topic, in the right register. Underneath that fluency is a single mechanism, and the rest of this hub builds on it.

The mechanism

An AI model of the kind behind ChatGPT, a large language model, is a pattern-matching machine over text.

It was shown vast amounts of writing: web pages, books, papers, code. From all of that, it absorbed how language behaves: which words tend to follow which, in what order, in what tone, after what setup. Not, for the most part, by memorising the text. By compressing the shape of how the text works into a set of internal numbers (its weights).

When you give it a prompt, it works through a single question: given everything I’ve seen, and given what’s in front of me right now, what word probably comes next?

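It picks one. It adds that word to the end of the text so far. It asks again: what word comes next now? It picks again. And again, until it predicts that the answer is finished. (Strictly, the unit is a token, a word or a piece of a word, but “word” is close enough here.)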

That’s the loop. That’s the whole machine.
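The loop can be sketched in a few lines of Python. This is a toy under stated assumptions, not how a production model is built: the hand-written NEXT_WORD_PROBS table stands in for the model's weights, it looks only at the previous word rather than the whole prompt, and every name in it is hypothetical.

```python
# Toy stand-in for a model's weights: for each word, the probability of
# each possible next word. A real model conditions on the whole prompt
# and works over tokens (word pieces); this table is purely illustrative.
NEXT_WORD_PROBS = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "dog": {"ran": 0.8, "sat": 0.2},
    "sat": {"down": 0.9, "<end>": 0.1},
    "ran": {"away": 0.9, "<end>": 0.1},
    "down": {"<end>": 1.0},
    "away": {"<end>": 1.0},
}


def pick_next_word(words):
    """Return the most likely next word given the text so far."""
    candidates = NEXT_WORD_PROBS[words[-1]]
    return max(candidates, key=candidates.get)


def generate(prompt, max_words=10):
    """Run the loop: pick a word, append it, ask again."""
    words = prompt.split()
    for _ in range(max_words):
        next_word = pick_next_word(words)
        if next_word == "<end>":
            break  # the toy model predicts the text is finished
        words.append(next_word)
    return " ".join(words)


print(generate("the"))  # the cat sat down
```

Everything interesting about a real model lives in how it computes those probabilities. The loop around them is about this simple.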

See it happen

Interactive demo: watch a model think, one token at a time.

Starting text: “The fastest way to learn anything is to”

In the demo you pick the next word yourself. Left alone, the model would just pick the most likely one.

The model writes one piece at a time, each piece informed by what it has written so far. The answer assembles itself in the open. The streaming you see in a ChatGPT response is the mechanism itself, exposed — every token landing as the model commits to it.

What the loop explains

Once you hold the loop in your head, a lot of model behaviour stops being mysterious.

Fluency is the native skill. A model can write a smooth paragraph on almost any subject because smoothness is what it spent training absorbing. The output reads like writing because the mechanism is, in the strictest sense, the production of writing.

The same loop produces confident wrong answers. The model is continuing patterns, not retrieving verified facts. When the pattern that “looks right” happens to be invented, the prose carries it just as smoothly as it carries a true sentence. The industry calls this hallucination, which is a louder word than the mechanism deserves — it’s the same next-word prediction, on a stretch where plausible and true happen to part ways. The next chapter is dedicated to this.

The same loop is why two runs of the same prompt can give two different answers. There is usually deliberate randomness in which word the model picks among the likely candidates. The answers tend to converge on roughly the same content, but the exact wording shifts each time, and sometimes the substance does too.
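One way to picture that randomness is weighted sampling with a "temperature" setting, a common technique that the text above does not name, so treat it as an assumption. The candidate words and their probabilities below are made up for illustration, and the function name is hypothetical.

```python
import random

# Hypothetical candidates for the next word, with made-up probabilities.
CANDIDATES = {"practise": 0.5, "teach": 0.3, "fail": 0.2}


def sample_next_word(probs, temperature=1.0, rng=random):
    """Pick a word at random, weighted by probability.

    Low temperature sharpens the distribution toward the top choice;
    temperature 0 is treated here as "always take the most likely word".
    """
    if temperature == 0:
        return max(probs, key=probs.get)
    words = list(probs)
    # Raising each probability to 1/temperature is equivalent to dividing
    # the log-probabilities by the temperature. random.choices accepts
    # unnormalised weights, so no explicit renormalisation is needed.
    weights = [probs[w] ** (1.0 / temperature) for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the demo is repeatable
print([sample_next_word(CANDIDATES, 1.0, rng) for _ in range(5)])
print([sample_next_word(CANDIDATES, 0, rng) for _ in range(5)])
```

The first line of output mixes the candidates. The second repeats the single most likely word, which is what a model with the randomness switched off would do.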

The same loop is why prompts matter. The model can only condition on what’s in front of it. A prompt with detail, role, and constraint reshapes the field of “most likely next words” toward useful ones. Three words of vague prompt reshape it toward average ones.

What it can hold in its head

A model only sees what is currently in the conversation. Your message, plus its previous replies, plus any documents you have pasted in. That is it.

This window of what it can see is called the context window. Different models have different sizes. Some can hold a few pages. Some can hold a small library. Every model has a limit.

Once a conversation gets long enough, the early parts may fall out of the window. The model literally cannot see them anymore. It is not forgetting. The text was never in front of it on this turn.
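A minimal sketch of how early text can fall out of the window. It assumes a product that drops the oldest messages first and measures size in words rather than tokens. Real products may truncate, summarise, or refuse in other ways, and the function name and sample conversation are hypothetical.

```python
def fit_to_window(messages, max_words):
    """Keep the most recent messages that fit within max_words.

    Assumption: size is measured in whitespace-separated words and the
    oldest messages are dropped first. Real systems count tokens and
    may use other strategies.
    """
    kept = []
    used = 0
    for message in reversed(messages):  # walk from newest to oldest
        size = len(message.split())
        if used + size > max_words:
            break  # this message and everything older is out of view
        kept.append(message)
        used += size
    return list(reversed(kept))  # restore chronological order


conversation = [
    "My name is Priya and I run a bakery.",  # 9 words
    "Draft a short welcome email.",          # 5 words
    "Make it friendlier.",                   # 3 words
]
print(fit_to_window(conversation, max_words=8))
# ['Draft a short welcome email.', 'Make it friendlier.']
```

With a window of eight words, the first message never reaches the model on this turn, so nothing in its reply can depend on the name or the bakery.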

Same reason a new chat starts blank. The model has no memory of yesterday’s conversation. Each chat is its own context. Whatever the model “knows” about you, you have to put in front of it.

The shape of what it can do

The mental model — a very good predictor of what to say next, given what’s in front of it — sorts AI claims into clean buckets.

Writing a first-draft email, summarising a long document, restating a clause in plainer language, translating a paragraph, generating five variations of a subject line: pattern-matching on text the model has seen a million versions of. Native territory.

Booking a flight, sending an invoice, updating a CRM record: action in the world, not produced words. Outside the loop. A model can be wired up to trigger such actions through tools (covered later), but the model itself only writes. Wire several of those triggers into a feedback loop and you get an agent.
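The split between writing and acting can be sketched as follows. Everything here is a hypothetical stand-in: fake_model returns a fixed piece of text where a real model call would go, the JSON request format is invented for the example, and send_invoice only returns a string instead of touching a billing system.

```python
import json


def fake_model(prompt):
    """Stand-in for a model: it only ever returns text."""
    return json.dumps(
        {"tool": "send_invoice", "args": {"customer": "Acme", "amount": 120}}
    )


def send_invoice(customer, amount):
    """Hypothetical action; a real one would call a billing system."""
    return f"Invoice for {amount} sent to {customer}"


# The actions the surrounding product is willing to perform.
TOOLS = {"send_invoice": send_invoice}


def run_turn(prompt):
    reply = fake_model(prompt)      # the model writes words
    request = json.loads(reply)     # ordinary code reads them
    tool = TOOLS[request["tool"]]   # ordinary code looks up the action
    return tool(**request["args"])  # and ordinary code performs it


print(run_turn("Bill Acme 120 for the workshop."))
# Invoice for 120 sent to Acme
```

The model's contribution ends at the text it returns. Whether anything happens in the world is decided by the code around it.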

Holding a six-month client relationship, knowing the team’s last conversation with this customer, remembering what was decided in last quarter’s review: persistent memory. The model has none of that between conversations. Memory, when it exists, is something a product builds around the model — not a property of the model itself.
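A sketch of memory as something built around the model, under stated assumptions: fake_model is a stand-in for a real model call, and ChatWithMemory is a hypothetical product layer, not any particular product's design.

```python
def fake_model(prompt):
    """Stand-in for a model call: it can only use what is in the prompt."""
    if "Priya" in prompt:
        return "Hello again, Priya."
    return "Hello. I don't know who you are."


class ChatWithMemory:
    """Hypothetical product layer that stores notes between chats."""

    def __init__(self):
        self.notes = []  # lives outside the model

    def remember(self, note):
        self.notes.append(note)

    def ask(self, message):
        # The "memory" is just saved text placed in front of the model.
        prompt = "\n".join(self.notes + [message])
        return fake_model(prompt)


print(fake_model("Hi"))  # Hello. I don't know who you are.

chat = ChatWithMemory()
chat.remember("The user's name is Priya.")
print(chat.ask("Hi"))    # Hello again, Priya.
```

The model function is identical in both calls. The only thing that changed is the text the product chose to put in front of it.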

The loop is not everything, and it is not nothing. A great many of the hours a company spends on writing (drafts, summaries, restatements, first passes) sit inside what this mechanism can do. But fluent prediction is not, on its own, understanding, judgement, or retrieval of verified facts. Everything else in this hub is, in one way or another, about telling which of those a task needs and whether the model is really supplying it.