What happens when you hit enter

You type a message. You press enter. A second or two later, an answer arrives.

The earlier chapters in this module each explained one part of what happens in that gap — the prediction loop, the context window, why the prompt matters so much. This chapter puts them in order. One message, traced from the keypress to the finished answer.

The single most useful thing this trace reveals: the model is only one part of a longer pipeline. Most of what happens, and most of what separates a good AI product from a poor one, happens in the stages around it.

What the trace shows

Seven stages, traced in order.

Stage 1 — you hit enter. Until this moment your message is just text in a box. Pressing enter hands it to the product — ChatGPT, Claude.ai, an internal tool a team built, whatever surface is in front of you.

Stage 2 — the product assembles the prompt. This is the stage almost nobody pictures, and it is the most important one. The model does not receive your message. It receives a bundle the product builds: a hidden system prompt (standing instructions the product author wrote — tone, rules, role), the conversation history so far (every earlier message of yours and every earlier reply), any documents you uploaded or that the product retrieved on your behalf, and then, last, your new message. All of it stacked into one block of text.
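The stacking described in stage 2 can be sketched in a few lines. This is a minimal illustration, not how any particular product does it: the `Turn` class, the `assemble_prompt` function, and the bracketed labels are all hypothetical, and real products usually send a structured list of messages rather than one literal string.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


def assemble_prompt(system_prompt, history, documents, new_message):
    """Stack the pieces in the order the chapter describes."""
    parts = [f"[system]\n{system_prompt}"]          # hidden standing instructions
    for turn in history:                            # every earlier message and reply
        parts.append(f"[{turn.role}]\n{turn.text}")
    for doc in documents:                           # uploaded or retrieved documents
        parts.append(f"[document]\n{doc}")
    parts.append(f"[user]\n{new_message}")          # the new message goes last
    return "\n\n".join(parts)


bundle = assemble_prompt(
    system_prompt="You are a concise assistant.",
    history=[Turn("user", "Hi"), Turn("assistant", "Hello! How can I help?")],
    documents=["Q3 report: revenue rose 4%."],
    new_message="Summarise the report.",
)
print(bundle)
```

The point of the sketch is the proportions: the message you typed is the last few characters of a much larger block that you never see.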

Stage 3 — the bundle travels. That block leaves the device and goes over the internet to the model provider’s servers. Where exactly, and what is retained, is its own subject — where your data goes covers it.

Stage 4 — the model reads it as tokens. On the server, the bundle is split into tokens — small chunks of text. The whole bundle has to fit inside the context window; if it doesn’t, something was dropped or compressed back at stage 2.
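To make the "has to fit" constraint concrete, here is a rough sketch of a product dropping the oldest turns until the bundle fits. Everything in it is an assumption for illustration: `toy_tokenize` splits on whitespace where a real tokenizer uses subword pieces, and the 12-token window is far smaller than any real one.

```python
def toy_tokenize(text):
    # Assumption: whitespace splitting stands in for a real subword tokenizer.
    return text.split()


def trim_history(system_prompt, history, new_message, window):
    """Drop the oldest turns until everything fits in `window` tokens."""
    fixed = len(toy_tokenize(system_prompt)) + len(toy_tokenize(new_message))
    kept = list(history)
    while kept and fixed + sum(len(toy_tokenize(t)) for t in kept) > window:
        kept.pop(0)  # the oldest turn falls out first
    return kept


history = ["my name is Ana", "nice to meet you Ana", "what is a token"]
kept = trim_history("be brief", history, "what is my name", window=12)
print(kept)  # ['what is a token']
```

In this toy run the turn that contained the name is the one that gets dropped, so the model would be asked "what is my name" without ever seeing the answer.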

Stage 5 — the prediction loop runs. Now the model does its one job: given everything in front of it, predict the next token, append it, predict again. This is the loop from the first chapter of this module. It is the only stage where the model is “thinking,” and it is doing the same thing it always does — continuing a pattern, one token at a time.
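The shape of that loop fits in a short sketch. The lookup table below is a hypothetical stand-in for a trained model: a real model scores every possible next token using the whole context, while this toy looks only at the last token. The loop around it (predict, append, predict again) is the part that carries over.

```python
# Hypothetical lookup table standing in for a trained model.
NEXT_TOKEN = {
    "the": "cat",
    "cat": "sat",
    "sat": "down",
    "down": "<end>",
}


def predict_next(tokens):
    # Toy rule: look only at the last token. A real model uses all of them.
    return NEXT_TOKEN.get(tokens[-1], "<end>")


def generate(prompt_tokens, max_new_tokens=10):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        nxt = predict_next(tokens)
        if nxt == "<end>":
            break
        tokens.append(nxt)  # append, then predict again
    return tokens


print(generate(["the"]))  # ['the', 'cat', 'sat', 'down']
```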

Stage 6 — tokens stream back. Each token, as the model commits to it, is sent back across the network and rendered on your screen. This is why an answer appears word by word rather than all at once. The streaming is not a stylistic flourish; it is stage 5 made visible. If the answer needs a tool (a calculator, a search, a database lookup), the stream pauses: the product runs the tool, feeds the result back in, and the loop resumes.
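Streaming and the tool pause can be sketched with a Python generator. The names here (`model_tokens`, `run_tool`, the `<call:add 2 3>` marker, the `TOOL_RESULT=` convention) are invented for the illustration and do not correspond to any real provider's format; the sketch only shows the control flow of yielding tokens, stopping for a tool, and resuming.

```python
def model_tokens(context):
    # Hypothetical stand-in for the model: it asks for a tool once,
    # then answers using the tool result found in the context.
    if "TOOL_RESULT" not in context:
        yield from ["Let", "me", "check.", "<call:add 2 3>"]
    else:
        yield from ["The", "sum", "is", context.split("TOOL_RESULT=")[1] + "."]


def run_tool(call):
    # Hypothetical tool: parse "<call:add 2 3>" and add the two numbers.
    _, a, b = call.strip("<>").split()
    return str(int(a) + int(b))


def stream_reply(context):
    """Yield tokens for the screen; pause to run a tool when one is requested."""
    for token in model_tokens(context):
        if token.startswith("<call:"):
            result = run_tool(token)
            # Feed the result back in and let the loop resume.
            yield from stream_reply(context + " TOOL_RESULT=" + result)
            return
        yield token


shown = list(stream_reply("What is 2 + 3?"))
print(" ".join(shown))  # Let me check. The sum is 5.
```

The tool-call marker never reaches the screen. The user sees a brief pause in the stream, then the rest of the answer.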

Stage 7 — the turn is saved. The exchange — your message and the model’s reply — is added to the conversation history the product holds. Which means the next time you hit enter, stage 2 includes this turn. The model itself kept nothing. The thread continues because the product re-assembles and re-sends it, every single turn.
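A small sketch makes the division of labour visible. Both `stateless_model` and `ChatProduct` are hypothetical stand-ins: the "model" is a function that can only use what is in the bundle it receives, and the product is the only thing that holds the history.

```python
def stateless_model(bundle):
    # Hypothetical stand-in: the "model" sees only the bundle it is handed.
    if not bundle.endswith("What is my name?"):
        return "Nice to meet you."
    if "my name is Ana" in bundle:
        return "Your name is Ana."
    return "I don't know your name."


class ChatProduct:
    def __init__(self):
        self.history = []  # held by the product, not the model

    def send(self, message):
        bundle = "\n".join(self.history + [message])  # stage 2, every turn
        reply = stateless_model(bundle)               # stages 4 and 5
        self.history += [message, reply]              # stage 7
        return reply


chat = ChatProduct()
chat.send("Hi, my name is Ana")
print(chat.send("What is my name?"))        # Your name is Ana.
print(stateless_model("What is my name?"))  # I don't know your name.
```

The same question gets two different answers. The only difference is whether the product re-sent the earlier turn.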

Where the model actually is

Count the stages again. Seven. The model is stages 4 and 5. Everything else — assembling the bundle, moving it, rendering the answer, holding the conversation — is the product and the network.

This is not a technicality. It reframes a lot of everyday questions:

“Which AI is best?” is partly a question about the model (stages 4–5) and partly a question about the product around it (stages 1, 2, 6, 7). Two products on the same model can feel very different, because stages 2 and 6 differ.
“The AI forgot what we discussed.” The model did not forget — it has no memory between turns. Stage 2 either didn’t include the earlier part, or it fell out of the window at stage 4. Memory is a product behaviour, assembled fresh each turn.
“Why is it slow?” Slowness can be the model (stage 5, a longer answer takes more loops), the network (stages 3 and 6), or the product doing retrieval and tool calls (stage 2 and the stage-6 pause). Different stages, different fixes.

Why this matters for the business

When a team evaluates, builds, or buys an AI tool, the instinct is to focus on the model. The trace says: the model is necessary, but most of the product — and most of what will make the tool good or frustrating in daily use — lives in the other five stages.

How well does stage 2 assemble context: does it bring in the right documents, keep the relevant history, drop the noise? How does stage 6 handle tools: can it actually look things up and act, or only talk? How does stage 7 manage state across a long working session? A team that understands the seven stages can ask sharper questions about an AI tool than one comparing model names, because it knows the model is two stages of seven, and the other five are where most products quietly win or lose.