Skip to content

Why prompts matter

Across two people using the same model on the same kind of task, the gap in output quality is almost always traceable to the prompt, not the model. A model is a function: text in, text out, and the text in shapes the text out. Everything that has been written under the banner of “prompt engineering” reduces to a single mechanism — set the model up well, and the most likely next words become useful ones.

A vague prompt produces a vague answer because the most statistically likely continuation of vague input is vague output. That isn’t laziness on the model’s part; it’s the loop doing exactly what it always does. The training data is full of vague prompts followed by generic, slightly stiff responses, because that’s the average shape of writing on the internet. The model has absorbed that distribution.

Type “write me an email” and the field of likely next words skews generic — because the field of “what came after ‘write me an email’ in the training data” is generic.

Specificity reshapes that field. Detail, context, constraint, and example narrow the distribution of likely next words toward the ones you actually wanted.

There’s no formula. But there are predictable ingredients.

Context. Who are you, what is the situation, what is the goal? “I run customer support at a 200-person SaaS company. We just shipped a billing change that’s confusing customers.”

Task. What exactly do you want? “Draft a one-paragraph apology email to a customer who got double-charged.”

Constraints. What must the answer respect? “Casual but professional. Don’t promise a refund — that’s handled separately. Mention that we’re already crediting the second charge.”

Examples. What does good look like? “Here’s how we usually write to customers: [paste an actual past email].”

Format. What shape should the answer take? “Three sentences max.”

You don’t need all five every time. You need enough to make the most likely next words be the ones you want.

Compare:

Write me an email.

versus:

I’m head of customer support at a 200-person SaaS company. A customer was double-charged due to our recent billing change. Write a one-paragraph apology email — casual but professional, don’t promise a refund (handled separately), mention we’re crediting the second charge. Three sentences max. Match the tone of this past email: [example].

Same model. Different output. Not because the model “tried harder” the second time, but because the most likely next words completely shifted.

Build up the prompt · watch the output shift
The prompt
Write me an email.
The model's output
Dear Customer, Thank you for reaching out. We apologize for any inconvenience this may have caused. Please let us know if you need any further assistance. Best regards, The Team
Toggle ingredients. Each one shifts the most likely next words.

The people who get the most out of these models tend to converge on the same handful of habits.

They write the prompt the way they would brief a sharp junior teammate on a piece of work: enough context to make the task make sense, the constraint they actually care about, an example of what good looks like.

They paste examples freely. Showing the model a past piece of writing — “match this tone” — almost always outperforms describing the tone in adjectives. The model is a pattern matcher, and an example is the cleanest pattern available.

They iterate. The first answer is rarely the final one. They tell the model what’s off, paste the previous answer back if needed, and run it again — the same loop they’d use giving feedback to a colleague.

They keep useful prompts. A prompt that worked well last week is reusable next week, often verbatim. Across a team, that builds into a small library of prompts that consistently produce good output — closer to a piece of standard operating procedure than to a one-off question.

They use the model to improve its own prompts. “Rewrite this prompt to be more specific and add the constraints I forgot” works embarrassingly well, and is one of the most common moves of fluent prompters.

Prompt quality is not just a personal skill. It’s a team capability.

The companies getting the most from AI right now are not the ones using the fanciest model. They are the ones whose teams have learned to brief AI well — context, constraint, example, iteration. The model is one third of the system. The prompt is another third. The verification habit is the last third.

This is also why a custom internal tool often outperforms generic ChatGPT for a specific task. The custom tool has the prompt pre-baked — context, format, constraints already loaded — so the user just provides the input. Less effort per use. Higher quality per use. That is prompt engineering invisible to the user, baked in by whoever built the tool.