Where AI fits
The previous chapter, scanning the operation, surfaces a short list of candidates. This chapter is the lens for examining each one. Some candidates are obvious fits. Some look like fits and aren’t. The anatomy below sorts them.
Anatomy of a fittable task
A task is a good candidate for AI when most of the following are true:
High volume. The task recurs often enough to repay the effort of building or buying a solution for it. A task done 200 times a week is high volume. A one-off task isn’t worth automating.
Language-shaped. The input and output are mostly text — or things that can be read as text. Emails, documents, transcripts, tickets, listings, descriptions, code.
Tolerates approximate. Being 90% right is genuinely useful — either because a human catches the rest, or because the cost of an occasional miss is bounded.
Currently underloved. It’s boring. It’s repetitive. It’s the kind of work that gets delayed because nobody wants to start it.
When most of those are true, it’s a fit. When none of them are, it isn’t.
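The screen can be written down as a small checklist. The sketch below is illustrative only: the `Task` fields, the `screen` function, the reading of "most" as three out of four, and the "judgment call" label for the in-between cases are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A candidate task described by the four properties (hypothetical fields)."""
    name: str
    high_volume: bool
    language_shaped: bool
    tolerates_approximate: bool
    underloved: bool


def screen(task: Task, most: int = 3) -> str:
    """Return a rough verdict; `most` = 3 of 4 is an assumed threshold."""
    hits = sum([
        task.high_volume,
        task.language_shaped,
        task.tolerates_approximate,
        task.underloved,
    ])
    if hits >= most:
        return "fit"
    if hits == 0:
        return "not a fit"
    # The text does not rule on the middle ground; this label is an assumption.
    return "judgment call"


ticket_triage = Task("support ticket triage", True, True, True, True)
payroll_totals = Task("payroll totals", True, False, False, True)

print(screen(ticket_triage))   # fit
print(screen(payroll_totals))  # judgment call
```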
Test a task
A real task, run through the rubric, gives a fast read on whether it belongs in this column.
The bad fits
Some candidates pass the four-property screen and still fail in practice. Five patterns are worth recognising before a pilot starts.
Safety-critical work without strict human oversight. Medical, legal, regulated decisions. The model can still draft and prepare — the failure mode is letting its output stand as the decision.
Work where exact numbers are the answer. A bare model is a poor calculator. Math, totals, lookups belong in a tool that uses a calculator or database under the hood (see when a model needs help — tools and memory).
Work where consistency matters more than judgment. If the same input must always produce the same output — pricing rules, eligibility cutoffs, compliance checks — that is a rule, not a prompt. Rules are auditable and stable; prompts are neither.
Work whose output can’t be verified. If the model writes something that goes straight to a customer or a system of record, and there’s no practical way to tell whether it’s right, the error mode is invisible. Invisible errors compound.
Work that depends on knowledge nobody has written down. Models can’t read minds. If a process lives in three senior employees’ heads — the unwritten “we don’t quote that customer below this margin,” “this SKU is never bundled with that one” — the model has no way to reach it. That work has to be captured before it can be automated.
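The exact-number and consistency patterns both point toward ordinary code rather than a prompt. A minimal sketch, assuming a made-up margin floor of 15% and hypothetical function names:

```python
from decimal import Decimal

# Hypothetical rule: never quote below this gross margin (illustrative value).
MIN_MARGIN = Decimal("0.15")


def order_total(lines: list[tuple[Decimal, int]]) -> Decimal:
    """Sum unit_price * quantity exactly; no model is involved."""
    return sum((price * qty for price, qty in lines), Decimal("0"))


def quote_allowed(price: Decimal, cost: Decimal) -> bool:
    """Same input, same answer, every time. Assumes price > 0."""
    margin = (price - cost) / price
    return margin >= MIN_MARGIN


lines = [(Decimal("19.99"), 3), (Decimal("4.50"), 2)]
print(order_total(lines))                                  # 68.97
print(quote_allowed(Decimal("100.00"), Decimal("90.00")))  # False
```

A model can still draft the email that carries the quote; the figure and the yes/no decision come from the functions.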
A candidate that passes the four properties and trips a bad-fit pattern isn’t always disqualified. Some are workable with the right safeguards around them: a safety-critical drafting task with a strict reviewer in the loop, an exact-number task with a calculator tool wired in, an unverifiable task with a sampled audit pass. Some aren’t, no matter how the workflow is shaped. The four-property screen identifies plausibility. The bad-fit patterns identify what has to be true around the candidate for the pilot to work.
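One way to read the paragraph above is as a lookup from each bad-fit pattern to the condition that has to hold before a pilot starts. The pattern keys, the wording of each safeguard, and the 5% audit rate below are illustrative assumptions, not a prescribed procedure.

```python
import random
from typing import Optional

# Bad-fit pattern -> what has to be true around the candidate (illustrative wording).
SAFEGUARDS = {
    "safety_critical": "strict human reviewer signs off on every output",
    "exact_numbers": "calculator or database tool produces the figures",
    "consistency": "deterministic rule makes the decision",
    "unverifiable": "sampled audit pass by a human",
    "unwritten_knowledge": "knowledge captured in writing first",
}


def required_safeguards(patterns: list[str]) -> list[str]:
    """List the safeguards a pilot needs for the patterns a candidate trips."""
    return [SAFEGUARDS[p] for p in patterns]


def audit_sample(outputs: list[str], rate: float = 0.05,
                 seed: Optional[int] = None) -> list[str]:
    """Pick a random subset of outputs for human review (assumed 5% rate)."""
    if not outputs:
        return []
    rng = random.Random(seed)
    k = max(1, round(len(outputs) * rate))  # always review at least one
    return rng.sample(outputs, k)


print(required_safeguards(["exact_numbers", "unverifiable"]))
batch = [f"reply-{i}" for i in range(200)]
print(len(audit_sample(batch, seed=1)))  # 10
```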
A frame that holds across functions
Across nearly every fit on a typical scan, the shape is the same:
AI handles the draft. The team handles the verdict.
The question this frame answers is not “will AI replace this work?” but “which part of this work is slow, and can a draft compress it?” For most language-shaped operational work, some part of it can. The fit isn’t the whole job; it’s the slow part of the job.
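In workflow terms, the frame means nothing a model drafts is released until a person has ruled on it. The sketch below uses a stand-in `draft_reply` function in place of a real model call; every name in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    ticket_id: str
    text: str
    approved: Optional[bool] = None  # None means no verdict yet


def draft_reply(ticket_id: str, body: str) -> Draft:
    """Stand-in for a model call; a real system would call a model here."""
    return Draft(ticket_id, f"Thanks for writing in about: {body[:40]}")


def record_verdict(draft: Draft, approved: bool,
                   edited_text: Optional[str] = None) -> Draft:
    """The human verdict: approve, reject, or approve with edits."""
    if edited_text is not None:
        draft.text = edited_text
    draft.approved = approved
    return draft


def can_send(draft: Draft) -> bool:
    """Only drafts a person has explicitly approved may be sent."""
    return draft.approved is True


d = draft_reply("T-1", "late delivery of order")
print(can_send(d))  # False
record_verdict(d, approved=True)
print(can_send(d))  # True
```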
That is enough to look at the work with fresh eyes — the drafts that get delayed, the classification done by hand, the reports assembled from three systems. Most of the fits in a mid-sized operation aren’t hidden. They’re in plain sight, in the work the team already finds boring.