Skip to content

How the layers actually meet

The four chapters before this one each handled one axis. Layers (products, models, platforms). Families (the labs and their tiers). Licensing (open vs. closed). Modalities (text, image, audio, video). Taught one at a time, the picture is clean.

In a real operation, the four axes do not sit next to each other. They combine. Sales-email drafting is not a choice between models — it is a choice between a product wrapping a model from one family at one tier on a vendor platform versus a different product on a different platform versus an open-weights model running on company infrastructure. Same use case, four (or more) plausible specific combinations, very different operational profiles.

This chapter is about reading those combinations.

Take one concrete task: drafting first-pass outbound sales emails.

Address 1 — Product, frontier closed model, vendor platform. The team uses a sales-engagement product (Apollo, Salesloft, an AI-native upstart). The product’s “AI write” button calls a frontier closed model under the hood, on the vendor’s account, on the vendor’s platform. Per-seat pricing, integrated with the CRM out of the box, no engineering required. The product’s UX, prompts, and CRM integration are what the team is paying for; the model is incidental.

Address 2 — Vertical SaaS, mid-tier closed model, vendor platform. A vertical CRM (industry-specific — say, a CRM built for industrial distributors) adds a write feature using a workhorse closed model. Cheaper underlying model, but tuned with the vertical’s vocabulary and templates. The team is paying for the vertical fit, not the model.

Address 3 — Custom build, frontier closed model, vendor platform. Engineering builds a thin internal tool: a prompt template, a few CRM lookups, a call to a frontier model’s API. The model is exactly the one a team would pick separately; the wrapper is the team’s own workflow logic and data. Costs per call, not per seat. More work to build and maintain, more control over what goes in and what comes out.

Address 4 — Custom build, open-weights model, internal infrastructure. Engineering builds the same wrapper, but the model is an open-weights model running on the company’s own GPUs (or a hosted-but-private endpoint). No data leaves the company’s environment. Per-call costs are GPU-amortised, not pay-per-token. The tradeoff is operational: someone has to run the model, monitor it, update it when better weights arrive.

Same task. Four working addresses. The output quality may end up within a few percent across all four. The cost, the control, the data path, the operational overhead, and the lock-in all differ by an order of magnitude.

A reading that helps: each of the four axes contributes one trade-off to the final combination.

  • Layer trades off integration speed against control. Product lands fast with little control over the loop. Platform yields full control on a longer build curve.
  • Family trades off capability against cost. The frontier of the frontier (top tier from a major lab) is meaningfully better at the hard cases and roughly 10× the price of a workhorse model that handles the typical cases fine.
  • Licensing trades off operational simplicity against data-path control. Closed models are simpler to consume and harder to audit. Open-weights models are operationally heavier and place the data entirely inside the company’s perimeter.
  • Modality trades off ambition against reliability. Text is mature and reliable. Image generation works for concept and mood and fails on specifics. Voice is mature for transcription and synthesis, brittle for full agents. Video is young.

Each axis has a default that is wrong for some operations and right for others. The right combination for a use case is the one whose four trade-offs match what that use case can tolerate. A sales-email draft can tolerate occasional flat phrasing and almost no data sensitivity — so the product-layer / workhorse-tier / closed / text combination usually wins. A clinical-notes summariser tolerates no data sensitivity at all and has zero tolerance for hallucinated medication names — so a platform-layer / frontier-tier / closed (with enterprise data terms) / text combination, with substantial validation around it, becomes the realistic shape. A creative-mood-board generator tolerates a high error rate and benefits from breadth — so a product-layer / consumer-tier / closed / image combination is fine.

The substantive move is reading which combination the use case actually points at, not picking a model.

The cleanest mental shortcut: most operations that have AI working at any meaningful scale are running several combinations at once, not one.

A company of, say, three hundred people, two years into AI adoption, is plausibly running:

  • A consumer-tier product (ChatGPT, Claude, or both) on personal accounts for individual productivity — drafting, summarising, code, search.
  • An enterprise-tier seat of the same product family for some teams, with the data-not-trained-on commitment in place.
  • A vertical SaaS or two with embedded AI features inside the workflows where they already operate (CRM, customer support, marketing).
  • One or two custom internal tools built on a platform API, where the workflow was specific enough that no off-the-shelf wrapper fit.
  • For some companies: a small image-generation use case (marketing assets, mockups) on a product-layer tool.
  • For a few companies: an open-weights model running on internal infrastructure for one use case where data residency or unit economics demanded it.

That is not one stack. It is six combinations, each chosen — sometimes deliberately, sometimes by accident — because the trade-offs of that combination fit the use case in question. The right number of combinations for any given company is not one. It is the number of distinct use cases that have meaningfully different trade-off profiles.

This also means the governance problem (covered in The four worries) is not “the AI vendor” — it is the portfolio of combinations, each with its own data path, each with its own operational risks.

Three drifts are worth holding in mind, because each one slowly invalidates combinations that fit today.

Capability moves down the cost curve. What needed the top frontier tier in one generation typically runs fine on the next generation’s workhorse tier, at roughly a tenth the price per call. A custom tool built against the frontier tier when it launched is, a release cycle or two later, often running on a cheaper model under the hood. The combination drifts even if the team running it changes nothing.

Open-weights closes the gap on text, lags on multimodal. Open-weights models from the major labs (Llama, Mistral, DeepSeek, Qwen) close in on closed-model text quality over time, especially at the workhorse tier. The frontier of multimodal capability — image, audio, video, agentic tool use across modalities — lags more substantially. Closed models still hold a clear lead there.

Agents are a new axis, not a replacement. “Agentic” is sometimes pitched as the fifth column in the spectrum. The truer reading is that agents multiply the existing four — an agent is an architectural pattern on top of a model from a family at a tier on a layer in one or more modalities. Adding agents to the spectrum adds a question about how much autonomy the system has and what tools it can reach, on top of every existing question. It does not replace the questions; it adds one more axis.

Module 1 was about what a model is. Module 2 is about where AI shows up in the world. The reading that Module 2 ends with — and that the rest of the hub leans on — is this:

The interesting question is rarely which AI. It is which combination, for this use case, on the trade-offs this operation can actually tolerate. Module 3 (what AI can and can’t do) goes deeper into the capability axis. Module 4 (risk, trust, governance) goes deeper into the data-path and operational axes. Module 5 (putting AI to work) is where the combination becomes a project.

Holding the four axes in view — and the trade-off each one carries — is the move that makes those later modules sit usefully on top of this one.