How to talk to AIs: Context Engineering 101

By
Spencer Ames
November 20, 2025

Readers already familiar with context engineering basics can find our advanced article here.


If prompt engineering is about “what to say,” context engineering is about “what the model should see.” In real applications, that difference decides cost, speed, and quality.

Andrej Karpathy, a foundational figure in contemporary AI, summarized it well: industrial-strength apps are less about a single clever prompt and more about filling the context window with the right information for the next step—instructions, examples, retrieved facts, tools, and state—while keeping irrelevant tokens out. Do too little and the model guesses. Do too much and you pay more and often get worse answers.

What “context” means

“Context” is the bundle of tokens the model reads before it answers: your instructions, the user’s question, a few examples, excerpts from a document, maybe tool definitions, and a bit of conversation history. You get a limited “attention budget,” and packing it well matters more than wordsmithing a single magic prompt.

Why it matters (even when windows are big)

Bigger windows don’t guarantee better results. Long inputs can blur focus, slow things down, and cost more. Good context engineering treats tokens like a scarce resource: include the smallest useful set that raises the odds of the right answer.
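
As a rough illustration of treating tokens as a budget, the sketch below ranks candidate snippets against the question and keeps only those that fit. Everything here is an assumption made for the example: the function names are hypothetical, token cost is approximated by counting whitespace-separated words (real tokenizers count differently), and relevance is a toy word-overlap score rather than a real retrieval method.

```python
def estimate_tokens(text: str) -> int:
    """Very rough proxy (assumption): one token per whitespace-separated word."""
    return len(text.split())


def relevance(question: str, snippet: str) -> int:
    """Toy score: how many distinct question words also appear in the snippet."""
    q_words = set(question.lower().split())
    s_words = set(snippet.lower().split())
    return len(q_words & s_words)


def select_snippets(question: str, snippets: list[str], budget: int) -> list[str]:
    """Greedily keep the most relevant snippets that fit within the token budget."""
    ranked = sorted(snippets, key=lambda s: relevance(question, s), reverse=True)
    chosen: list[str] = []
    used = 0
    for snippet in ranked:
        if relevance(question, snippet) == 0:
            continue  # no overlap with the question: treat as noise
        cost = estimate_tokens(snippet)
        if used + cost <= budget:
            chosen.append(snippet)
            used += cost
    return chosen


snippets = [
    "Late work loses ten percent per day.",
    "Office hours are held on Tuesdays in room 204.",
    "Late work is not accepted after one week.",
]
selected = select_snippets("What is the late work policy?", snippets, budget=16)
for s in selected:
    print(s)
```

With this input, the office-hours snippet is dropped because it shares no words with the question, and the two late-work snippets fit inside the budget. In a real application the scoring step would more likely be a search index or embedding lookup, but the budgeting logic stays the same.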

Context engineering vs. prompt engineering 

  • Prompt engineering: write clear instructions and maybe a couple of examples.

 

  • Context engineering: decide everything the model will see on this turn—instructions, examples, relevant snippets, tool info, and just enough history—then repeat that curation each turn. It’s an “information design” step, not just descriptive phrasing.
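
To make the "repeat that curation each turn" idea concrete, here is a minimal sketch that rebuilds the model input from its parts on every turn. The `Session`, `Turn`, and `build_context` names are hypothetical, the bracketed labels are just one possible convention, and the choice to keep only the last two history turns is an assumption for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


@dataclass
class Session:
    instructions: str
    history: list[Turn] = field(default_factory=list)


def build_context(session: Session, question: str, snippets: list[str],
                  max_history: int = 2) -> str:
    """Rebuild the full model input for one turn from curated parts."""
    # Keep only the most recent turns; older ones are left out of the window.
    recent = session.history[-max_history:] if max_history > 0 else []
    history_lines = [f"{t.role}: {t.text}" for t in recent]
    # Number the snippets so an answer can cite them as S1, S2, ...
    evidence_lines = [f"[S{i}] {s}" for i, s in enumerate(snippets, start=1)]
    parts = [
        "[INSTRUCTIONS] " + session.instructions,
        "[HISTORY] " + " | ".join(history_lines),
        "[EVIDENCE] " + " | ".join(evidence_lines),
        "[QUESTION] " + question,
    ]
    return "\n".join(parts)


session = Session(
    instructions="Answer only from Evidence.",
    history=[
        Turn("user", "first question"),
        Turn("assistant", "first answer"),
        Turn("user", "second question"),
    ],
)
context = build_context(
    session,
    question="What is the late work policy?",
    snippets=["Late work loses ten percent per day."],
)
print(context)
```

The point of the sketch is that the context is assembled fresh each turn from separately managed pieces, so any piece (history length, which snippets, the instructions) can be tuned without touching the others.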

Six core principles

  1. Be explicit about the goal and the format

    Tell the model exactly what you want and what shape it should return. This reduces ambiguity and makes the output easier to use downstream.

     

  2. Only show what’s needed right now

    Don’t paste massive PDFs or long email threads. Pass short, relevant snippets or IDs/links to the source. You can always fetch more if needed. The goal is to maximize signal and filter out noise.

     

  3. Label the parts

    Delimit sections with a consistent style. Clear boundaries help the model tell task from data. When you choose a template, stick to it.

    Here are two of the simplest and most consistent delimitation styles:

    XML

    <INSTRUCTIONS> Answer only from Evidence. Return JSON (answer, citations[]). </INSTRUCTIONS>

    <QUESTION> What is the late work policy? </QUESTION>

    <EVIDENCE> [S1] ... p.5 | [S2] ... p.2 </EVIDENCE>

    Brackets

    [INSTRUCTIONS] Answer only from Evidence. Return JSON (answer, citations[]).

    [QUESTION] What is the late work policy?

    [EVIDENCE] [S1] ... p.5 | [S2] ... p.2

     

  4. Use a couple of good examples—not an encyclopedia

    Examples are powerful, but repetition can push the model into copy-paste behavior. Keep them varied and minimal to teach the pattern without overfitting.

     

  5. Keep a short memory and tidy history

    For multi-step tasks, keep brief notes on what’s been done and the current plan, and store bulky context outside the conversation.

     

  6. Evaluate outputs

    Treat every output as a draft and verify facts. Conduct sanity checks by asking questions like:
    Does the answer cite a real section? Is it short and on-topic? If uncertain, does it ask for the missing doc?

    Identify success criteria to benchmark the answers against.

    For repeatable tasks, try different methods and models to figure out what works best.

    As always, check for bias, privacy, and policy compliance.
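
Some of the sanity checks above can be automated. The sketch below assumes the model was asked to return JSON with `answer` and `citations` fields and that you know which source IDs were actually shown to it. The `check_output` function and the 60-word limit are hypothetical choices for illustration, and a check like this complements a human fact check rather than replacing it.

```python
import json


def check_output(raw: str, known_sources: set[str], max_words: int = 60) -> list[str]:
    """Return a list of problems found in a model's JSON reply (empty means it passed)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["not valid JSON"]
    if not isinstance(data, dict):
        return ["top level is not a JSON object"]

    problems: list[str] = []
    answer = data.get("answer")
    citations = data.get("citations")

    # Is there an answer, and is it short?
    if not isinstance(answer, str) or not answer.strip():
        problems.append("missing answer")
    elif len(answer.split()) > max_words:
        problems.append("answer too long")

    # Does every citation point at a source that was really provided?
    if not isinstance(citations, list) or not citations:
        problems.append("no citations")
    else:
        unknown = [c for c in citations
                   if not isinstance(c, str) or c not in known_sources]
        if unknown:
            problems.append(f"unknown citations: {unknown}")
    return problems


sources = {"S1", "S2"}
good = '{"answer": "Late work loses ten percent per day.", "citations": ["S1"]}'
bad = '{"answer": "Yes", "citations": ["S9"]}'
print(check_output(good, sources))
print(check_output(bad, sources))
```

Running the same checks over a fixed set of test questions gives a simple benchmark for comparing prompts, context layouts, or models on a repeatable task.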

Common mistakes and how to fix them

 

  • Overstuffing: pasting entire documents. 

    • Fix: pass short snippets or references and fetch on demand.
       

  • Vague asks: “Summarize this” without an output shape.

    • Fix: specify fields.
       

  • Example ruts: too many near-identical examples.

    • Fix: use 1–2 varied examples.
       

  • Messy inputs: mixed instructions and data.

    • Fix: delimit sections.

Further reading