Advanced Claude Prompting: CoT, Few-Shot & XML

The beginner prompt engineering guide covered the foundations: specificity, roles, context, format, examples, constraints. Those principles alone will take you a long way. But there is a class of prompting techniques that goes further — techniques that change how Claude reasons, not just what it produces.
What Are Advanced Prompting Techniques?
Advanced prompting techniques are structured methods that guide how a language model reasons through a problem, not just what it produces. Chain-of-thought, few-shot learning, XML structuring, and meta-prompting each address a specific failure mode in AI output — from shallow reasoning to format inconsistency — giving developers reliable, production-grade results from the Claude API.
This post covers the advanced tier. You will learn how to make Claude show its reasoning step by step, how to structure complex prompts for reliability, how to use Claude to improve its own prompts, and how to handle situations where a single run of the model is not reliable enough.
These techniques are used in production at scale by AI engineers building enterprise applications. Knowing them separates a developer who can use Claude from one who has mastered it. For the full background on the Claude API itself, see Claude Messages API Explained and Claude API Pricing and Tokens.
Technique 1: Chain-of-Thought Prompting
Chain-of-thought (CoT) prompting is the technique of instructing Claude to reason through a problem step by step before giving a final answer. Instead of jumping straight to a conclusion, Claude works through the problem out loud.
Why It Works
Language models make fewer reasoning errors when they generate intermediate reasoning steps. When Claude writes out its thinking, it creates a kind of cognitive scaffold — each step becomes a stable basis for the next. This mirrors what humans do when we write out our work on a difficult maths problem or outline an argument before writing an essay.
Basic CoT: "Think step by step"
The simplest CoT technique is just adding a phrase that triggers step-by-step reasoning:
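For example, the same request with and without the trigger phrase (the scheduling question is invented for illustration):

```
Without CoT:
"What is the maximum number of 45-minute meetings that fit between
9:00 and 12:30, with a 15-minute break between consecutive meetings?"

With CoT:
"What is the maximum number of 45-minute meetings that fit between
9:00 and 12:30, with a 15-minute break between consecutive meetings?
Think step by step before giving your final answer."
```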
Adding "think step by step" (or equivalents like "reason through this carefully" or "work through this systematically") causes Claude to break the problem into sub-steps rather than guessing at a direct answer.
Structured CoT
For more complex problems, you can structure the reasoning explicitly:
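For instance, a security review prompt might specify the reasoning order explicitly (the numbered structure and the `<code>` placeholder are illustrative, not a required format):

```
Review this Python function for security vulnerabilities.

Work through your analysis in this order:
1. List every external input the function accepts.
2. For each input, identify how it is validated (or not).
3. For each unvalidated input, describe the attack it enables.
4. Only then, summarise the vulnerabilities in order of severity.

<code>
{function_source}
</code>
```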
By specifying the thinking structure, you get more thorough, systematic analysis than a direct "find vulnerabilities" request.
CoT Trades Speed for Accuracy
Chain-of-thought prompting produces longer responses and uses more output tokens, which means it costs more and takes longer. Use it when accuracy on complex reasoning tasks matters — mathematical problems, logical analysis, code debugging, multi-step planning. For simple, well-defined tasks, CoT is unnecessary overhead.
Technique 2: XML Structuring for Complex Prompts
When your prompt has multiple distinct sections — instructions, examples, data, constraints — Claude can lose track of what is what when everything runs together as plain prose. XML tags are an effective solution.
Claude has been trained to recognise XML-style tags as meaningful structural delimiters. Using them makes large, complex prompts dramatically easier for Claude to parse correctly.
Example: Document Analysis with XML Structure
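A sketch of such a prompt follows. The tag names and placeholder variables like `{report_q1}` are illustrative; Claude does not require any particular tag vocabulary, only consistent delimiting:

```
<task>
Summarise the key risks identified across the quarterly reports below.
</task>

<instructions>
- Focus on financial and operational risks only
- Attribute each risk to the document it came from
- Flag any risk mentioned in more than one document
</instructions>

<output_format>
A bulleted list of risks, each with a one-sentence explanation
and its source document.
</output_format>

<documents>
<document index="1">
{report_q1}
</document>
<document index="2">
{report_q2}
</document>
</documents>
```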
Compare this to receiving the same information as five un-demarcated paragraphs. The XML structure makes it unambiguous which part is the task, which is the instructions, and which is the data to be analysed.
When to Use XML Structuring
- Prompts with more than three distinct sections
- Prompts that mix instructions with large chunks of data or reference text
- Any prompt where you have observed Claude confusing instructions with data or vice versa
- Multi-document analysis tasks where you want Claude to process each document distinctly
Technique 3: Few-Shot Prompting at Scale
The beginner guide introduced few-shot prompting with three examples. Advanced few-shot prompting involves more deliberate example selection and design.
What Makes a Good Example Set
Not all examples are equally useful. The best few-shot examples:
- Cover the range of variation: Include examples that represent different types of input, not just the most common case. If you are classifying sentiment, include clearly positive, clearly negative, and borderline neutral examples.
- Demonstrate the edge cases: The cases where Claude is most likely to fail are the ones where examples help most. Include a few examples of tricky edge cases with correct answers.
- Are consistent in format: Every example should follow exactly the same input-output format. Inconsistency in examples produces inconsistency in outputs.
- Are representative of real data: Examples that look unlike your actual data will not transfer. Use examples from the same distribution as your real inputs.
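Applied to the sentiment example above, a consistent example set might look like this (the reviews are invented; note the identical `Review:`/`Sentiment:` format and the deliberate inclusion of a borderline case):

```
Classify the sentiment of each review as positive, negative, or neutral.

Review: "Arrived quickly and works exactly as described."
Sentiment: positive

Review: "Stopped charging after two weeks. Waste of money."
Sentiment: negative

Review: "It's fine. Does the job, nothing special."
Sentiment: neutral

Review: "I wanted to love this, and parts of it are great, but the battery ruins it."
Sentiment:
```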
Dynamic Few-Shot Selection
In advanced applications, you do not hard-code a fixed set of examples. Instead, you dynamically select the most relevant examples for each input using semantic similarity search:
- Store a large library of input-output examples with their embeddings
- When a new input arrives, compute its embedding
- Select the three to five most semantically similar examples from the library
- Include those examples in the prompt for that specific input
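The selection loop above can be sketched with plain cosine similarity. The embeddings here are hand-written lists of floats standing in for the output of a real embedding model, and the function names are illustrative:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def select_examples(query_embedding: list[float], library: list[dict], k: int = 3) -> list[dict]:
    """Return the k library examples most similar to the query.

    `library` is a list of dicts with keys "input", "output", "embedding".
    """
    ranked = sorted(
        library,
        key=lambda ex: cosine_similarity(query_embedding, ex["embedding"]),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query_text: str, examples: list[dict]) -> str:
    """Format the selected examples plus the new input as a few-shot prompt."""
    parts = [f"Input: {ex['input']}\nOutput: {ex['output']}" for ex in examples]
    parts.append(f"Input: {query_text}\nOutput:")
    return "\n\n".join(parts)
```

In production the library would live in a vector store and the embeddings would come from an embedding API; the ranking and prompt-assembly logic stays the same.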
This dynamic approach typically outperforms a static example set because the examples included are always the most relevant available for the specific input being processed.
Technique 4: Meta-Prompting (Using Claude to Write Your Prompts)
Meta-prompting is the practice of using Claude to generate or improve prompts for other Claude calls. It sounds recursive, but it is genuinely useful.
Prompt Generation
If you know what you want but are struggling to write an effective prompt, describe the task to Claude and ask it to write the prompt for you:
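A meta-prompt along these lines works well (the classification task and its failure mode are an invented example):

```
I need a prompt for the following task, which will run via the Claude API.

Task: classify customer support emails into "billing", "technical",
"account", or "other".
Input: the raw email body.
Output: the category name only, lowercase, nothing else.
Known failure mode: emails that mention an invoice but ask a technical
question get misfiled as "billing".

Write a prompt for this task. Include instructions that address the
failure mode, and explain briefly why you structured the prompt the
way you did.
```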
Claude will often produce a better initial prompt than you would have written from scratch, because it understands its own behaviour well.
Prompt Critique and Improvement
Show Claude your existing prompt and ask it to critique it:
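For example (the `<prompt>` tags and placeholder are illustrative):

```
Here is a prompt I am using in production:

<prompt>
{your_current_prompt}
</prompt>

Critique this prompt. Identify ambiguous instructions, missing
constraints, and places where the model could plausibly misinterpret
the intent. Then write an improved version and list each change you
made with a one-line justification.
```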
This meta-prompting cycle — generate, critique, refine — is how experienced AI engineers develop production-quality prompts.
The Anthropic Workbench Has a Prompt Generator
The Anthropic Console Workbench includes a built-in prompt generator that uses Claude to help you write system prompts. Describe what you want your agent to do, and the tool generates an optimised system prompt as a starting point. It is an excellent place to begin any new prompt engineering project.
Technique 5: Negative Prompting
Negative prompting — explicitly stating what Claude should not do — is underused by most developers. Positive instructions tell Claude what to focus on; negative instructions tell Claude what to avoid. Both are needed for complete specification.
Common Negative Prompt Patterns
- Format exclusions: "Do not use bullet points. Write in flowing prose paragraphs."
- Content exclusions: "Do not mention competitor products or make comparisons to alternative solutions."
- Behaviour exclusions: "Do not ask clarifying questions. Make reasonable assumptions and state them in your response."
- Preamble exclusions: "Do not start your response with phrases like 'Certainly!' or 'Of course!' Go directly to the answer."
- Scope exclusions: "Only discuss the technical implementation. Do not discuss business justification or ROI — that is out of scope."
Technique 6: Self-Consistency Sampling
Some tasks — particularly complex reasoning or consequential decisions — are ones where you cannot trust a single model output. Self-consistency sampling runs the same prompt multiple times at higher temperature and aggregates the results.
How Self-Consistency Works
- Run the same prompt three to five times with a temperature between 0.5 and 1.0
- Compare the outputs across runs
- Select the answer that appears most frequently (for classification tasks) or that is best by some evaluation criterion (for open-ended tasks)
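The aggregation step for a classification task can be sketched as follows. The model call itself is abstracted behind a `sample` callable (an assumption for illustration, not an SDK API); in production it would wrap a Claude API call at temperature 0.5 to 1.0:

```python
from collections import Counter
from typing import Callable

def self_consistent_answer(sample: Callable[[], str], runs: int = 5) -> str:
    """Run the same prompt `runs` times and return the most frequent answer.

    `sample` is any zero-argument function that performs one model call and
    returns the final answer as a string. Answers are normalised before
    voting so that "Yes" and "yes" count as the same answer.
    """
    answers = [sample().strip().lower() for _ in range(runs)]
    most_common, _count = Counter(answers).most_common(1)[0]
    return most_common
```

For open-ended tasks, replace the majority vote with whatever evaluation criterion fits — for example, a separate judging prompt that picks the best of the candidate outputs.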
For critical classifications, code generation for high-stakes applications, and any task where a single wrong answer has significant consequences, self-consistency dramatically improves reliability compared to a single run.
When Self-Consistency is Worth the Cost
Self-consistency runs the inference multiple times, multiplying your token cost. It is worth the cost when:
- The task requires complex multi-step reasoning where a single run is unreliable
- The consequences of an incorrect output are significant
- You are doing one-time analysis on important data, not processing at high volume
Technique 7: Prompt Chaining
Some tasks are too complex for a single prompt. Breaking them into a sequence of smaller prompts — where the output of one becomes the input to the next — produces better results than trying to do everything in one call.
Example: Research Report Pipeline
Rather than one massive prompt that says "Research this topic and write a structured report", decompose it:
- Prompt 1 — Outline generation: "Given this topic, generate a structured outline for a 2,000-word report. Include five main sections with three sub-points each."
- Prompt 2 — Section drafting: Pass the outline to a second prompt: "Using the outline below, write the full text for Section 1 only. 400 words, formal tone."
- Prompt 3 — Review and integrate: Pass all sections: "Review these separately drafted sections for consistency in tone and terminology. Identify any contradictions or gaps, then write a unified introduction and conclusion."
Each step is simpler, better-specified, and easier to verify. This is the foundation of the agentic workflows we cover in Module 5.
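The three-step pipeline above can be sketched as a single function. The model call is injected as a plain callable (`call_model` is an assumption, not an SDK API), so the chaining logic runs without network access and each step can be inspected in isolation:

```python
from typing import Callable

def research_report(topic: str, call_model: Callable[[str], str]) -> str:
    """Three-step prompt chain: outline -> draft sections -> review/integrate.

    `call_model` takes a prompt string and returns the model's text; in
    production it would wrap a Claude API call.
    """
    # Step 1: generate the outline.
    outline = call_model(
        "Given this topic, generate a structured outline for a 2,000-word "
        "report. Include five main sections with three sub-points each.\n\n"
        f"Topic: {topic}"
    )

    # Step 2: draft each of the five sections separately from the outline.
    sections = [
        call_model(
            f"Using the outline below, write the full text for Section {i} "
            f"only. 400 words, formal tone.\n\n<outline>\n{outline}\n</outline>"
        )
        for i in range(1, 6)
    ]

    # Step 3: review the drafts together and integrate.
    return call_model(
        "Review these separately drafted sections for consistency in tone "
        "and terminology. Identify any contradictions or gaps, then write "
        "a unified introduction and conclusion.\n\n" + "\n\n".join(sections)
    )
```

Injecting the model call also makes the pipeline trivial to unit-test with a stub before spending tokens on real inference.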
Error Propagation in Prompt Chains
A mistake in an early step of a prompt chain propagates through to all later steps. Always validate the output of each step before passing it to the next. For automated pipelines, build validation checks between steps — a simple classification prompt that asks "Does this output meet the required format? Yes or No" can catch errors before they cascade.
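One way to sketch such a validation gate between steps, with the model call again abstracted behind a callable (the retry policy and wording of the check prompt are assumptions for illustration):

```python
from typing import Callable

def run_with_validation(prompt: str,
                        call_model: Callable[[str], str],
                        max_retries: int = 2) -> str:
    """Run one chain step, then gate its output with a yes/no format check.

    `call_model` takes a prompt string and returns the model's text. If the
    check answers anything other than "Yes", the step is retried up to
    `max_retries` additional times before failing loudly.
    """
    for _ in range(max_retries + 1):
        output = call_model(prompt)
        verdict = call_model(
            "Does this output meet the required format? Answer Yes or No "
            "only.\n\n<output>\n" + output + "\n</output>"
        )
        if verdict.strip().lower().startswith("yes"):
            return output
    raise ValueError("step output failed validation after retries")
```

Failing loudly between steps is the point: a raised error at step two is far cheaper than a cascade of confidently wrong output at step seven.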
Summary
Advanced prompt engineering is what separates developers who get inconsistent, mediocre results from those who build reliable, high-quality AI systems.
The techniques covered in this post:
- Chain-of-thought: Make Claude reason step by step before answering
- XML structuring: Use tags to organise complex, multi-section prompts
- Few-shot prompting: Use carefully selected examples to communicate subtle patterns
- Meta-prompting: Use Claude to write and improve your own prompts
- Negative prompting: Explicitly exclude unwanted behaviours
- Self-consistency: Run multiple samples and aggregate for reliability
- Prompt chaining: Break complex tasks into sequential, validated steps
In our next post, we explore one of Claude's most powerful capabilities for complex reasoning: Claude Extended Thinking: How to Unlock Deep Reasoning.
Frequently Asked Questions
When should I use chain-of-thought prompting?
Use chain-of-thought prompting when the task involves multi-step reasoning — maths problems, code debugging, logical analysis, or strategic planning. For simple factual retrieval or creative generation, it adds unnecessary token cost. See the Anthropic chain-of-thought guidance for additional patterns.
What is the best way to structure few-shot examples?
The best few-shot examples are representative of your real input distribution, cover edge cases, and follow a perfectly consistent format. Inconsistent example formatting is one of the leading causes of inconsistent outputs. The Prompting Guide's few-shot section has further reading on example selection strategies.
How do I know how many examples to include?
Start with three to five. More is not always better — quality and diversity matter more than quantity. If you have a large library of examples, use dynamic few-shot selection (semantic search to pick the most relevant examples per input) rather than a fixed set.
Can I combine multiple techniques in one prompt?
Yes — combining techniques is standard in production. A typical pattern: XML structure to organise a complex prompt, chain-of-thought to enforce reasoning steps, and few-shot examples to demonstrate the target output format. For agent-scale prompt patterns, see Claude Tool Use Explained and Claude Agentic Loop Explained.
This post is part of the Anthropic AI Tutorial Series. Don't forget to check out our previous post: Prompt Engineering for Claude: The Complete Beginner's Guide.
