Advanced Claude Prompting: CoT, Few-Shot & XML

The beginner prompt engineering guide covered the foundations: specificity, roles, context, format, examples, constraints. Those principles alone will take you a long way. But there is a class of prompting techniques that goes further — techniques that change how Claude reasons, not just what it produces.
What Are Advanced Prompting Techniques?
Advanced prompting techniques are structured methods that guide how a language model reasons through a problem, not just what it produces. Chain-of-thought, few-shot learning, XML structuring, and meta-prompting each address a specific failure mode in AI output — from shallow reasoning to format inconsistency — giving developers reliable, production-grade results from the Claude API.
This post covers the advanced tier. You will learn how to make Claude show its reasoning step by step, how to structure complex prompts for reliability, how to use Claude to improve its own prompts, and how to handle situations where a single run of the model is not reliable enough.
These techniques are used in production at scale by AI engineers building enterprise applications. Knowing them separates a developer who can use Claude from one who has mastered it. For the full background on the Claude API itself, see Claude Messages API Explained and Claude API Pricing and Tokens.
Technique 1: Chain-of-Thought Prompting
Chain-of-thought (CoT) prompting is the technique of instructing Claude to reason through a problem step by step before giving a final answer. Instead of jumping straight to a conclusion, Claude works through the problem out loud.
Why It Works
Language models make fewer reasoning errors when they generate intermediate reasoning steps. When Claude writes out its thinking, it creates a kind of cognitive scaffold — each step becomes a stable basis for the next. This mirrors what humans do when we write out our work on a difficult maths problem or outline an argument before writing an essay.
Basic CoT: "Think step by step"
The simplest CoT technique is just adding a phrase that triggers step-by-step reasoning:
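For example, the same request with and without the trigger phrase (the scheduling question is invented for illustration):

```
Without CoT:
"What is the maximum number of 45-minute meetings that fit between
9:00 and 12:30, with a 15-minute break between consecutive meetings?"

With CoT:
"What is the maximum number of 45-minute meetings that fit between
9:00 and 12:30, with a 15-minute break between consecutive meetings?
Think step by step before giving your final answer."
```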
Adding "think step by step" (or equivalents like "reason through this carefully" or "work through this systematically") causes Claude to break the problem into sub-steps rather than guessing at a direct answer.
Structured CoT
For more complex problems, you can structure the reasoning explicitly:
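For instance, a security review prompt might specify the reasoning order explicitly (the numbered structure and the `<code>` placeholder are illustrative, not a required format):

```
Review this Python function for security vulnerabilities.

Work through your analysis in this order:
1. List every external input the function accepts.
2. For each input, identify how it is validated (or not).
3. For each unvalidated input, describe the attack it enables.
4. Only then, summarise the vulnerabilities in order of severity.

<code>
{function_source}
</code>
```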
By specifying the thinking structure, you get more thorough, systematic analysis than a direct "find vulnerabilities" request.
CoT Trades Speed for Accuracy
Chain-of-thought prompting produces longer responses and uses more output tokens, which means it costs more and takes longer. Use it when accuracy on complex reasoning tasks matters — mathematical problems, logical analysis, code debugging, multi-step planning. For simple, well-defined tasks, CoT is unnecessary overhead.
Technique 2: XML Structuring for Complex Prompts
When your prompt has multiple distinct sections — instructions, examples, data, constraints — Claude can lose track of what is what when everything runs together as plain prose. XML tags are an effective solution.
Claude has been trained to recognise XML-style tags as meaningful structural delimiters. Using them makes large, complex prompts dramatically easier for Claude to parse correctly.
Example: Document Analysis with XML Structure
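A sketch of such a prompt follows. The tag names and placeholder variables like `{report_q1}` are illustrative; Claude does not require any particular tag vocabulary, only consistent delimiting:

```
<task>
Summarise the key risks identified across the quarterly reports below.
</task>

<instructions>
- Focus on financial and operational risks only
- Attribute each risk to the document it came from
- Flag any risk mentioned in more than one document
</instructions>

<output_format>
A bulleted list of risks, each with a one-sentence explanation
and its source document.
</output_format>

<documents>
<document index="1">
{report_q1}
</document>
<document index="2">
{report_q2}
</document>
</documents>
```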
Compare this to receiving the same information as five un-demarcated paragraphs. The XML structure makes it unambiguous which part is the task, which is the instructions, and which is the data to be analysed.
When to Use XML Structuring
- Prompts with more than three distinct sections
- Prompts that mix instructions with large chunks of data or reference text
- Any prompt where you have observed Claude confusing instructions with data or vice versa
- Multi-document analysis tasks where you want Claude to process each document distinctly
Technique 3: Few-Shot Prompting at Scale
The beginner guide introduced few-shot prompting with three examples. Advanced few-shot prompting involves more deliberate example selection and design.
What Makes a Good Example Set
Not all examples are equally useful. The best few-shot examples:
- Cover the range of variation: Include examples that represent different types of input, not just the most common case. If you are classifying sentiment, include clearly positive, clearly negative, and borderline neutral examples.
- Demonstrate the edge cases: The cases where Claude is most likely to fail are the ones where examples help most. Include a few examples of tricky edge cases with correct answers.
- Are consistent in format: Every example should follow exactly the same input-output format. Inconsistency in examples produces inconsistency in outputs.
- Are representative of real data: Examples that look unlike your actual data will not transfer. Use examples from the same distribution as your real inputs.
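Applied to the sentiment example above, a consistent example set might look like this (the reviews are invented; note the identical `Review:`/`Sentiment:` format and the deliberate inclusion of a borderline case):

```
Classify the sentiment of each review as positive, negative, or neutral.

Review: "Arrived quickly and works exactly as described."
Sentiment: positive

Review: "Stopped charging after two weeks. Waste of money."
Sentiment: negative

Review: "It's fine. Does the job, nothing special."
Sentiment: neutral

Review: "I wanted to love this, and parts of it are great, but the battery ruins it."
Sentiment:
```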
Dynamic Few-Shot Selection
In advanced applications, you do not hard-code a fixed set of examples. Instead, you dynamically select the most relevant examples for each input using semantic similarity search:
- Store a large library of input-output examples with their embeddings
- When a new input arrives, compute its embedding
- Select the three to five most semantically similar examples from the library
- Include those examples in the prompt for that specific input
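The selection loop above can be sketched with plain cosine similarity. The embeddings here are hand-written lists of floats standing in for the output of a real embedding model, and the function names are illustrative:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def select_examples(query_embedding: list[float], library: list[dict], k: int = 3) -> list[dict]:
    """Return the k library examples most similar to the query.

    `library` is a list of dicts with keys "input", "output", "embedding".
    """
    ranked = sorted(
        library,
        key=lambda ex: cosine_similarity(query_embedding, ex["embedding"]),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query_text: str, examples: list[dict]) -> str:
    """Format the selected examples plus the new input as a few-shot prompt."""
    parts = [f"Input: {ex['input']}\nOutput: {ex['output']}" for ex in examples]
    parts.append(f"Input: {query_text}\nOutput:")
    return "\n\n".join(parts)
```

In production the library would live in a vector store and the embeddings would come from an embedding API; the ranking and prompt-assembly logic stays the same.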
This dynamic approach typically outperforms a static example set because the examples included are always the most relevant available for the specific input being processed.
Technique 4: Meta-Prompting (Using Claude to Write Your Prompts)
Meta-prompting is the practice of using Claude to generate or improve prompts for other Claude calls. It sounds recursive, but it is genuinely useful.
Prompt Generation
If you know what you want but are struggling to write an effective prompt, describe the task to Claude and ask it to write the prompt for you:
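A meta-prompt along these lines works well (the classification task and its failure mode are an invented example):

```
I need a prompt for the following task, which will run via the Claude API.

Task: classify customer support emails into "billing", "technical",
"account", or "other".
Input: the raw email body.
Output: the category name only, lowercase, nothing else.
Known failure mode: emails that mention an invoice but ask a technical
question get misfiled as "billing".

Write a prompt for this task. Include instructions that address the
failure mode, and explain briefly why you structured the prompt the
way you did.
```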
Claude will often produce a better initial prompt than you would have written from scratch, because it understands its own behaviour well.
Prompt Critique and Improvement
Show Claude your existing prompt and ask it to critique it:
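For example (the `<prompt>` tags and placeholder are illustrative):

```
Here is a prompt I am using in production:

<prompt>
{your_current_prompt}
</prompt>

Critique this prompt. Identify ambiguous instructions, missing
constraints, and places where the model could plausibly misinterpret
the intent. Then write an improved version and list each change you
made with a one-line justification.
```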
This meta-prompting cycle — generate, critique, refine — is how experienced AI engineers develop production-quality prompts.
The Anthropic Workbench Has a Prompt Generator
The Anthropic Console Workbench includes a built-in prompt generator that uses Claude to help you write system prompts. Describe what you want your agent to do, and the tool generates an optimised system prompt as a starting point. It is an excellent place to begin any new prompt engineering project.
Technique 5: Negative Prompting
Negative prompting — explicitly stating what Claude should not do — is underused by most developers. Positive instructions tell Claude what to focus on; negative instructions tell Claude what to avoid. Both are needed for complete specification.
Common Negative Prompt Patterns
- Format exclusions: "Do not use bullet points. Write in flowing prose paragraphs."
- Content exclusions: "Do not mention competitor products or make comparisons to alternative solutions."
- Behaviour exclusions: "Do not ask clarifying questions. Make reasonable assumptions and state them in your response."
- Preamble exclusions: "Do not start your response with phrases like 'Certainly!' or 'Of course!' Go directly to the answer."
- Scope exclusions: "Only discuss the technical implementation. Do not discuss business justification or ROI — that is out of scope."
Technique 6: Self-Consistency Sampling
Some tasks — particularly complex reasoning or consequential decisions — are ones where you cannot trust a single model output. Self-consistency sampling runs the same prompt multiple times at higher temperature and aggregates the results.
How Self-Consistency Works
- Run the same prompt three to five times with a temperature between 0.5 and 1.0
- Compare the outputs across runs
- Select the answer that appears most frequently (for classification tasks) or that is best by some evaluation criterion (for open-ended tasks)
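The aggregation step for a classification task can be sketched as follows. The model call itself is abstracted behind a `sample` callable (an assumption for illustration, not an SDK API); in production it would wrap a Claude API call at temperature 0.5 to 1.0:

```python
from collections import Counter
from typing import Callable

def self_consistent_answer(sample: Callable[[], str], runs: int = 5) -> str:
    """Run the same prompt `runs` times and return the most frequent answer.

    `sample` is any zero-argument function that performs one model call and
    returns the final answer as a string. Answers are normalised before
    voting so that "Yes" and "yes" count as the same answer.
    """
    answers = [sample().strip().lower() for _ in range(runs)]
    most_common, _count = Counter(answers).most_common(1)[0]
    return most_common
```

For open-ended tasks, replace the majority vote with whatever evaluation criterion fits — for example, a separate judging prompt that picks the best of the candidate outputs.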
For critical classifications, code generation for high-stakes applications, and any task where a single wrong answer has significant consequences, self-consistency dramatically improves reliability compared to a single run.
When Self-Consistency is Worth the Cost
Self-consistency runs the inference multiple times, multiplying your token cost. It is worth the cost when:
- The task requires complex multi-step reasoning where a single run is unreliable
- The consequences of an incorrect output are significant
- You are doing one-time analysis on important data, not processing at high volume
Technique 7: Prompt Chaining
Some tasks are too complex for a single prompt. Breaking them into a sequence of smaller prompts — where the output of one becomes the input to the next — produces better results than trying to do everything in one call.
Example: Research Report Pipeline
Rather than one massive prompt that says "Research this topic and write a structured report", decompose it:
- Prompt 1 — Outline generation: "Given this topic, generate a structured outline for a 2,000-word report. Include five main sections with three sub-points each."
- Prompt 2 — Section drafting: Pass the outline to a second prompt: "Using the outline below, write the full text for Section 1 only. 400 words, formal tone."
- Prompt 3 — Review and integrate: Pass all sections: "Review these separately drafted sections for consistency in tone and terminology. Identify any contradictions or gaps, then write a unified introduction and conclusion."
Each step is simpler, better-specified, and easier to verify. This is the foundation of the agentic workflows we cover in Module 5.
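The three-step pipeline above can be sketched as a single function. The model call is injected as a plain callable (`call_model` is an assumption, not an SDK API), so the chaining logic runs without network access and each step can be inspected in isolation:

```python
from typing import Callable

def research_report(topic: str, call_model: Callable[[str], str]) -> str:
    """Three-step prompt chain: outline -> draft sections -> review/integrate.

    `call_model` takes a prompt string and returns the model's text; in
    production it would wrap a Claude API call.
    """
    # Step 1: generate the outline.
    outline = call_model(
        "Given this topic, generate a structured outline for a 2,000-word "
        "report. Include five main sections with three sub-points each.\n\n"
        f"Topic: {topic}"
    )

    # Step 2: draft each of the five sections separately from the outline.
    sections = [
        call_model(
            f"Using the outline below, write the full text for Section {i} "
            f"only. 400 words, formal tone.\n\n<outline>\n{outline}\n</outline>"
        )
        for i in range(1, 6)
    ]

    # Step 3: review the drafts together and integrate.
    return call_model(
        "Review these separately drafted sections for consistency in tone "
        "and terminology. Identify any contradictions or gaps, then write "
        "a unified introduction and conclusion.\n\n" + "\n\n".join(sections)
    )
```

Injecting the model call also makes the pipeline trivial to unit-test with a stub before spending tokens on real inference.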
Error Propagation in Prompt Chains
A mistake in an early step of a prompt chain propagates through to all later steps. Always validate the output of each step before passing it to the next. For automated pipelines, build validation checks between steps — a simple classification prompt that asks "Does this output meet the required format? Yes or No" can catch errors before they cascade.
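One way to sketch such a validation gate between steps, with the model call again abstracted behind a callable (the retry policy and wording of the check prompt are assumptions for illustration):

```python
from typing import Callable

def run_with_validation(prompt: str,
                        call_model: Callable[[str], str],
                        max_retries: int = 2) -> str:
    """Run one chain step, then gate its output with a yes/no format check.

    `call_model` takes a prompt string and returns the model's text. If the
    check answers anything other than "Yes", the step is retried up to
    `max_retries` additional times before failing loudly.
    """
    for _ in range(max_retries + 1):
        output = call_model(prompt)
        verdict = call_model(
            "Does this output meet the required format? Answer Yes or No "
            "only.\n\n<output>\n" + output + "\n</output>"
        )
        if verdict.strip().lower().startswith("yes"):
            return output
    raise ValueError("step output failed validation after retries")
```

Failing loudly between steps is the point: a raised error at step two is far cheaper than a cascade of confidently wrong output at step seven.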
Summary
Advanced prompt engineering is what separates developers who get inconsistent, mediocre results from those who build reliable, high-quality AI systems.
The techniques covered in this post:
- Chain-of-thought: Make Claude reason step by step before answering
- XML structuring: Use tags to organise complex, multi-section prompts
- Few-shot prompting: Use carefully selected examples to communicate subtle patterns
- Meta-prompting: Use Claude to write and improve your own prompts
- Negative prompting: Explicitly exclude unwanted behaviours
- Self-consistency: Run multiple samples and aggregate for reliability
- Prompt chaining: Break complex tasks into sequential, validated steps
In our next post, we explore one of Claude's most powerful capabilities for complex reasoning: Claude Extended Thinking: How to Unlock Deep Reasoning.
Frequently Asked Questions
When should I use chain-of-thought prompting?
Use chain-of-thought prompting when the task involves multi-step reasoning — maths problems, code debugging, logical analysis, or strategic planning. For simple factual retrieval or creative generation, it adds unnecessary token cost. See the Anthropic chain-of-thought guidance for additional patterns.
What is the best way to structure few-shot examples?
The best few-shot examples are representative of your real input distribution, cover edge cases, and follow a perfectly consistent format. Inconsistent example formatting is one of the leading causes of inconsistent outputs. The Prompting Guide's few-shot section has further reading on example selection strategies.
How do I know how many examples to include?
Start with three to five. More is not always better — quality and diversity matter more than quantity. If you have a large library of examples, use dynamic few-shot selection (semantic search to pick the most relevant examples per input) rather than a fixed set.
Can I combine multiple techniques in one prompt?
Yes — combining techniques is standard in production. A typical pattern: XML structure to organise a complex prompt, chain-of-thought to enforce reasoning steps, and few-shot examples to demonstrate the target output format. For agent-scale prompt patterns, see Claude Tool Use Explained and Claude Agentic Loop Explained.
This post is part of the Anthropic AI Tutorial Series. Don't forget to check out our previous post: Prompt Engineering for Claude: The Complete Beginner's Guide.
