Build a Customer Support Chatbot with Claude API

What Makes a Good Customer Support Chatbot?
A good customer support chatbot is built on three things: a well-structured knowledge base it can actually query, a system prompt that constrains its behaviour and sets its personality, and an escalation path that routes genuinely frustrated users to a human agent. Claude's conversational ability handles the language — your design handles the reliability.
Customer support chatbots have a well-earned poor reputation. They fail to understand questions, give irrelevant answers from a FAQ list, and ultimately frustrate customers into demanding a human agent. The difference between a bad support bot and a useful one is not the technology — it is the design of the knowledge base, the system prompt, and the escalation logic.
Claude's conversational ability, combined with a well-structured knowledge base and clear behavioural instructions, produces a support bot that genuinely helps. This project builds a complete customer support chatbot with a static knowledge base, multi-turn conversation management, prompt caching for cost efficiency, and a clear escalation path to human support.
What We Are Building
The chatbot handles these core responsibilities:
- Answers questions from a company knowledge base — products, policies, account management, troubleshooting
- Maintains conversation context across multiple turns — no need for users to repeat themselves
- Recognises when to escalate — complex complaints, billing disputes, and repeated-failure scenarios get handed to a human
- Handles out-of-scope questions gracefully — redirects rather than guessing or confabulating
Prerequisites
- Python 3.9 or later
- pip install anthropic
- An Anthropic API key set as ANTHROPIC_API_KEY
The Knowledge Base
The knowledge base is the most important component of a support bot. Claude should answer only from this content — not from its general training knowledge about your industry. Define it clearly in your system prompt.
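As a concrete sketch, the knowledge base can start as a plain Python dictionary that gets flattened into the system prompt. The section names and policy text below are placeholders — substitute your own documentation:

```python
# A minimal static knowledge base. In a real deployment this content
# comes from your own documentation; these entries are illustrative only.
KNOWLEDGE_BASE = {
    "Shipping policy": (
        "Standard delivery takes 3-5 business days. "
        "Express delivery (1-2 business days) costs an additional fee."
    ),
    "Refunds": (
        "Items can be returned within 30 days of delivery for a full refund, "
        "provided they are unused and in their original packaging."
    ),
    "Account management": (
        "Passwords can be reset from the login page. "
        "Changing the account email requires verifying the old address."
    ),
}


def format_knowledge_base(kb: dict) -> str:
    """Flatten the knowledge base into a single block of text for the system prompt."""
    sections = [f"## {title}\n{body}" for title, body in kb.items()]
    return "# Knowledge base\n\n" + "\n\n".join(sections)
```

Keeping the knowledge base as structured data (rather than one long string) makes it easy to later swap in a database or CMS backend without touching the prompt-assembly code.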
The System Prompt
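One possible shape for the system prompt, combining the flattened knowledge base with explicit behavioural rules. The company name, rules, and the `[ESCALATE]` marker are illustrative conventions, not requirements of the API:

```python
def build_system_prompt(kb_text: str) -> str:
    """Combine behavioural rules with the knowledge base content."""
    return f"""You are the customer support assistant for Acme Ltd.

Rules:
1. Answer ONLY from the knowledge base below. If the answer is not there,
   say so plainly and offer to connect the customer with a human agent.
2. Be concise, friendly, and professional.
3. Escalate immediately if the customer raises a billing dispute, expresses
   strong frustration, or asks for a human. To escalate, begin your reply
   with the single token [ESCALATE] followed by a short handover summary.
4. Never invent policies, prices, or product details.

{kb_text}"""
```

A fixed marker token like `[ESCALATE]` is easy for the surrounding application code to detect with a simple string check, which keeps the escalation path deterministic.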
The Complete Chatbot
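A condensed sketch of the complete bot. The trigger phrases, model alias, and canned escalation reply are placeholders to adapt; the Anthropic client is injected through the constructor so the escalation and history logic can be unit-tested without an API key:

```python
# Hard escalation triggers - checked in code, not left to model judgement.
ESCALATION_TRIGGERS = ("speak to a human", "billing dispute", "this is useless")


class SupportBot:
    """Multi-turn support bot; the Anthropic client is injected for testability."""

    def __init__(self, client, system_prompt: str,
                 model: str = "claude-3-5-haiku-latest"):  # illustrative model alias
        self.client = client
        self.model = model
        self.system_prompt = system_prompt
        self.history: list = []  # alternating user/assistant turns

    @staticmethod
    def needs_escalation(text: str) -> bool:
        """Return True if the message contains an explicit escalation trigger."""
        lowered = text.lower()
        return any(trigger in lowered for trigger in ESCALATION_TRIGGERS)

    def ask(self, user_message: str) -> str:
        if self.needs_escalation(user_message):
            return "I'm connecting you with a human agent now."
        self.history.append({"role": "user", "content": user_message})
        response = self.client.messages.create(
            model=self.model,
            max_tokens=1024,
            system=self.system_prompt,
            messages=self.history,  # full history gives follow-up context
        )
        reply = response.content[0].text
        self.history.append({"role": "assistant", "content": reply})
        return reply


# Usage (requires `pip install anthropic` and ANTHROPIC_API_KEY set):
#   import anthropic
#   bot = SupportBot(anthropic.Anthropic(), system_prompt="...")
#   print(bot.ask("How do I reset my password?"))
```

Passing the full `self.history` on every call is what gives Claude context for follow-up questions; the Messages API is stateless, so the application owns the conversation state.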
Extending the Project
- FastAPI endpoint: Wrap in an API server so any frontend (React, Vue, or a third-party chat widget) can connect to it
- Session storage: Store conversation history in Redis or PostgreSQL keyed by session ID to support concurrent users and persistent sessions
- Dynamic knowledge base: Load the knowledge base from a database or CMS so it can be updated without redeploying the application
- WebSocket streaming: Use the Claude streaming API to send the response to the user as it is generated, reducing perceived latency
- Analytics: Log every conversation with metadata (escalation rate, topic classification, resolution status) to identify knowledge gaps
Prompt Caching Makes Support Bots Much Cheaper
The knowledge base system prompt in this project is large — potentially 10,000+ tokens for a comprehensive product. By adding cache_control to the system prompt, the first request writes the cache at a small premium over the normal input rate; every subsequent request within the cache lifetime (5 minutes, refreshed each time the cache is read) reuses it at roughly 10% of the normal input cost. For a high-volume support bot handling thousands of conversations per day, this is a significant saving.
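A sketch of the cache-enabled request. The `system` parameter accepts a list of content blocks, and marking the knowledge base block with `cache_control` makes it cacheable (the prompt must exceed the model's minimum cacheable length; the model alias below is illustrative):

```python
def cached_system_block(knowledge_base_text: str) -> list:
    """Wrap the large system prompt in a cacheable content block."""
    return [{
        "type": "text",
        "text": knowledge_base_text,
        # Ephemeral cache: lives ~5 minutes, refreshed each time it is read.
        "cache_control": {"type": "ephemeral"},
    }]


# The block is then passed as the `system` parameter:
#   client.messages.create(
#       model="claude-3-5-haiku-latest",       # illustrative model alias
#       max_tokens=1024,
#       system=cached_system_block(kb_text),
#       messages=history,
#   )
```

Only the large, stable knowledge base belongs in the cached block; the per-turn conversation history changes on every request and gains nothing from caching.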
Summary
A well-designed support chatbot has three success factors: a high-quality knowledge base, clear behavioural boundaries, and a reliable escalation path. Claude handles the conversational intelligence — you provide the domain knowledge and the rules.
- Use cache_control on the system prompt to reduce costs at scale
- Maintain full conversation history so Claude has context for follow-up questions
- Define explicit escalation triggers rather than relying on Claude to judge when to escalate
- Use a cheaper model (Haiku) for internal tasks like summary generation
Next post: Project: Build an Automated Meeting Notes Summariser.
For cost optimisation on high-traffic chatbots, see Claude Prompt Caching Guide — prompt caching on the knowledge base system prompt reduces costs by up to 90% for returning users. For RAG-powered chatbots that query large document sets, see Claude RAG: Retrieval Augmented Generation.
External Resources
- Anthropic Prompt Caching documentation — how to reduce costs by up to 90% on repeated system prompts.
- Anthropic conversation memory patterns — official guidance on managing multi-turn conversation history at scale.
This post is part of the Anthropic AI Tutorial Series. Previous post: Project: Build a Smart CV / Resume Analyser with Claude.
