Artificial IntelligenceSoftware DevelopmentProjects

Build Your First AI Coding Agent with the Claude API

TT
TopicTrick
Build Your First AI Coding Agent with the Claude API

Every AI coding agent you have heard of — Cursor, Devin, GitHub Copilot Coding Agent — is built on the same foundation: an LLM with access to tools that can read and write files, execute commands, and observe the results. The specific tools differ; the loop is the same.

What Is an AI Coding Agent, Exactly?

An AI coding agent is a loop: Claude decides which tool to call, your code executes that tool, the result feeds back to Claude, and the cycle repeats until the task is done. The agent reads files, writes code, runs tests, reads the results, and iterates — all without human intervention. Built with Claude's tool use API and a sandboxed executor, this architecture is the foundation of every major commercial coding agent.

In this project you will build that loop yourself. By the end, you will have a working AI coding agent that can:

  • Read any file in your project directory
  • Write and edit source files
  • Execute shell commands (run tests, install packages, run scripts)
  • List directory contents
  • Search files for patterns
  • Iterate on failures until the task is complete

This is not a wrapper around Claude Code. You are building the agent loop from scratch so you understand exactly how it works — and so you can extend it for your own use cases.


Prerequisites

bash

Python 3.11 or later. Set your API key:

bash

You should already understand how Claude's tool use works. If not, read Claude Tool Use Explained first.


The Architecture

The agent is built from three layers:

text

Claude never executes code directly — it returns JSON specifying which tool to call and what arguments to pass. Your Python code runs the tool, captures the output, and returns it to Claude. Claude decides what to do next. This loop repeats until Claude returns a final response with no tool calls.


Step 1: Define the Tools

Tools are defined as JSON schemas. Claude reads these to understand what is available and what arguments each tool expects.

python

Step 2: Implement the Tool Executor

The tool executor is the bridge between Claude's JSON requests and your filesystem. Security is paramount here — restrict all file operations to the project directory.

python

Path Traversal Protection

The _safe_path method resolves the full absolute path and checks it starts with the project root. This blocks directory traversal attacks like '../../etc/passwd'. Never skip this check — Claude may produce unexpected paths when reasoning about a task.


    Step 3: The Agent Loop

    The agent loop is the core of the system. It sends messages to Claude, handles tool calls, feeds results back, and repeats until Claude signals completion.

    python

    Step 4: Run It on a Real Task

    Create a test project to run the agent against:

    bash

    Now run the agent:

    python

    What you will observe:

    text

    The agent autonomously read both files, identified the fix, edited the file, ran the tests, confirmed all 5 passed, and reported completion. Total: 4 iterations, no human intervention.


    Step 5: Try a More Complex Task

    python

    The agent will: read calculator.py, decide where to add the function, check for import math or add it, write the function, read test_calculator.py, add test cases, run all tests, fix any issues, confirm pass.


    Common Failure Modes and Fixes

    Agent loops without progress: Add a stagnation check — if the same tool is called with the same arguments twice, break the loop:

    python

    Context window overflow on large files: Truncate file reads over a character limit and tell Claude:

    python

    Dangerous command execution: Extend the blocklist in run_command and consider adding an approval step for destructive operations in production agents.


    Full Project Structure

    text

    Key Takeaways

    • An AI coding agent is an agentic loop: Claude decides which tools to call → your code executes them → results feed back to Claude → repeat
    • Claude never executes code directly — it returns tool call requests as structured JSON; your code runs the tools safely
    • Path traversal protection in the file tools is non-negotiable — always resolve and validate paths before any filesystem operation
    • Blocklisting dangerous commands in the shell tool prevents the agent from accidentally executing destructive operations
    • The agent works best on well-defined tasks with testable outcomes — it can iterate on failing tests and confirm success automatically
    • Adding context truncation and loop detection makes your agent significantly more robust in production

    What's Next in the AI Coding Agents Series

    1. What Are AI Coding Agents?
    2. AI Coding Agents Compared: GitHub Copilot vs Cursor vs Devin vs Claude Code
    3. Build Your First AI Coding Agent with the Claude API ← you are here
    4. Build an Automated GitHub PR Review Agent
    5. Build an Autonomous Bug Fixer Agent
    6. AI Coding Agents in CI/CD: Automate Code Reviews and Fixes in Production

    This post is part of the AI Coding Agents Series. Previous post: AI Coding Agents Compared: GitHub Copilot vs Cursor vs Devin vs Claude Code.

    For the underlying concepts, see Claude Tool Use Explained, Claude Agentic Loop Explained, and Claude Structured Outputs and JSON. For security considerations when running agents in production, see Basic Threat Detection for Developers.

    External Resources