
Claude on AWS Bedrock: Production Setup Guide


Calling the Anthropic API directly works perfectly for most projects. But enterprise organisations often have requirements that go beyond what a direct API key allows: AWS-managed billing, data residency guarantees, VPC-private inference with no public internet traffic, compliance logging via CloudWatch, and integration with existing AWS IAM governance.

AWS Bedrock solves all of these. It is Amazon's managed service for running foundation models — including Claude — inside your AWS account, with your existing security controls, without data leaving your VPC.

What is Claude on AWS Bedrock?

Claude on AWS Bedrock is Claude's inference capability delivered through Amazon's fully managed model hosting service. Rather than calling Anthropic's API directly, your application calls the Bedrock Runtime through boto3, and AWS handles authentication via IAM roles, routes traffic through your VPC, and logs invocations to CloudWatch — with identical token pricing to the direct Anthropic API and no vendor lock-in at the request format level.

This guide takes you from zero to a production-ready Claude deployment on Bedrock: model activation, IAM policy, boto3 integration, VPC PrivateLink, CloudWatch observability, and a clear comparison to help you decide which approach is right for your workload.


AWS Bedrock vs Anthropic API: Which Should You Use?

  • Use the Anthropic API when you are building a SaaS product, a startup, or any workload where you control all the infrastructure and data residency is not a hard requirement
  • Use AWS Bedrock when your organisation has AWS-first procurement, when data must provably not leave a specific AWS region, when you need VPC-private inference, when you need integration with CloudWatch, S3, or AWS IAM, or when a central AWS bill is required

Bedrock pricing for Claude is identical to Anthropic's direct pricing per token. There is no Bedrock premium. The trade-off is configuration complexity versus governance.


Step 1: Enable Claude Models in the Bedrock Console

By default, foundation model access in Bedrock is disabled. You must request access for each model.

  1. Open the AWS Console and navigate to Amazon Bedrock
  2. In the left navigation, click Model Access
  3. Click Manage Model Access
  4. Find Anthropic in the provider list and tick the models you want: Claude Sonnet, Claude Haiku, Claude Opus
  5. Accept Anthropic's usage policy and click Save Changes
  6. Access is usually approved within minutes. Status will show Access granted

Model Access is Per Region

Bedrock model availability varies by AWS region. Claude is available in us-east-1 (N. Virginia), us-west-2 (Oregon), eu-west-1 (Ireland), eu-central-1 (Frankfurt), ap-southeast-1 (Singapore), and others. Check the Bedrock documentation for the current availability matrix. You must enable models separately in each region you plan to use.


Step 2: IAM Policy for Bedrock Inference

Create a least-privilege IAM policy that allows only the Bedrock inference actions your application needs.

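A minimal sketch of such a policy, assuming a us-east-1 deployment; adjust the region and the model wildcard to match your setup:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "BedrockInvokeOnly",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-*"
      ]
    }
  ]
}
```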

Attach this policy to:

  • An IAM role used by your EC2 instances, ECS tasks, or Lambda functions — using instance profiles or task roles, not access keys
  • An IAM user only for local development — never in production

Step 3: boto3 Integration


Step 4: VPC PrivateLink — Zero Public Internet Traffic

For strict data residency, configure a Bedrock VPC endpoint so inference traffic never leaves AWS's private network.

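One way to create the interface endpoint with the AWS CLI; every resource ID below is a placeholder, and the service name is region-specific:

```bash
# Create an interface VPC endpoint for the Bedrock Runtime.
# All IDs are placeholders -- substitute your own VPC, subnet, and security group.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.bedrock-runtime \
  --subnet-ids subnet-0123456789abcdef0 \
  --security-group-ids sg-0123456789abcdef0 \
  --private-dns-enabled
```

Enabling private DNS is what lets existing boto3 code resolve the standard Bedrock hostname to the endpoint's private IP.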

Once the endpoint is active, all boto3 Bedrock calls from EC2 or ECS resources in that VPC automatically route through the private endpoint. No code changes required — the DNS resolves to the private endpoint IP automatically.

Security Group Configuration

The security group attached to your Bedrock VPC endpoint must allow inbound HTTPS (port 443) from your application's security group. The application's security group must allow outbound HTTPS to the endpoint's security group. This ensures only your application's infrastructure can reach the private Bedrock endpoint.


Step 5: CloudWatch Logging and Observability

Bedrock does not log inference inputs/outputs by default for privacy reasons, but you can enable model invocation logging to S3 or CloudWatch Logs.


Once enabled, every model invocation creates a log entry with:

  • Model ID, region, and timestamp
  • Input and output token counts
  • Request and response content (if text delivery is enabled — consider data sensitivity)
  • Latency and stop reason

Use these logs with CloudWatch Logs Insights to build dashboards tracking cost per service, error rates, and latency percentiles.
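As a starting point, a Logs Insights query like the following aggregates invocations and token volume per model; the field names are my reading of the invocation log schema and should be checked against your actual log entries:

```
fields @timestamp, modelId
| stats count(*) as invocations,
        sum(input.inputTokenCount) as inputTokens,
        sum(output.outputTokenCount) as outputTokens
  by modelId
| sort invocations desc
```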


Cost Management

  • Bedrock pricing equals Anthropic API pricing — there is no additional cost for using the managed service
  • Use AWS Cost Allocation Tags on your IAM roles or via resource tagging to attribute Bedrock spend to individual teams or projects
  • Enable AWS Budgets alerts at $500 / $1,000 thresholds to prevent runaway inference costs during development
  • Use Claude Haiku for high-volume, lower-complexity tasks and Claude Sonnet for production-quality inference — do not default to Opus for everything
  • Provisioned Throughput pricing is available for organisations with consistent high-volume usage — contact AWS about committed-capacity discounts

Use Bedrock Batch Inference for Large Workloads

Bedrock supports batch inference for large-scale, non-latency-sensitive workloads — submit a model invocation job over prompts stored in S3 and retrieve the results asynchronously. Batch inference is priced at 50% of on-demand inference, making it ideal for bulk document processing, dataset annotation, or nightly report generation pipelines.


Production Checklist

  • Model access enabled and confirmed in the correct region
  • IAM policy scoped to specific model ARNs — not a wildcard bedrock:*
  • Credentials come from instance/task role — no access keys in application code or environment variables
  • VPC endpoint configured for private inference if data residency is required
  • CloudWatch logging enabled with alerts on error rate exceeding 1%
  • AWS Budgets alert set for expected monthly Bedrock spend
  • Retry logic with exponential backoff for ThrottlingException and ServiceUnavailableException

Summary

AWS Bedrock is the right choice when your organisation needs Claude's capabilities inside existing AWS governance — VPC-private traffic, IAM access control, AWS billing, and CloudWatch auditing.

  • Migrating from the Anthropic SDK is a transport change, not a rewrite — the request/response body uses the same Anthropic Messages format, routed through boto3 and the Bedrock Runtime endpoint instead of the Anthropic API
  • Model IDs follow the anthropic.claude-<model>-v<n>:0 format — check the Bedrock console for exact current identifiers
  • VPC PrivateLink ensures zero public internet egress for all model inference
  • Cost is identical to direct Anthropic API pricing — you gain governance, not additional cost

Final post in the series: Final Knowledge Test + Anthropic AI Series Recap: Are You Ready to Build?

If you are evaluating whether Bedrock is right for your project, compare it against calling the Claude API directly, and see the Claude model family guide, which covers which models are available on Bedrock versus the direct API.

The AWS Bedrock official documentation is the definitive reference for IAM policy samples, current model IDs, and region availability. The Bedrock model IDs reference page is particularly useful — model identifiers change with new versions and should always be sourced from there rather than hard-coded from tutorials.


This post is part of the Anthropic AI Tutorial Series. Previous post: Project: Build a Data Analyst Agent — CSV Insights in Plain English.
