Building with the Claude API
Go from user to builder. Learn how the Claude API works, make your first API call, and understand how developers integrate Claude into their own products.
From using Claude to building with Claude
You've been using Claude through the web interface — typing prompts, getting responses. But what if you want Claude to power a feature in your own app? An internal tool that summarizes customer tickets automatically. A Slack bot that answers employee questions from your company wiki. A product feature that generates personalized recommendations.
That's what the Claude API is for. It lets developers send prompts to Claude programmatically and get responses back — turning Claude from a tool you use into a tool you build with.
How the API works
The Claude API is a REST API — you send an HTTP request with your prompt, and Claude sends back a response. Every interaction with Claude, whether through the web interface, Claude Code, or a custom app, ultimately goes through this API.
1. Your App sends a message (with your API key) to api.anthropic.com
2. Claude API receives the request and routes it to the model
3. Claude Model processes the prompt and generates a response
4. Response is sent back to your app to display or use
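Under the hood, that message is a plain HTTPS POST. Here is a minimal sketch of the request your app (or the SDK) constructs, assuming the standard Messages endpoint; the payload is only built locally here, not actually sent:

```python
import json

# The Messages endpoint the SDK talks to under the hood
API_URL = "https://api.anthropic.com/v1/messages"

def build_request(api_key: str, prompt: str) -> dict:
    """Construct the pieces of a Messages API call (without sending it)."""
    headers = {
        "x-api-key": api_key,               # your secret key, sent on every call
        "anthropic-version": "2023-06-01",  # pins the API version
        "content-type": "application/json",
    }
    body = {
        "model": "claude-sonnet-4-6",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }
    return {"url": API_URL, "headers": headers, "body": json.dumps(body)}

req = build_request("sk-ant-...", "Explain API rate limiting in one paragraph.")
print(req["url"])
```

The SDK adds these headers for you; this just makes the round trip concrete. Any HTTP client (requests, httpx) could POST `req["body"]` to `req["url"]` with `req["headers"]`.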
Key API concepts
Messages API
The core API is the Messages API. You send a list of messages (conversation turns) and Claude responds:
```python
import anthropic

client = anthropic.Anthropic()

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain API rate limiting in one paragraph."}
    ]
)

print(message.content[0].text)
```
That's it. A few lines of code to send a prompt and get a response back.
Models
You choose which Claude model to use in every API call:
| Model | API name | Best for | Relative cost |
|---|---|---|---|
| Claude Opus | claude-opus-4-6 | Complex analysis, nuanced tasks | $$$ |
| Claude Sonnet | claude-sonnet-4-6 | Balanced everyday use | $$ |
| Claude Haiku | claude-haiku-4-5-20251001 | Fast, simple tasks | $ |
System prompts
System prompts set persistent instructions for the entire conversation:
```python
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    system="You are a customer support agent for Acme Corp. Be friendly, "
           "concise, and always offer to escalate to a human if you can't "
           "help. Never make up product features.",
    messages=[
        {"role": "user", "content": "How do I reset my password?"}
    ]
)
```
The system prompt shapes Claude's behavior for every response — it's the API equivalent of Project instructions.
Multi-turn conversations
Send conversation history to maintain context:
```python
messages = [
    {"role": "user", "content": "I'm building a task management app."},
    {"role": "assistant", "content": "Great! What technology stack are you using?"},
    {"role": "user", "content": "React frontend, Python FastAPI backend."},
    {"role": "assistant", "content": "Solid choice. What feature are you working on?"},
    {"role": "user", "content": "I need to add recurring tasks. How should I model this?"}
]

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=2048,
    messages=messages
)
```
Claude sees the full conversation history and responds with context.
There Are No Dumb Questions
Do I need to send the entire conversation history every time?
Yes. The API is stateless — Claude doesn't remember previous calls. You send the full message history with each request. This gives you complete control over what Claude "remembers."
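Because the API is stateless, the usual pattern is to keep a message list in your app and append each turn before the next call. A minimal sketch of that loop; the `send` function here is a stand-in for `client.messages.create`:

```python
def chat_turn(history: list, user_text: str, send) -> str:
    """Append the user's message, call the model, record the reply."""
    history.append({"role": "user", "content": user_text})
    reply = send(history)  # e.g. client.messages.create(..., messages=history)
    history.append({"role": "assistant", "content": reply})
    return reply

# Fake 'send' for illustration: reports how many turns it was given
fake_send = lambda msgs: f"(model saw {len(msgs)} messages)"

history = []
chat_turn(history, "Hello", fake_send)
chat_turn(history, "Follow-up question", fake_send)
print(len(history))  # 4: two user turns + two assistant turns
```

Each call re-sends the whole list, so the model always sees the full context you chose to keep.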
How much does it cost?
You pay per token (roughly per word). Prices vary by model — Haiku is cheapest, Opus is most expensive. A typical API call processing a few paragraphs costs fractions of a cent. Check Anthropic's pricing page for current rates.
Can I use the API for free?
Anthropic sometimes offers free trial credits for new accounts. After that, it's pay-as-you-go. The cost is very low for moderate usage — many developers spend less than $10/month during development.
Token economics
Understanding tokens is essential for API usage:
What is a token? — A word-piece. "Hello" is 1 token. "Anthropic" is 2-3 tokens. Roughly 1 token ≈ 4 characters of English text.
Input tokens — What you send (your prompt + conversation history). You pay for these.
Output tokens — What Claude generates (the response). These cost more than input tokens.
Max tokens — You set a cap on how long Claude's response can be. This controls cost and response time.
Context window — Claude's total capacity: input + output combined. 200K tokens for the latest models.
Cost optimization tips:
- Use Haiku for simple tasks (classification, extraction)
- Use Sonnet for most tasks (it's the best value)
- Reserve Opus for truly complex analysis
- Set max_tokens to limit response length
- Keep conversation histories trimmed — only include relevant turns
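To see roughly what a call will cost, you can combine the ~4 characters per token rule of thumb with per-token prices. A back-of-the-envelope sketch; the rates below are placeholders, not real prices (check Anthropic's pricing page for current rates):

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters of English per token."""
    return max(1, len(text) // 4)

def estimate_cost(prompt: str, expected_output_tokens: int,
                  in_rate: float, out_rate: float) -> float:
    """Cost in dollars, given per-million-token rates (placeholders here)."""
    input_tokens = estimate_tokens(prompt)
    return (input_tokens * in_rate + expected_output_tokens * out_rate) / 1_000_000

# Hypothetical rates per million tokens, for illustration only;
# note output tokens are priced higher than input tokens
cost = estimate_cost("Summarize this ticket: ..." * 20, 500,
                     in_rate=3.00, out_rate=15.00)
print(f"${cost:.6f}")
```

Even with generous assumptions, a few paragraphs in and out lands well under a cent, which matches the pricing answer above.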
Tool use (function calling)
One of the API's most powerful features: Claude can use tools. You define functions, and Claude decides when and how to call them:
```python
tools = [
    {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g. San Francisco, CA"
                }
            },
            "required": ["location"]
        }
    }
]

message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}]
)
```
Claude responds with a tool call request. Your app executes the function, sends the result back, and Claude incorporates it into its response. This lets Claude interact with databases, APIs, file systems — anything you can write a function for.
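The second half of that round trip looks like this in message form. A sketch of how your app turns Claude's tool call into a tool_result message to send back, assuming a tool_use block shaped the way the API returns it (id, name, input); the weather lookup itself is a stub:

```python
def lookup_weather(location: str) -> str:
    """Stub standing in for a real weather API call."""
    return f"18°C and clear in {location}"

def run_tool_call(tool_use: dict) -> dict:
    """Execute the requested tool and package the result for the next API call."""
    assert tool_use["name"] == "get_weather"
    result = lookup_weather(tool_use["input"]["location"])
    # The follow-up message: a user turn carrying a tool_result block
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use["id"],  # ties the result to Claude's request
            "content": result,
        }],
    }

# Roughly what a tool_use block in Claude's response looks like
tool_use = {"type": "tool_use", "id": "toolu_01ABC", "name": "get_weather",
            "input": {"location": "Tokyo"}}
print(run_tool_call(tool_use)["content"][0]["content"])  # 18°C and clear in Tokyo
```

You append this message to the conversation and call messages.create again; Claude then writes its final answer using the result.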
Streaming
For real-time applications, use streaming to display Claude's response as it's generated:
```python
with client.messages.stream(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a product announcement."}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
Streaming makes your app feel responsive — users see words appearing in real time instead of waiting for the full response.
Real-world API use cases
| Use case | How it works | Model recommendation |
|---|---|---|
| Customer support bot | System prompt with company knowledge + tool use for ticket lookup | Sonnet |
| Document summarizer | Upload docs → summarize with structured output | Sonnet or Haiku |
| Code review tool | Analyze PRs, flag issues, suggest improvements | Sonnet |
| Content generator | System prompt for brand voice, structured templates | Sonnet |
| Data classifier | Categorize inputs into predefined buckets | Haiku (fast + cheap) |
| Research assistant | Extended thinking + tool use for web search | Opus |
Getting started with the API
Step 1: Create an account — Sign up at console.anthropic.com
Step 2: Get your API key — Generate a key in the dashboard. Store it securely — never commit it to code.
Step 3: Install the SDK — pip install anthropic (Python) or npm install @anthropic-ai/sdk (TypeScript)
Step 4: Make your first call — Use the code examples above to send your first prompt
Step 5: Build something real — Start with a simple use case and iterate
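One way to keep your key out of code (step 2) is to load it from an environment variable. The Python SDK reads ANTHROPIC_API_KEY automatically, but checking for it yourself gives a clearer error message. A small sketch:

```python
import os

def get_api_key() -> str:
    """Read the API key from the environment instead of hardcoding it."""
    key = os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError(
            "Set ANTHROPIC_API_KEY in your environment, e.g. "
            "export ANTHROPIC_API_KEY=sk-ant-..."
        )
    return key

# Then construct the client with it:
# client = anthropic.Anthropic(api_key=get_api_key())
```

Keys in environment variables stay out of version control, which is exactly the "never commit it to code" rule from step 2.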
Knowledge Check
1. What is the Claude API?
2. Why do you need to send the full conversation history with each API call?
3. What does 'tool use' (function calling) enable Claude to do?
4. Which model offers the best balance of capability and cost for most API use cases?