Introduction
In the evolving landscape of AI agents, the way we design APIs—particularly tools for large language models (LLMs) like Claude—demands a fresh perspective. Recently, while examining the prompt for Claude Code, I was struck by its detailed tool descriptions. Unlike traditional OpenAPI specifications, which focus primarily on data structures and endpoints, Claude Code's tool prompts emphasize behavioral guidelines, usage scenarios, and constraints. This approach highlights a key insight: APIs for AI agents aren't just about technical interfaces; they're about enabling probabilistic systems to make reliable decisions. Drawing on that example, this article analyzes how APIs for AI agents—encompassing agent tools and MCP tools—should be designed. We'll explore differences from traditional APIs, distill lessons from Claude Code, and outline principles for creating "agent-friendly" APIs.
Traditional APIs vs. APIs for AI Agents
Traditional APIs, often documented via Swagger or OpenAPI, are built for deterministic clients like scripts or human developers. They prioritize data contracts: endpoints, HTTP methods, parameter types, and response schemas. Documentation typically lists what the API does (e.g., "POST /users creates a user") with minimal guidance on when or how to use it in context. Errors are handled via status codes, and behaviors like retries or concurrency are left to the client's implementation.
In contrast, APIs for AI agents must accommodate the probabilistic nature of LLMs. Agents like those in Claude Code don't "execute" code deterministically; they reason over prompts, infer intent, and chain tool calls. This introduces risks like hallucinations (e.g., inventing parameters) or inefficient usage (e.g., over-calling a tool). Thus, agent APIs shift focus:
- From Structure to Behavior: While still using JSON Schemas for inputs/outputs, the design embeds decision-making aids.
- Documentation as a Core Feature: Prompts aren't afterthoughts; they're integral, providing "SOPs" (standard operating procedures) to guide agent reasoning.
- Resilience to Uncertainty: Designs include mechanisms for handling incomplete data, failures, and multi-step tasks, reducing fragility in long reasoning chains.
This isn't about reinventing APIs wholesale but adapting them for AI's strengths (e.g., natural language understanding) and weaknesses (e.g., lack of implicit knowledge).
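To make the contrast concrete, here is a minimal Python sketch of the same search tool described both ways. The tool name and field names are hypothetical illustrations, not drawn from any real specification:

```python
# A traditional, contract-only definition: it declares what the endpoint
# accepts, but says nothing about when or how a client should call it.
traditional_spec = {
    "name": "search_files",
    "parameters": {
        "pattern": {"type": "string", "description": "Text to search for"},
        "path": {"type": "string", "description": "Directory to search"},
    },
}

# An agent-oriented definition keeps the same schema but layers on
# behavioral fields: usage scenarios, hard constraints, and what to
# expect on failure, so the model can reason about the call.
agent_tool_spec = {
    **traditional_spec,
    "when_to_use": "Locating files by content before reading or editing them.",
    "constraints": [
        "Always pass absolute paths.",
        "Never invent matches that were not returned.",
    ],
    "on_empty_result": "An empty list is a valid answer; broaden the pattern.",
}
```

The schema is shared; only the behavioral layer differs, which mirrors the point that agent APIs adapt traditional contracts rather than replace them.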
Lessons from Claude Code’s Tool Design
1. Embedded Behavioral Constraints (The Hard Guardrails)
A core feature of Claude Code’s tools is the integration of non-negotiable rules directly into their definitions. These are "must-obey" policies that actively constrain the agent.
- Examples: The `Bash` tool instructs the agent to "always use absolute paths" and avoid `cd`, preventing state drift. It also forbids using shell commands like `grep`, forcing the agent to use the safer, more structured `Grep` tool.
- Why it Matters: These hard guardrails reduce the agent's error surface by design. They enforce best practices for safety, security, and reproducibility, preventing the agent from taking actions that are inefficient, unsafe, or difficult to predict.
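Beyond prompt text, such guardrails can also be enforced as a pre-flight check on the tool side. The sketch below is a hypothetical validator; the rules and the command-to-tool mapping are illustrative, not Claude Code's actual implementation:

```python
# Hypothetical mapping of forbidden shell commands to their safer,
# structured tool equivalents (illustrative, not an actual Claude Code list).
FORBIDDEN_COMMANDS = {"grep": "Grep", "find": "Glob", "cat": "Read"}

def validate_bash_call(command: str) -> list[str]:
    """Return guardrail violations for a Bash-style tool call.

    An empty list means the call passes the hard constraints.
    """
    violations = []
    tokens = command.split()
    # Hard rule: no `cd` -- absolute paths keep every call stateless.
    if "cd" in tokens:
        violations.append("Do not use 'cd'; pass absolute paths instead.")
    # Hard rule: route searching/reading through structured tools.
    for cmd, replacement in FORBIDDEN_COMMANDS.items():
        if cmd in tokens:
            violations.append(
                f"Use the structured {replacement} tool instead of '{cmd}'."
            )
    return violations
```

Rejecting the call with an explanatory message, rather than failing silently, gives the agent a corrective signal it can act on in its next step.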
2. Documentation as Behavioral Guidance (The Soft Decision Policy)
Distinct from hard constraints, behavioral guidance acts as a "user manual" that teaches the agent how to make good choices. This guidance is normative ("should") rather than binding ("must").
- Examples: The `TodoWrite` tool is recommended for complex, multi-step tasks but discouraged for trivial ones. The agent is guided to "prefer specialized search tools over generic ones" and to "gather more evidence rather than guessing" when faced with uncertainty.
- How it Differs from Constraints: Guidance shapes the agent's selection and sequencing of tools, while constraints limit its actions within a tool. Guidance builds good habits; constraints prevent bad outcomes.
- Why it Matters: Agents operate with varying degrees of confidence. This soft policy layer helps them navigate ambiguity and learn idiomatic usage patterns without being overly rigid, fostering more effective and human-like problem-solving.
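The two layers can coexist in a single description. Below is a hypothetical sketch in the spirit of TodoWrite (the wording is invented for illustration), plus a trivial classifier showing how "should" language marks soft policy while "must"/"never" language marks hard constraints:

```python
# A hypothetical description in the spirit of TodoWrite: normative
# "should" wording shapes tool selection without forbidding anything,
# while "must" wording states a hard constraint.
TODO_WRITE_DESCRIPTION = """\
Use this tool to plan and track complex, multi-step tasks.
You SHOULD create a todo list before starting any task with three or more steps.
You SHOULD NOT use it for trivial, single-step requests, where it adds noise.
When uncertain, PREFER gathering more evidence over guessing.
You MUST NOT mark a step as completed before it has actually finished.
"""

def is_soft_guidance(line: str) -> bool:
    """Heuristic: 'should'/'prefer' marks soft policy; 'must'/'never'
    marks a hard constraint. Hard markers win when both appear."""
    lowered = line.lower()
    if "must" in lowered or "never" in lowered:
        return False
    return "should" in lowered or "prefer" in lowered
```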
3. Support for Probabilistic Reasoning (Building Resilience to Uncertainty)
LLMs are inherently probabilistic, which can lead to unpredictable behavior in long-running tasks. Agent-friendly APIs anticipate this by building in resilience. This isn't about changing the reasoning itself, but about making the tool's interaction with that reasoning more robust.
- Examples: To manage context limits, tools like `Grep` offer a `head_limit` to truncate large outputs. To ensure task integrity, `MultiEdit` provides atomic, all-or-nothing operations. To handle ambiguity, tools follow a "return empty, don't fabricate" policy and provide clear instructions for handling web redirects.
- Why it Matters: These features act as shock absorbers. They prevent the agent from being overwhelmed by data, getting stuck in partial-failure states, or hallucinating results when information is absent. They make the agent's interaction with the world more predictable and reliable.
4. Scenario-Centric Usage with Expected Outcomes
Instead of just listing parameters, Claude Code's documentation outlines canonical scenarios, complete with procedural steps and expected results—including failures.
- Examples: A tool's documentation might state, "If a file search returns no matches, the expected output is an empty list. The next logical step is to broaden the search pattern."
- Why it Matters: Providing clear success and failure scenarios gives the agent a template for action and recovery. It helps the agent verify its work and teaches it how to self-correct when a tool call doesn't yield the expected result, reducing aimless retries.
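A tool can carry this scenario knowledge in its responses as well as its documentation. The sketch below shows a hypothetical response shape in which an empty result is a documented state with a suggested next step, not a dead end:

```python
def search_result(matches: list[str]) -> dict:
    """Wrap raw matches in a scenario-aware response: success and empty
    are both expected, documented outcomes (the shape is illustrative)."""
    if matches:
        return {"status": "success", "matches": matches}
    return {
        "status": "empty",
        "matches": [],
        # Recovery hint mirrors the documented next step for this scenario.
        "suggestion": "No matches found. Broaden the search pattern before retrying.",
    }
```

Returning the recovery hint alongside the empty result gives the agent the same template for self-correction that the documentation describes, at the moment it needs it.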
Conclusion
The primary lesson from Claude Code is that designing APIs for AI agents requires a shift in focus: from what the API does to when and how an agent should use it, and what to expect when it does. The future of agent-ready tools lies not in reinventing API protocols but in augmenting them with a behavioral layer. By embedding constraints, providing clear guidance, and designing for resilience, we can create APIs that empower agents to act as capable, reliable, and safe partners in complex tasks.