# Agent Tool Risk Annotations and Approval Boundaries

Status: public · Confidence: medium (0.865) · Basis: verified_sources

## TL;DR

Tool schemas say what an agent can call; risk annotations and approval boundaries say whether a given call is read-only, needs approval first, can be undone, or must be blocked.

## Core Explanation

Agent platforms expose external capabilities through tools. The schema helps the model form valid arguments, but it does not by itself answer whether a call can delete data, spend money, contact an external system, or change production state.
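
To make the gap concrete, here is a minimal sketch in which two tools share an identical parameter schema, so the same arguments validate for both even though one only reads and the other deletes. The tool names (`get_record`, `delete_record`), the `TOOLS` layout, and the `args_are_valid` helper are hypothetical; the helper is a tiny stand-in for real JSON Schema validation, not a complete validator.

```python
# Hypothetical shared schema: both tools take a single string id.
RECORD_ID_SCHEMA = {
    "type": "object",
    "properties": {"record_id": {"type": "string"}},
    "required": ["record_id"],
}

# Hypothetical tools with the same schema but very different effects.
TOOLS = {
    "get_record": {
        "description": "Fetch a record by id.",
        "schema": RECORD_ID_SCHEMA,
    },
    "delete_record": {
        "description": "Permanently delete a record by id.",
        "schema": RECORD_ID_SCHEMA,
    },
}


def args_are_valid(schema: dict, args: dict) -> bool:
    """Stand-in for schema validation: required keys present, no unknown
    keys, and string-typed properties hold strings."""
    if any(key not in args for key in schema.get("required", [])):
        return False
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            return False
        if spec.get("type") == "string" and not isinstance(value, str):
            return False
    return True


call_args = {"record_id": "r-123"}
results = {
    name: args_are_valid(tool["schema"], call_args)
    for name, tool in TOOLS.items()
}
print(results)
```

Both entries validate, which is the point: argument validity carries no information about what the call does to the system.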

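Risk annotations give the orchestration layer a machine-readable signal for deciding when approval is needed. They are hints, not guarantees: a tool's server can mislabel them, so they should only be relied on when they come from a server the deployment already trusts. Pair them with runtime enforcement: least-privilege credentials, scoped OAuth grants, idempotency controls, audit logs, and human approval for destructive or open-world actions.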
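
One way an orchestration layer might turn annotations into an approval decision is sketched below. Everything here is an assumption for illustration: the `Decision` enum, the `decide` function, the `blocked` deny-list, and the `server_trusted` flag are hypothetical names, and a missing hint is treated as its most cautious value. The policy follows the rule stated above: destructive or open-world calls require a human, and annotations from an untrusted server are not taken at face value.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


@dataclass(frozen=True)
class Annotations:
    # Assumption of this sketch: an absent hint takes the cautious value.
    read_only: bool = False
    destructive: bool = True
    idempotent: bool = False
    open_world: bool = True


def parse_annotations(raw: dict) -> Annotations:
    """Read the four hint fields from a raw annotations mapping."""
    return Annotations(
        read_only=bool(raw.get("readOnlyHint", False)),
        destructive=bool(raw.get("destructiveHint", True)),
        idempotent=bool(raw.get("idempotentHint", False)),
        open_world=bool(raw.get("openWorldHint", True)),
    )


def decide(
    tool_name: str,
    raw_annotations: dict,
    *,
    server_trusted: bool,
    blocked: frozenset = frozenset(),
) -> Decision:
    # A deployment-level deny-list wins over anything the tool claims.
    if tool_name in blocked:
        return Decision.BLOCK
    # Hints from an untrusted server are not evidence of safety.
    if not server_trusted:
        return Decision.REQUIRE_APPROVAL
    hints = parse_annotations(raw_annotations)
    # A read-only tool cannot be destructive, whatever the other hint says.
    destructive = hints.destructive and not hints.read_only
    if destructive or hints.open_world:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW


print(decide("list_files", {"readOnlyHint": True, "openWorldHint": False},
             server_trusted=True))
print(decide("delete_file", {"destructiveHint": True, "openWorldHint": False},
             server_trusted=True))
```

The decision is only a gate in front of the call. The credentials, scopes, and audit logging listed above still have to hold if the gate is wrong.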

## Source-Mapped Facts

- The Model Context Protocol tool specification defines annotations including readOnlyHint, destructiveHint, idempotentHint, and openWorldHint. ([source](https://modelcontextprotocol.io/specification/2025-06-18/server/tools))
- OpenAI function calling documentation describes tools as functions with names, descriptions, and JSON Schema parameters. ([source](https://developers.openai.com/api/docs/guides/function-calling))
- Anthropic tool use documentation describes tools with a name, description, and input_schema. ([source](https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview))
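
Because the model-facing formats above carry only a name, a description, and a parameter schema, an orchestrator that bridges from MCP has to keep the annotations somewhere of its own. The sketch below assumes an MCP-style tool dictionary with `inputSchema` and `annotations` keys and splits it into a model-facing definition (using the `name`, `description`, `input_schema` shape from the list above) and a separate risk record. The `split_tool` helper and the `delete_record` tool are hypothetical.

```python
def split_tool(mcp_tool: dict) -> tuple[dict, dict]:
    """Separate what the model sees from what the approval layer needs."""
    model_facing = {
        "name": mcp_tool["name"],
        "description": mcp_tool.get("description", ""),
        "input_schema": mcp_tool["inputSchema"],
    }
    # Copy so later changes to the registry do not mutate the source tool.
    risk = dict(mcp_tool.get("annotations", {}))
    return model_facing, risk


# Hypothetical MCP-style tool definition with all four hints set.
delete_record = {
    "name": "delete_record",
    "description": "Permanently delete a record by id.",
    "inputSchema": {
        "type": "object",
        "properties": {"record_id": {"type": "string"}},
        "required": ["record_id"],
    },
    "annotations": {
        "readOnlyHint": False,
        "destructiveHint": True,
        "idempotentHint": True,
        "openWorldHint": False,
    },
}

model_facing, risk = split_tool(delete_record)

# The orchestrator keeps risk data keyed by tool name, outside the prompt.
risk_registry = {model_facing["name"]: risk}
print(sorted(model_facing))
print(risk_registry["delete_record"]["destructiveHint"])
```

Keeping the risk record out of the model-facing definition means the approval decision does not depend on anything the model is shown or can restate.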

## Further Reading

- [Model Context Protocol Tools Specification](https://modelcontextprotocol.io/specification/2025-06-18/server/tools)
- [OpenAI Function Calling Guide](https://developers.openai.com/api/docs/guides/function-calling)
- [Anthropic Tool Use Overview](https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview)