# Agent Durable Execution

Status: public · Confidence: medium (0.725) · Basis: verified_sources

## TL;DR

Agent durable execution persists the state of a multi-step agent run so the run can recover from failures, deploys, and retries, and can resume after pauses and long waits.

## Core Explanation

Agents that use tools often run longer than a single model call. They may wait for human approval, call APIs, branch on results, or resume after a webhook. Durable execution makes those workflows explicit by storing state, event history, pending actions, and checkpoints.
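As a rough illustration of what such a stored record might contain, the sketch below models one run as a serializable object. The `RunState` class, its field names, and the `send_invoice` tool are hypothetical and chosen only for this example; real runtimes define their own schemas.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RunState:
    """Hypothetical durable record for one agent run (illustrative only)."""
    run_id: str
    status: str = "running"                      # e.g. running, waiting, done
    state: dict = field(default_factory=dict)    # working data needed to continue
    history: list = field(default_factory=list)  # append-only event log
    pending: list = field(default_factory=list)  # actions requested but not finished
    checkpoint: int = 0                          # number of snapshots taken so far

    def record(self, event_type: str, **data) -> None:
        """Append an event with a sequence number; earlier events are never edited."""
        self.history.append({"seq": len(self.history), "type": event_type, **data})

    def snapshot(self) -> str:
        """Bump the checkpoint counter and serialize the whole record to JSON."""
        self.checkpoint += 1
        return json.dumps(asdict(self))

    @classmethod
    def restore(cls, raw: str) -> "RunState":
        """Rebuild the record from a previously stored snapshot."""
        return cls(**json.loads(raw))


run = RunState(run_id="run-1")
run.record("tool_requested", tool="send_invoice")
run.pending.append({"tool": "send_invoice", "needs_approval": True})
run.status = "waiting"

saved = run.snapshot()              # persist before waiting on a human
restored = RunState.restore(saved)  # e.g. after a deploy or restart
```

Because the snapshot is plain JSON, it could be written to a file, a database row, or a queue message; the storage backend is left out of the sketch on purpose.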

Without durability, a crash can lose the plan, repeat a side effect, or strand a user mid-task. With durability, the runtime can resume from a known state, replay deterministic workflow logic, and keep tool calls and approvals auditable.
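The resume behavior can be shown with a minimal sketch that checkpoints to a JSON file after each step. The `run_steps` function, the `fail_before` crash hook, and the step names are assumptions made for this example and do not correspond to any particular framework.

```python
import json
import tempfile
from pathlib import Path


def run_steps(steps, checkpoint_path: Path, fail_before=None):
    """Run named steps in order, checkpointing after each one.

    `fail_before` is a hypothetical hook that simulates a crash
    just before the named step runs.
    """
    # Load the last checkpoint if one exists; otherwise start fresh.
    if checkpoint_path.exists():
        done = json.loads(checkpoint_path.read_text())
    else:
        done = {}

    for name, fn in steps:
        if name in done:
            continue  # finished before the crash: reuse the recorded result
        if name == fail_before:
            raise RuntimeError(f"simulated crash before {name}")
        done[name] = fn()
        # Persist after every step so a restart resumes from here.
        checkpoint_path.write_text(json.dumps(done))
    return done


calls = []  # stands in for an external system that observes side effects


def charge_card():
    calls.append("charge")
    return "charged"


def send_receipt():
    calls.append("receipt")
    return "sent"


steps = [("charge_card", charge_card), ("send_receipt", send_receipt)]
path = Path(tempfile.mkdtemp()) / "run-1.json"

try:
    run_steps(steps, path, fail_before="send_receipt")
except RuntimeError:
    pass  # the process "crashed" after charging the card

result = run_steps(steps, path)  # restart: charge_card is skipped
print(calls)  # ['charge', 'receipt']
```

The sketch still has a gap: a crash after a step's side effect but before the checkpoint write would repeat that step on restart. Closing that gap usually requires the side effect itself to be idempotent, for example by sending an idempotency key with the request.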

## Source-Mapped Facts

- Temporal workflow documentation describes workflow executions as durable, reliable, and scalable function executions. ([source](https://docs.temporal.io/workflows))
- LangGraph persistence documentation describes checkpointers that save graph state at every super-step. ([source](https://docs.langchain.com/oss/python/langgraph/persistence))
- Azure Durable Functions documentation describes durable orchestrations as stateful workflows that can checkpoint progress and replay from checkpoints. ([source](https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-overview?tabs=python-v2))
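The checkpoint-and-replay idea in these descriptions can be sketched in plain Python with a generator. This is a toy model, not the API of Temporal, LangGraph, or Durable Functions; `orchestrator`, `replay`, and the activity names are hypothetical.

```python
def orchestrator():
    """Deterministic workflow logic: yields activity names, receives results."""
    quote = yield "get_quote"
    approved = yield "ask_approval"
    if approved:
        receipt = yield "place_order"
        return f"ordered at {quote}: {receipt}"
    return "declined"


def replay(workflow, history, activities):
    """Drive `workflow`, answering from `history` first, then running live.

    `history` is a list of (activity_name, result) pairs, extended in place.
    Returns the workflow's return value and the activities run in this call.
    """
    gen = workflow()
    executed = []
    index = 0
    try:
        request = next(gen)
        while True:
            if index < len(history):
                recorded_name, result = history[index]
                # Replay is only valid if the logic asks for the same thing again.
                if recorded_name != request:
                    raise RuntimeError("non-deterministic workflow")
            else:
                result = activities[request]()
                history.append((request, result))
                executed.append(request)
            index += 1
            request = gen.send(result)
    except StopIteration as stop:
        return stop.value, executed


activities = {
    "get_quote": lambda: 42,
    "ask_approval": lambda: True,
    "place_order": lambda: "receipt-1",
}

# History persisted before a restart: the first two activities already ran.
history = [("get_quote", 42), ("ask_approval", True)]
outcome, executed = replay(orchestrator, history, activities)
```

Here only `place_order` runs after the restart; the earlier results come from the recorded history. The approach depends on the workflow logic being deterministic, which is why the sketch raises an error when a replayed request does not match the record.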

## Further Reading

- [Temporal workflows](https://docs.temporal.io/workflows)
- [LangGraph persistence](https://docs.langchain.com/oss/python/langgraph/persistence)
- [Azure Durable Functions overview](https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-overview?tabs=python-v2)