RPO and RTO Planning AGENTS.md Template

Overview

This AGENTS.md template for RPO/RTO planning gives AI coding agents a structured, copyable operating context for coordinating recovery point objectives (RPO) and recovery time objectives (RTO) across people, tools, data, and processes. It supports both single-agent execution and multi-agent orchestration.

The template standardizes how agents plan, execute, validate, and escalate during recovery planning exercises, outages, and post-recovery reviews. It defines roles, handoff rules, memory and source-of-truth conventions, tool governance, and security constraints so teams can run rehearsals and live recoveries with predictable outputs.

When to Use This AGENTS.md Template

During disaster recovery planning to align RPO and RTO targets with data retention and backup capabilities.
When coordinating multiple agents across planning, implementation, testing, and review phases.
For creating a reusable operating manual that can be version controlled alongside runbooks and backup catalogs.
To enforce tool governance and secure, auditable handoffs between agents and human reviewers.

Copyable AGENTS.md Template

# AGENTS.md

Project: RPO/RTO Orchestration for Cloud Services

Agent roster and responsibilities:
- Planner: defines objectives, coordinates tasks, ensures alignment with business continuity requirements
- Implementer: translates the plan into executable runbooks and scripts; executes recovery steps
- Reviewer: validates outputs, ensures compliance with backup policies
- Tester: runs failover tests and validates RPO/RTO targets
- Researcher: gathers data about backup sources, retention windows, and dependencies

Supervisor or orchestrator behavior:
- Orchestrator coordinates all agents, enforces constraints, logs decisions, and escalates risk

Handoff rules:
- Planner -> Implementer: after plan is finalized and risks are mitigated
- Implementer -> Tester: after scripts are prepared and changed code is committed
- Tester -> Reviewer: after tests pass; Reviewer signs off before production
- Researcher -> Planner: updates requirements and facts as sources change

Context, memory, and source of truth:
- Maintain a single source of truth: runbooks.md, backup catalogs, CMDB; memory persists decisions
- Use runbooks as the canonical source for steps and parameters
- All outputs must reference the source of truth IDs

Tool access and permission rules:
- Access tokens limited to required scopes; secrets stored in vault
- Production access restricted; require multi-person approval before applying changes to production
- Tools: backup API, monitoring, ticketing, CMDB queries with read/write only where approved

Architecture rules:
- Microservice-oriented with stateless agents; all operations must be idempotent
- Retries must be safe to repeat; every action must be traceable

File structure rules:
- Keep runbooks, templates, and scripts in dedicated folders under rpo-rto-planning

Data, API, or integration rules:
- Always verify API version compatibility; do not bypass authentication

Validation rules:
- Validate inputs and outputs; compare to SLOs; verify RPO/RTO against targets

Security rules:
- Do not log secrets; rotate tokens; encrypt sensitive data at rest and in transit

Testing rules:
- Unit tests for individual components; integration tests for end-to-end RPO/RTO flows; continuous validation in CI

Deployment rules:
- Deploy with canary or blue-green releases; maintain rollback procedures

Human review and escalation rules:
- Escalate to the on-call engineer if RTO risk rises above the agreed threshold

Failure handling and rollback rules:
- If a step fails, rollback to last known good state; notify stakeholders

Things agents must not do:
- Do not modify production settings without approval
- Do not skip tests
- Do not drift from runbooks
- Do not run production changes unsupervised

Recommended Agent Operating Model

The operating model assigns clear responsibilities and decision boundaries for RPO and RTO planning. The Planner makes strategic decisions and performs quality checks, the Implementer executes runbooks, the Tester validates outputs against targets, the Reviewer ensures compliance, and the Researcher keeps sources current. The Orchestrator coordinates the agents, enforces constraints, and escalates when risk crosses a threshold. Escalation paths route critical issues to on-call engineers and SRE leadership as needed.
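The threshold-based escalation described here can be sketched as a small function. This is a minimal illustration, not part of the template: the `RecoveryStatus` fields, the 0.8 warning ratio, and the mapping of ratios to "on_call" and "sre_leadership" are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecoveryStatus:
    """Hypothetical snapshot the Orchestrator evaluates during a recovery."""
    elapsed_minutes: float              # time since the outage was declared
    estimated_remaining_minutes: float  # current estimate to restore service
    rto_minutes: float                  # agreed RTO target


def escalation_level(status: RecoveryStatus, warn_ratio: float = 0.8) -> str:
    """Return 'none', 'on_call', or 'sre_leadership'.

    The projected recovery time is compared with the RTO target. The
    warn_ratio default of 0.8 is an assumed value, not one the template sets.
    """
    if status.rto_minutes <= 0:
        raise ValueError("rto_minutes must be positive")
    projected = status.elapsed_minutes + status.estimated_remaining_minutes
    ratio = projected / status.rto_minutes
    if ratio >= 1.0:
        return "sre_leadership"  # projected to miss the RTO target
    if ratio >= warn_ratio:
        return "on_call"         # close to the target, involve on-call
    return "none"
```

Keeping the decision in one pure function lets the Tester exercise escalation behavior in rehearsals without touching any paging system.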

Recommended Project Structure

/rpo-rto-planning/
  /ai-skills/agents-md-templates/
    /planner/
    /orchestrator/
    /implementer/
    /reviewer/
    /tester/
    /researcher/
  /runbooks/
  /templates/
  /workflows/
  /docs/
  /tests/

Core Operating Principles

Operate against defined RPO and RTO targets with auditable traces
Maintain a single source of truth for runbooks, backups, and CMDB
Enforce least privilege and explicit approvals for production changes
Ensure idempotent actions and clear rollback paths
Document decisions and rationale within the AGENTS.md context

Agent Handoff and Collaboration Rules

Planner to Implementer: deliver plan artifacts, risk mitigations, and acceptance criteria
Implementer to Tester: deliver executable runbooks, scripts, and logs
Tester to Reviewer: present test results, evidence, and compliance checks
Researcher to Planner: update data sources, retention, and dependency maps
Orchestrator: enforce transitions, track progress, and escalate when an SLA is at risk of breach
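One way an Orchestrator might enforce these transitions is a lookup table of allowed handoffs and the artifacts each one requires. The sketch below is an assumption-laden illustration: the artifact names are short labels derived from the list above, and `check_handoff` and `HandoffError` are hypothetical names.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

# Allowed (sender, receiver) pairs and the artifacts each handoff must carry.
# Artifact labels are illustrative shorthand for the items listed above.
ALLOWED_HANDOFFS: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("researcher", "planner"): frozenset(
        {"data_sources", "retention", "dependency_map"}),
    ("planner", "implementer"): frozenset(
        {"plan", "risk_mitigations", "acceptance_criteria"}),
    ("implementer", "tester"): frozenset(
        {"runbooks", "scripts", "logs"}),
    ("tester", "reviewer"): frozenset(
        {"test_results", "evidence", "compliance_checks"}),
}


class HandoffError(Exception):
    """Raised when a handoff is undefined or is missing required artifacts."""


def check_handoff(sender: str, receiver: str, artifacts: Iterable[str]) -> None:
    """Raise HandoffError unless the transition is allowed and complete."""
    required = ALLOWED_HANDOFFS.get((sender, receiver))
    if required is None:
        raise HandoffError(f"handoff {sender} -> {receiver} is not defined")
    missing = required - set(artifacts)
    if missing:
        raise HandoffError(
            f"handoff {sender} -> {receiver} is missing: {sorted(missing)}")
```

A call such as `check_handoff("planner", "tester", ["plan"])` raises `HandoffError`, because the template defines no direct Planner to Tester transition.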

Tool Governance and Permission Rules

Grant tools only the scopes they require; store secrets in a vault; rotate tokens regularly
Production actions require on-call approval and a change control ticket
Log every tool call with its runbook ID and agent ID
Do not bypass authentication or access controls for backup services
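These rules could be applied as a gate in front of every tool call. The following is a rough sketch under stated assumptions: the `ToolCall` fields, the environment names, and the JSON audit format are hypothetical, and a real deployment would verify approvals against its ticketing system rather than trust a flag.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical description of one tool invocation by an agent."""
    agent_id: str
    runbook_id: str
    tool: str
    environment: str                       # e.g. "staging" or "production"
    approval_ticket: Optional[str] = None  # change control ticket ID, if any
    approved_by_on_call: bool = False


def authorize(call: ToolCall) -> bool:
    """Allow non-production calls; production needs approval and a ticket."""
    if call.environment != "production":
        return True
    return call.approved_by_on_call and bool(call.approval_ticket)


def audit_record(call: ToolCall, allowed: bool) -> str:
    """Build one JSON audit line carrying the runbook ID and agent ID."""
    return json.dumps(
        {
            "agent_id": call.agent_id,
            "runbook_id": call.runbook_id,
            "tool": call.tool,
            "environment": call.environment,
            "approval_ticket": call.approval_ticket,
            "allowed": allowed,
        },
        sort_keys=True,
    )
```

Recording denied calls as well as allowed ones keeps the audit trail complete when an agent attempts a production action without approval.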

Code Construction Rules

Write modular, testable code with clear inputs and outputs
Avoid hard-coded values; reference configuration via runbooks
Document decisions and edge cases in the AGENTS.md context
Ensure idempotent operations and deterministic behavior
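Idempotency is often implemented by recording which steps have already completed, so that a retry returns the earlier result instead of repeating the action. The sketch below assumes an in-memory ledger and a hypothetical key format of runbook ID plus step name; a real agent would persist the ledger so it survives restarts.

```python
from typing import Callable, Dict


class StepLedger:
    """In-memory record of completed steps, keyed by an idempotency key."""

    def __init__(self) -> None:
        self._done: Dict[str, str] = {}

    def run_once(self, key: str, action: Callable[[], str]) -> str:
        """Run action the first time a key is seen; replay its result later."""
        if key in self._done:
            return self._done[key]  # retry: skip the side effect
        result = action()           # first attempt: perform the step
        self._done[key] = result    # recorded only if action() did not raise
        return result
```

Because a step is recorded only after it succeeds, a failed step is attempted again on the next retry, while a completed one is never repeated.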

Security and Production Rules

Follow least privilege and secrets management best practices
Encrypt data in transit and at rest; monitor for anomalous access
Maintain audit trails for all production changes
Enforce approvals and change control for production deployments
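Keeping secrets out of logs can be partly automated with a filter on the standard `logging` module. This is a minimal sketch, assuming secrets appear as `key=value` pairs with one of a few illustrative key names; it would not catch secrets in other shapes and is no substitute for keeping them out of log calls in the first place.

```python
import logging
import re

# Assumed pattern: key=value pairs whose key suggests a secret.
SECRET_PATTERN = re.compile(r"(?i)\b(token|password|secret|api_key)=\S+")


def redact(text: str) -> str:
    """Replace the value of any matching key=value pair with [REDACTED]."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


class RedactSecrets(logging.Filter):
    """Logging filter that redacts secrets from each record's final message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so secrets passed as args are covered too.
        record.msg = redact(record.getMessage())
        record.args = ()
        return True  # keep the record, now redacted


logger = logging.getLogger("recovery")
logger.addFilter(RedactSecrets())
```

With the filter attached, a call such as `logger.warning("retrying with token=%s", value)` is emitted with the value replaced by `[REDACTED]`.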

Testing Checklist

Unit tests for individual components
Integration tests for data sources and backup APIs
End-to-end RPO/RTO validation with reproducible scenarios
Disaster recovery drills and post-exercise reviews
Security and access control tests
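The RPO/RTO validation item can be made concrete with a small check over drill timestamps. Achieved RPO is taken here as the gap between the last restorable backup and the start of the outage, and achieved RTO as the time from outage start to service restoration. The `DrillResult` type and field names are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Union


@dataclass(frozen=True)
class DrillResult:
    """Hypothetical timestamps captured during a failover test."""
    last_good_backup: datetime  # most recent restorable recovery point
    outage_start: datetime
    service_restored: datetime


def validate_drill(
    result: DrillResult, rpo_target: timedelta, rto_target: timedelta
) -> Dict[str, Union[timedelta, bool]]:
    """Compare achieved RPO and RTO from one drill against their targets."""
    achieved_rpo = result.outage_start - result.last_good_backup
    achieved_rto = result.service_restored - result.outage_start
    return {
        "achieved_rpo": achieved_rpo,
        "achieved_rto": achieved_rto,
        "rpo_met": achieved_rpo <= rpo_target,
        "rto_met": achieved_rto <= rto_target,
    }
```

For a drill with a backup at 09:45, an outage at 10:00, and restoration at 10:50, the achieved RPO is 15 minutes and the achieved RTO is 50 minutes, so both checks pass against targets of 15 and 60 minutes.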

Common Mistakes to Avoid

Ambiguous handoffs or missing acceptance criteria
Drifting from runbooks or undocumented changes
Unsecured secrets or overly broad tool access
Unclear source of truth or missing audit trails
Skipping validation of RPO/RTO results

FAQ

What is the purpose of this AGENTS.md Template for RPO/RTO planning?

It provides a copyable, project-level operating context for AI coding agents to coordinate recovery objectives, runbooks, and governance across multi-agent workflows.

How does it support multi-agent orchestration across recovery activities?

It defines an agent roster, handoff rules, source of truth, and a supervisor orchestrator that coordinates tasks and enforces policy across agents.

How are RPO and RTO targets validated within this template?

RPO/RTO targets are validated by the Tester through failover tests, comparisons to SLOs, and runbooks that specify acceptance criteria.

How do agent handoffs work between planner, implementer, tester, and reviewer?

Handoffs occur at defined milestones with artifacts required for the next role; the orchestrator enforces transitions and escalation rules as needed.

What security considerations are embedded in this template for production scenarios?

Secrets are stored in a vault and never logged, production changes require approval, access follows least privilege, sensitive data is encrypted at rest and in transit, and all actions leave an audit trail.

Target User

Use Cases