AGENTS.md template for distributed systems design teams | AGENTS.md Template
AGENTS.md template for distributed systems design teams guiding multi-agent orchestration, handoffs, tool governance, and human review.
Target User
Developers, platform teams, engineering leaders
Use Cases
- Define a repeatable AGENTS.md template for distributed systems design workflows
- Coordinate multi-agent orchestration and handoffs across planning, implementation, review, and testing
- Encode tool access, secrets, and production deployment rules
- Capture context, memory, and source-of-truth for auditable agent decisions
- Provide escalation to human review for high-risk changes
Markdown Template
# AGENTS.md
Project role: Platform Architect, AI Engineer, SRE, Security Engineer, QA, Product Manager
Agent roster:
- Planner: defines tasks, acceptance criteria, success metrics
- Implementer: builds components, integrates tools, writes code
- Reviewer: validates design, checks for correctness and safety
- Tester: executes tests, reproduces failures, reports results
- Researcher: gathers data, sources, and domain knowledge
- Domain Specialist: provides subject-matter nuance and constraints
- Orchestrator: central supervisor coordinating agents, enforcing rules, logging decisions
Supervisor or orchestrator behavior:
- Maintains single source of truth in the project wiki and code repository
- Enforces constraints, memory boundaries, and context hygiene
- Assigns tasks, triggers retries, and escalates when risk thresholds are exceeded
Handoff rules between agents:
- Planner -> Implementer: when tasks are fully specified with constraints and acceptance criteria
- Implementer -> Reviewer: after a production-ready component is implemented
- Reviewer -> Implementer: for fixes and rework
- Implementer -> Tester: once integration points are ready
- Tester -> Planner: for re-planning if failures occur
- Researcher/Domain Specialist -> Planner: when new domain findings change requirements
Context, memory, and source-of-truth rules:
- All decisions stored in a central knowledge base with version history
- Context is limited to the current sprint/workflow; memory is periodically archived
- Source-of-truth includes design docs, unit tests, integration tests, run logs, and code in the repository
Tool access and permission rules:
- Tools: git, curl, http, shell, and mock services for testing
- Secrets: stored in a dedicated secrets manager; access granted by the orchestrator
- Production systems: only through approved CI/CD pipelines; audit trails required
Architecture rules:
- Microservices and adapters to ensure consistent interfaces
- Use a planner-driven, event-driven orchestration pattern
- Each agent acts with least privilege
File structure rules:
- README.md at root
- /ai-skills/agents-md-templates/ with per-agent responsibilities
- /workflows/ with AGENTS.md templates per workflow
- /docs/ with governance and decision logs
Data, API, or integration rules when relevant:
- Clear API contracts, input/output schemas, and versioning
- Idempotent operations and retry guards
Validation rules:
- Acceptance tests, contract tests, and end-to-end validations
- Data shape and schema conformance checks
Security rules:
- Secrets never logged
- Role-based access control enforced
- Production changes require approval gates
Testing rules:
- Unit tests for each agent, integration tests for handoffs
- End-to-end workflow tests simulating real scenarios
Deployment rules:
- CI/CD pipelines must run on every change
- Rollback procedures defined for failed deployments
Human review and escalation rules:
- Any failure beyond retry threshold requires human review
- Escalation path documented to engineering lead
Failure handling and rollback rules:
- Safe rollback to last known good state
- Explicit retry budget and timeouts
Things Agents must not do:
- Do not bypass approvals
- Do not leak secrets
- Do not drift away from the source of truth
Overview
This AGENTS.md template codifies a repeatable operating manual for distributed systems design teams that use AI coding agents. It governs a workflow that can scale from single-agent experiments to full multi-agent orchestration across the design, implementation, validation, and deployment stages.
The AGENTS.md template is a concrete, copyable guide that encodes roles, handoffs, memory rules, tool governance, and escalation paths. It enables both individual agents and orchestrated teams to operate with explicit boundaries, auditable decisions, and safe production behavior.
When to Use This AGENTS.md Template
- When you need a repeatable, auditable operating context for a distributed systems design project using AI agents.
- When coordinating multiple specialized agents (planner, implementer, reviewer, tester, researcher, domain expert) with clear handoffs.
- When tool governance, secrets handling, and production risk must be codified.
- When you require human review and escalation paths for high-risk decisions.
Recommended Agent Operating Model
Roles and decision boundaries: The Planner designs the task graph and acceptance criteria for distributed systems components. The Implementer translates the plan into code and integrations. The Reviewer validates the architecture against correctness and safety constraints. The Tester validates end-to-end behavior. The Researcher and Domain Specialist supply external knowledge and regulatory or domain constraints. The Orchestrator coordinates handoffs, enforces memory and source-of-truth rules, and mediates escalation.
Escalation and autonomy: Agents operate autonomously within their domain, but escalate to human review when risk thresholds or failure budgets are exceeded. Handoff boundaries ensure each phase produces a bounded, testable artifact before the next phase begins.
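The escalation rule above can be sketched as a retry wrapper that gives up and asks for a human once its budget is spent. This is a minimal sketch, not a prescribed implementation: `RetryBudget`, `EscalateToHuman`, `run_with_budget`, and the default limits are hypothetical names and values chosen for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class EscalateToHuman(Exception):
    """Hypothetical signal that a human reviewer must take over."""


@dataclass(frozen=True)
class RetryBudget:
    max_attempts: int = 3           # assumed default; tune per workflow
    timeout_seconds: float = 30.0   # wall-clock limit checked before each attempt
    backoff_seconds: float = 0.0    # pause between attempts


def run_with_budget(task: Callable[[], T], budget: RetryBudget) -> T:
    """Run task until it succeeds or the budget is spent, then escalate."""
    deadline = time.monotonic() + budget.timeout_seconds
    last_error: Optional[Exception] = None
    for attempt in range(1, budget.max_attempts + 1):
        if time.monotonic() > deadline:
            break  # out of time: stop retrying and escalate
        try:
            return task()
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc
            if attempt < budget.max_attempts:
                time.sleep(budget.backoff_seconds)
    raise EscalateToHuman(f"retry budget exhausted: {last_error!r}") from last_error
```

An orchestrator could catch `EscalateToHuman` at the workflow boundary and open a review request instead of retrying further. Note that the timeout here is only checked between attempts; it does not interrupt a task that is already running.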
Recommended Project Structure
/ai-project/
  /ai-skills/agents-md-templates/
    planner.md
    implementer.md
    reviewer.md
    tester.md
    researcher.md
    domain-specialist.md
    orchestrator.md
  /workflows/
    distributed-systems-design/
      AGENTS.md
  README.md
  /docs/
    governance.md
    decision-logs/
  /services/
    gateway/
    orchestrator/
  /tests/
    unit/
    integration/
  /docs-artifacts/
    knowledge-base/
Core Operating Principles
- Single source of truth for decisions and memory
- Clear, testable handoffs between agents
- Least-privilege access to tools and production systems
- Idempotent, auditable operations
- Explicit escalation when risk exceeds thresholds
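To make "idempotent, auditable operations" concrete, the sketch below runs each operation at most once per idempotency key and appends one audit entry per call. It assumes an in-memory store; `AuditedExecutor` and its field names are hypothetical, and a real deployment would presumably back both the results and the log with durable, versioned storage.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class AuditedExecutor:
    """Runs an operation once per idempotency key and records every call."""

    results: dict[str, Any] = field(default_factory=dict)
    audit_log: list[str] = field(default_factory=list)

    def execute(self, agent: str, key: str, operation: Callable[[], Any]) -> Any:
        replayed = key in self.results
        if not replayed:
            # First time this key is seen: perform the side effect.
            self.results[key] = operation()
        # Every call is logged, including replays, so retries stay visible.
        self.audit_log.append(
            json.dumps({"agent": agent, "key": key, "replayed": replayed})
        )
        return self.results[key]
```

A retried call with the same key returns the stored result instead of repeating the side effect, which is what lets the orchestrator trigger retries safely.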
Agent Handoff and Collaboration Rules
- Planner defines tasks with acceptance criteria and success metrics before Implementer starts
- Implementer seeks Reviewer validation before moving to Tester
- Researcher/Domain Specialist can trigger Planner re-scoping if domain findings alter requirements
- Orchestrator enforces memory hygiene and source-of-truth updates during every handoff
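One way an orchestrator might enforce these rules is a small transition table checked at every handoff. The allowed pairs below are taken from the handoff rules in the template; the `Handoff` type, the lowercase role identifiers, and `validate_handoff` are hypothetical names for this sketch.

```python
from dataclasses import dataclass

# Allowed (source, target) pairs, mirroring the template's handoff rules.
ALLOWED_HANDOFFS = {
    ("planner", "implementer"),
    ("implementer", "reviewer"),
    ("reviewer", "implementer"),
    ("implementer", "tester"),
    ("tester", "planner"),
    ("researcher", "planner"),
    ("domain-specialist", "planner"),
}


@dataclass(frozen=True)
class Handoff:
    source: str
    target: str
    artifact: str                             # what is being handed over
    acceptance_criteria: tuple[str, ...] = ()


def validate_handoff(handoff: Handoff) -> list[str]:
    """Return a list of problems; an empty list means the handoff may proceed."""
    problems = []
    if (handoff.source, handoff.target) not in ALLOWED_HANDOFFS:
        problems.append(f"{handoff.source} -> {handoff.target} is not an allowed handoff")
    if not handoff.artifact:
        problems.append("handoff must name an artifact")
    if handoff.source == "planner" and not handoff.acceptance_criteria:
        problems.append("planner handoff is missing acceptance criteria")
    return problems
```

Returning a list of problems rather than raising on the first one lets the orchestrator log every violation in a single decision-log entry.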
Tool Governance and Permission Rules
- Only approved tools may be invoked; all actions must be logged
- Secrets must never be logged or exposed in outputs
- Production deployments require CI/CD gate approvals
- Access is granted per-role with least privilege
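A per-role allowlist with logging is one simple way to express these rules. The tool names come from the template; the role-to-tool mapping itself is an assumption made for illustration, as are `ROLE_TOOLS`, `ToolNotPermitted`, and `authorize`.

```python
import logging

logger = logging.getLogger("tool-governance")

# Hypothetical mapping: each role gets only the tools it needs.
ROLE_TOOLS = {
    "planner": frozenset(),
    "implementer": frozenset({"git", "shell"}),
    "tester": frozenset({"shell", "curl", "mock-service"}),
    "researcher": frozenset({"curl"}),
}


class ToolNotPermitted(PermissionError):
    """Raised when a role tries to invoke a tool outside its allowlist."""


def authorize(role: str, tool: str) -> None:
    """Log the request, then raise unless the role may use the tool."""
    allowed = tool in ROLE_TOOLS.get(role, frozenset())
    # Log both grants and denials so every tool request is auditable.
    logger.info("role=%s tool=%s allowed=%s", role, tool, allowed)
    if not allowed:
        raise ToolNotPermitted(f"{role} may not invoke {tool}")
```

Unknown roles fall through to an empty allowlist, so the default is deny.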
Code Construction Rules
- Follow typed interfaces and contract-first design
- All integration points have tests and versioned APIs
- Code changes are validated by automated tests before handoffs
- Do not introduce global mutable state without justification
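As an illustration of contract-first design with typed interfaces, the sketch below defines versioned request and result types before any agent logic exists, then checks a stub against them. All names (`PlanRequest`, `BuildResult`, `Implementer`, `StubImplementer`, `ready_for_handoff`) and the version string are hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class PlanRequest:
    schema_version: str
    task_id: str
    acceptance_criteria: tuple[str, ...]


@dataclass(frozen=True)
class BuildResult:
    schema_version: str
    task_id: str
    tests_passed: bool


class Implementer(Protocol):
    """The contract: any implementer agent must provide this method."""

    def build(self, request: PlanRequest) -> BuildResult: ...


class StubImplementer:
    """Stand-in implementation used to exercise the contract."""

    def build(self, request: PlanRequest) -> BuildResult:
        return BuildResult(request.schema_version, request.task_id, tests_passed=True)


def ready_for_handoff(implementer: Implementer, request: PlanRequest) -> bool:
    """A build may be handed off only if tests passed and versions match."""
    result = implementer.build(request)
    return result.tests_passed and result.schema_version == request.schema_version
```

Frozen dataclasses keep the exchanged messages immutable, which fits the rule against unjustified mutable state.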
Security and Production Rules
- Secrets never appear in logs, outputs, or artifacts
- Production changes require explicit approval gates and review
- Audit trails are maintained for all production access
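Keeping secrets out of logs can be partly automated with a logging filter. The sketch below uses Python's standard `logging.Filter`; the `RedactSecrets` class and the regular expression are assumptions for illustration, and the pattern is a heuristic that will not catch every secret format.

```python
import logging
import re

# Heuristic pattern (an assumption): key names followed by "=" or ":" and a value.
SECRET_PATTERN = re.compile(r"(?i)\b(password|token|api[_-]?key|secret)\s*[=:]\s*\S+")


class RedactSecrets(logging.Filter):
    """Replaces secret-looking values in log messages before they are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Format the message first so secrets passed as arguments are covered too.
        message = record.getMessage()
        record.msg = SECRET_PATTERN.sub(
            lambda match: f"{match.group(1)}=[REDACTED]", message
        )
        record.args = ()
        return True  # keep the record, now redacted
```

Attaching the filter to a handler (`handler.addFilter(RedactSecrets())`) applies it to every record that handler emits. Pattern-based redaction is a backstop, not a substitute for keeping secrets out of agent outputs in the first place.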
Testing Checklist
- Unit tests for each agent
- Integration tests for handoffs
- End-to-end workflow tests simulating real-world scenarios
- Regression tests after each change
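An integration test for a handoff chain might look like the sketch below, written with the standard `unittest` module. The `plan`, `implement`, and `review` functions are hypothetical stubs standing in for real agents; only the shape of the test is the point.

```python
import unittest


def plan(goal: str) -> dict:
    """Stub planner: emits a task with acceptance criteria."""
    return {"goal": goal, "acceptance_criteria": ["responds within budget"]}


def implement(task: dict) -> dict:
    """Stub implementer: wraps the task in a build artifact."""
    return {"task": task, "artifact": f"service for {task['goal']}"}


def review(build: dict) -> dict:
    """Stub reviewer: approves only builds whose task has acceptance criteria."""
    approved = bool(build["task"]["acceptance_criteria"])
    return {**build, "approved": approved}


class HandoffIntegrationTest(unittest.TestCase):
    def test_planner_to_implementer_to_reviewer(self):
        reviewed = review(implement(plan("rate limiter")))
        self.assertTrue(reviewed["approved"])
        self.assertEqual(reviewed["artifact"], "service for rate limiter")

    def test_reviewer_rejects_task_without_criteria(self):
        build = implement({"goal": "cache", "acceptance_criteria": []})
        self.assertFalse(review(build)["approved"])
```

Saved in a test module, this can be run with `python -m unittest`.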
Common Mistakes to Avoid
- Skipping explicit handoff criteria
- Over-privileging tools or production access
- Bypassing the source of truth or memory rules
- Unbounded automation without escalation paths
FAQ
What is the purpose of this AGENTS.md template?
It provides a copyable, project-level operating manual for distributed systems design teams using AI coding agents, detailing roles, handoffs, and governance.
Who are the typical agents in this workflow?
Planner, Implementer, Reviewer, Tester, Researcher, Domain Specialist, and an Orchestrator that coordinates collaboration and memory.
How are handoffs between agents enforced?
Handoffs are governed by the Orchestrator with explicit criteria, state validation, and memory updates between stages.
What about security and production safety?
Secrets are never logged, access is least-privilege, and production deployments go through approved CI/CD gates with audit trails.
How is memory and source of truth managed?
Decisions are stored in a central knowledge base with version history; context is scoped to the current workflow and archived periodically.
What should I do if a workflow fails?
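Retries run within an explicit retry budget and timeouts. Any failure beyond the retry threshold escalates to human review along the documented path to the engineering lead, and a failed deployment rolls back to the last known good state.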