AGENTS.md Template: Cloud Native System Design
An AGENTS.md template for cloud-native system design: a copyable operating context for single-agent and multi-agent workflows.
Target User
Developers, platform teams, engineering leadership
Use Cases
- Defining cloud-native design workflow
- Coordinating IaC and Kubernetes deployments
- Governance for tool usage
- Handoff rules between agents
Markdown Template
# AGENTS.md
Project Role: Cloud-native Systems Designer (Lead Architect), Platform Engineer, DevOps Engineer, Security Engineer, Observability Engineer.
Agent roster and responsibilities:
- Planner: defines cloud-native design goals, constraints, and delivery plan for the architecture and IaC.
- Implementer: builds infrastructure-as-code, deployment scripts, and configuration for cloud-native services.
- Reviewer: validates architecture decisions, security controls, and performance expectations.
- Researcher: collects patterns, cloud-native design practices, and relevant RFCs or runbooks.
- Orchestrator: coordinates planning, execution, memory, and handoffs; enforces rules and maintains the single source of truth.
Supervisor or orchestrator behavior:
The Orchestrator monitors task progress, sequences steps, and ensures each agent completes its artifact before handing off to the next. It centralizes memory of decisions, sources, and constraints, and triggers escalations when gaps appear or SLAs are missed.
Handoff rules between agents:
When a task completes, the responsible agent writes a handoff note to memory, attaches artifacts, updates the design doc, and notifies the next agent. The orchestrator preserves provenance and ensures the next agent has the required context and access.
Context, memory, and source-of-truth rules:
Context lives in the project design doc and architecture artifacts stored in the repo. Memory is scoped to the current project and persisted in a memory store. Source of truth includes the design doc, infrastructure repo, and artifact store (for example, a versioned object store).
Tool access and permission rules:
Access to cloud tooling (cloud CLI, kubectl, terraform/pulumi, git, secret stores) must follow least-privilege and approval gates. Secrets must never be hard-coded; use a vault or secret manager. Credentials are ephemeral and rotated regularly.
Architecture rules:
Apply cloud-native design patterns: microservices with service mesh, event-driven communication, API gateway, per-service data stores, and observability by default. Ensure idempotent operations and well-defined contracts.
File structure rules:
Maintain a lean, navigable project tree focused on cloud-native design.
infrastructure/
  kubernetes/
    manifests/
    helm-charts/
  terraform/
    modules/
    networks/
    security/
  observability/
apps/
  core-service/
  identity-service/
  data-service/
ops/
  runbooks/
docs/
Data, API, or integration rules when relevant:
Define data contracts, response formats, and integration points; use versioned APIs; surface changes through a contract-first approach.
Validation rules:
Require architecture review sign-off, IaC linting, security scans, and performance checks before deployment.
Security rules:
Enforce TLS, mTLS where appropriate, strict IAM roles, encryption at rest and in transit, and secret management with rotation.
Testing rules:
Unit tests for configs, integration tests against mock services, and deployment verification post-change.
Deployment rules:
Prefer canary or blue-green deployments; require health checks and rollback plans.
Human review and escalation rules:
Escalate design or security concerns to the Architecture Review Board or Security Lead. In production, require peer review for any breaking changes.
Failure handling and rollback rules:
On failure, revert to the last known-good state, quarantine failed components, and preserve logs for audit.
Things Agents must not do:
Do not bypass approvals, bypass architecture reviews, store secrets in code, or modify production systems without proper canary deployment and rollback.
Overview
This AGENTS.md template codifies a cloud-native system design workflow with multi-agent orchestration, defining roles, handoffs, and governance for both single-agent and collaborative design efforts.
This page provides a complete, copyable AGENTS.md template block and a project-level operating context for AI coding agents working on cloud-native system design. It covers agent roles, supervisor behavior, handoff rules, context and memory, tool governance, architecture constraints, file structure, data and API rules, validation, security, testing, deployment, human review, escalation, and failure handling. The goal is consistent, auditable behavior across single-agent and multi-agent orchestration patterns.
When to Use This AGENTS.md Template
- When designing cloud-native architectures that involve multiple roles (Planner, Implementer, Reviewer, Researcher, Orchestrator) and cross-team collaboration.
- When you need a reproducible project-level operating context for AI coding agents performing design, IaC, deployment, and testing.
- When explicit agent handoffs and governance are required for tool usage, secrets, and production changes.
- When human review and escalation are part of the workflow to ensure security and reliability.
Recommended Agent Operating Model
In cloud-native system design, the team uses a multi-agent operating model with clear roles. Decision boundaries: the Planner sets goals; the Implementer decides tasks; the Reviewer approves; the Orchestrator enforces constraints; the Researcher surfaces patterns. Escalation: if a design is risky or uncertain, escalate to human review or the Architecture Review Board.
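The decision boundaries above can be sketched as a small routing table. This is a minimal illustration, not part of the template: the decision labels, the `route_decision` function, and the choice to send unknown decisions to human review are all assumptions made for the example.

```python
from enum import Enum


class Role(Enum):
    PLANNER = "planner"
    IMPLEMENTER = "implementer"
    REVIEWER = "reviewer"
    RESEARCHER = "researcher"
    ORCHESTRATOR = "orchestrator"


# Hypothetical decision labels mapped to the role that owns them.
DECISION_OWNERS = {
    "set_goals": Role.PLANNER,
    "choose_tasks": Role.IMPLEMENTER,
    "approve_change": Role.REVIEWER,
    "enforce_constraints": Role.ORCHESTRATOR,
    "surface_patterns": Role.RESEARCHER,
}


def route_decision(decision: str, risky: bool = False, uncertain: bool = False) -> str:
    """Return who should decide, escalating risky or uncertain designs."""
    # Assumption: a decision with no listed owner is also escalated.
    if risky or uncertain or decision not in DECISION_OWNERS:
        return "human_review"
    return DECISION_OWNERS[decision].value
```

For example, `route_decision("approve_change", risky=True)` routes to human review instead of the Reviewer.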
Recommended Project Structure
Workflow-specific directory tree:
infrastructure/
  kubernetes/
    manifests/
    helm-charts/
  terraform/
    modules/
    networks/
    security/
  observability/
apps/
  core-service/
  identity-service/
  data-service/
ops/
  runbooks/
docs/
Core Operating Principles
- Single source of truth for design decisions.
- Least-privilege access to all tools and secrets.
- All changes are auditable with memory and provenance.
- Idempotent, repeatable deployments and rollbacks.
- Explicit human review for production changes.
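To make the idempotency principle concrete, here is a minimal sketch of a reconcile step that converges actual state to desired state and reports whether anything changed. The `reconcile` function and the dictionary-based state are assumptions for illustration; real IaC tools track state in their own formats.

```python
from typing import Dict, Tuple

State = Dict[str, str]


def reconcile(desired: State, actual: State) -> Tuple[State, bool]:
    """Converge actual state to desired state.

    Returns the resulting state and a flag that is True only when a
    change was needed, so repeated runs are no-ops.
    """
    if actual == desired:
        return actual, False
    # Copy so later edits to `desired` do not alter the recorded state.
    return dict(desired), True


desired = {"image": "core-service:1.2.0", "replicas": "3"}
state, changed_first = reconcile(desired, {})
state, changed_second = reconcile(desired, state)
```

Running the step twice with the same input changes state only on the first run, which is the property that makes rollbacks and retries safe to repeat.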
Agent Handoff and Collaboration Rules
- Planner to Implementer: provide design goals, constraints, and acceptance criteria.
- Implementer to Reviewer: deliver IaC artifacts, deployment plans, and compliance checks.
- Researcher to Planner: surface patterns, references, and risk signals.
- Orchestrator to all agents: coordinate execution, enforce constraints, record decisions.
- If any handoff cannot complete within SLA, escalate to Human Review.
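The handoff rules above could be modeled roughly as follows. This is a sketch under stated assumptions: the `HandoffNote` fields, the `next_step` function, and the returned action strings are hypothetical, and the SLA duration is whatever a team defines.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class HandoffNote:
    from_agent: str
    to_agent: str
    summary: str
    artifacts: List[str] = field(default_factory=list)
    started_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def next_step(note: HandoffNote, sla: timedelta, now: Optional[datetime] = None) -> str:
    """Decide what the Orchestrator does with a handoff."""
    now = now or datetime.now(timezone.utc)
    # An overdue handoff escalates regardless of its contents.
    if now - note.started_at > sla:
        return "escalate_to_human_review"
    # Assumption: a handoff without artifacts is held back, not forwarded.
    if not note.artifacts:
        return "blocked_missing_artifacts"
    return f"notify_{note.to_agent}"
```

Passing `now` explicitly keeps the check deterministic in tests; in normal use it defaults to the current UTC time.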
Tool Governance and Permission Rules
- Use ephemeral credentials; do not store secrets in code.
- Manage secrets through a vault; rotate keys on a defined cadence.
- Approvals are required for production changes; automated checks must pass before deploy.
- No destructive actions without canary deployments and rollback plans.
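One way to enforce these governance rules is a pre-deploy gate that lists every violated rule. The sketch below is illustrative only: `ChangeRequest`, its fields, and `gate_violations` are hypothetical names, not part of any real deployment tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChangeRequest:
    environment: str
    approved_by: List[str]
    checks_passed: bool
    has_rollback_plan: bool
    uses_canary: bool
    destructive: bool = False


def gate_violations(change: ChangeRequest) -> List[str]:
    """Return the governance rules a change violates; empty means it may proceed."""
    problems = []
    if not change.checks_passed:
        problems.append("automated checks have not passed")
    if change.environment == "production" and not change.approved_by:
        problems.append("production change has no approval")
    if change.destructive and not (change.uses_canary and change.has_rollback_plan):
        problems.append("destructive action lacks canary or rollback plan")
    return problems
```

Returning all violations at once, not just the first, gives the Reviewer a complete picture in a single pass.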
Code Construction Rules
- IaC must be versioned and linted; scripts idempotent.
- APIs follow contract-first design; use schema validation.
- Configs must be environment-specific but share common modular patterns.
- All changes require traceable commits with descriptive messages.
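As a rough illustration of contract-first design with schema validation, the sketch below checks a payload against a declared contract. The `ORDER_CONTRACT_V1` contract and `contract_errors` function are hypothetical, and this simplified type check is only a stand-in for a full schema validator such as one based on JSON Schema.

```python
from typing import Any, Dict, List

# Hypothetical v1 contract: field name -> required Python type.
ORDER_CONTRACT_V1: Dict[str, type] = {"id": str, "status": str, "total_cents": int}


def contract_errors(payload: Dict[str, Any], contract: Dict[str, type]) -> List[str]:
    """Return every way a payload deviates from its contract."""
    errors = []
    for name, expected in contract.items():
        if name not in payload:
            errors.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected):
            actual = type(payload[name]).__name__
            errors.append(f"{name}: expected {expected.__name__}, got {actual}")
    # Fields outside the contract are reported so changes go through a new version.
    for name in payload:
        if name not in contract:
            errors.append(f"unexpected field: {name}")
    return errors
```

Rejecting unexpected fields is a strict choice made for this example; some teams prefer to tolerate unknown fields for forward compatibility.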
Security and Production Rules
- Enforce TLS everywhere; enable mTLS for service-to-service calls.
- Use RBAC with least privilege; review roles frequently.
- Encrypt data at rest and in transit; rotate encryption keys.
- Secrets stored securely; no plaintext in repos or logs.
- Production changes require pre-defined metrics and canary checks.
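The last rule, pre-defined metrics and canary checks, might look like the following in its simplest form. The metric names and limits in `CANARY_LIMITS` are invented for illustration, as is the `canary_decision` function; real thresholds come from a team's own SLOs.

```python
from typing import Dict

# Hypothetical limits agreed before the rollout starts.
CANARY_LIMITS: Dict[str, float] = {"error_rate": 0.01, "p99_latency_ms": 500.0}


def canary_decision(observed: Dict[str, float], limits: Dict[str, float] = CANARY_LIMITS) -> str:
    """Return 'promote' only if every pre-defined metric is within its limit."""
    for metric, limit in limits.items():
        value = observed.get(metric)
        # A missing metric is treated as a failure: no data, no promotion.
        if value is None or value > limit:
            return "rollback"
    return "promote"
```

Treating a missing metric as a rollback signal is a conservative assumption that keeps an unobserved canary from being promoted.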
Testing Checklist
- Unit tests for configuration and scripts.
- Integration tests against mock services or staging environments.
- End-to-end tests for deployment pipelines.
- Canaries and health checks post-deploy.
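A unit test for configuration, the first item in the checklist, can be as small as the sketch below. The rules inside `config_problems` (a minimum of two replicas, a required health check path, TLS enabled) are example policies chosen for illustration, not requirements stated by the template.

```python
import unittest
from typing import Any, Dict, List


def config_problems(config: Dict[str, Any]) -> List[str]:
    """Check a service config against a few example policies."""
    problems = []
    if config.get("replicas", 0) < 2:
        problems.append("replicas must be at least 2")
    if not config.get("health_check_path"):
        problems.append("health_check_path is required")
    if not config.get("tls_enabled", False):
        problems.append("tls_enabled must be true")
    return problems


class ConfigTests(unittest.TestCase):
    def test_valid_config_has_no_problems(self):
        config = {"replicas": 3, "health_check_path": "/healthz", "tls_enabled": True}
        self.assertEqual(config_problems(config), [])

    def test_missing_health_check_is_reported(self):
        config = {"replicas": 3, "tls_enabled": True}
        self.assertEqual(config_problems(config), ["health_check_path is required"])
```

Saved to a file, these tests can be run with `python -m unittest` as part of the pipeline before any integration or end-to-end stage.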
Common Mistakes to Avoid
- Skipping architecture reviews or security scans.
- Bypassing the memory/source-of-truth rules or handoff discipline.
- Storing secrets in code or logs.
- Ignoring deployment rollback plans.
FAQ
What is the purpose of this AGENTS.md Template for cloud native system design?
It provides a copyable operating context for cloud native architectures, enabling AI coding agents to coordinate in a multi-agent workflow with clear roles, handoffs, and governance.
Who should use this template and how does it support multi-agent orchestration?
Engineers and product teams can paste this block into their AGENTS.md. It defines an agent roster, responsibilities, and a supervisor that orchestrates tasks across planner, implementer, reviewer, and researcher roles.
How are handoffs managed between agents?
Handoffs are triggered by completion signals, documented in memory, and include transfer of context, artifacts, and sources of truth to the next agent. The orchestrator ensures timely progression.
What tool governance and security rules are enforced?
Access to cloud tooling is governed, secrets are not stored in code, and approvals are required before production changes. Credentials are rotated and audited.
What should be validated before deployment?
Before deployment, run design validation, IaC linting, security scanning, and policy checks. After deployment, verification checks confirm reliability and compliance.