Azure Production System Design AGENTS.md Template
AGENTS.md Template for Azure production system design. It tells AI coding agents how to work with Azure infrastructure and sets the rules for multi-agent orchestration, handoffs, and governance.
Target User
Developers, platform engineers, SREs, and engineering leaders building Azure-based production systems with AI agents
Use Cases
- Azure production system design
- multi-agent orchestration
- agent handoffs and escalation
- tool governance and secrets management
- cloud infrastructure automation
Markdown Template
# AGENTS.md
Project: Azure Production System Design
Project role:
- Architect: defines the Azure target state and verifies alignment with business outcomes
- Planner: builds an actionable plan for Azure resources and CI/CD pipelines
- Implementer: translates plans into Azure resources and automation scripts
- Reviewer: validates design, security posture, and compliance
- Researcher: identifies constraints, dependencies, and alternative patterns
- Domain Specialist: provides Azure-specific domain knowledge (Networking, Identity, Security)
Supervisor / Orchestrator:
- Orchestrator coordinates prompts, enforces handoffs, records decisions to the source of truth, and triggers validation gates.
Handoff rules:
- Planner ➜ Implementer on plan acceptance
- Implementer ➜ Reviewer upon completion of implementation
- Reviewer ➜ Implementer for fixes, then back to Reviewer until sign-off
- Any critical risk triggers escalation to human review and pause in production changes
Context, memory, and source of truth:
- All decisions, designs, and artifacts are stored in the repository as the single source of truth
- Memory includes current Azure resource inventory, desired state, and policy constraints
- References to external docs must be linked to internal knowledge bases
Tool access and permission rules:
- Access to Azure CLI, ARM/Bicep templates, Terraform where allowed, and Azure REST APIs is restricted by role-based access control
- Secrets must only be retrieved from Azure Key Vault and never logged
Architecture rules:
- All Azure resources must be described in a formal architecture diagram and ARM/Bicep templates where applicable
- Use established resource naming conventions and tagging policies
File structure rules:
- Keep architecture-specific files under infrastructure/azure and docs/architecture
- Do not place production secrets in code or docs
Data, API, or integration rules:
- Use managed identities and secure endpoints
- Encrypt data in transit and at rest where applicable
Validation rules:
- Designs are validated by the Planner and confirmed by the Reviewer
- Automate checks for drift, policy violations, and security controls
Security rules:
- Secrets must be stored in Key Vault
- Production changes require approval gates
- Least privilege for all agents
Testing rules:
- Unit tests for scripts, integration tests for deployment workflows, end-to-end tests for recovery scenarios
Deployment rules:
- Use CI/CD gates for deployment to Azure environments
- Rollback procedures must exist and be tested
Human review and escalation rules:
- Trigger human review on high-risk changes or failed validations
- Escalations must include rationale and next steps
Failure handling and rollback rules:
- Restore to last known good configuration on failure
- Log all failures with context for post-mortem
Things Agents must not do:
- Do not bypass approval gates
- Do not reveal secrets in logs
- Do not modify production state without consensus and rollback plan
Overview
This AGENTS.md Template defines a project-level operating manual for Azure production system design using AI coding agents. It governs both single-agent workflows and multi-agent orchestration across provisioning, deployment, monitoring, and governance in Azure, with explicit handoffs, context storage, and tool governance.
Direct answer: Use this AGENTS.md Template to establish a repeatable, auditable working contract for Azure production systems, ensuring safe collaboration among planners, implementers, reviewers, researchers, and domain specialists.
When to Use This AGENTS.md Template
- You are designing or maintaining Azure production environments that require repeatable agent-driven workflows.
- You need explicit handoffs and escalation paths between planners, implementers, and reviewers.
- You must enforce tool governance, secret management, and secure deployment practices in the cloud.
- You want a single source of truth for architecture decisions, validation criteria, and rollback procedures.
Recommended Agent Operating Model
Roles and boundaries for Azure production system design:
- Planner: defines the target Azure design, dependencies, and the sequence of actions; decides handoff points.
- Implementer: provisions resources, writes automation scripts, applies templates, and ensures alignment with the plan.
- Reviewer: performs architecture, security, and compliance reviews; validates against policies and risk thresholds.
- Researcher: probes Azure constraints, costs, and alternate patterns; surfaces trade-offs for decision-makers.
- Domain Specialist: provides Azure-specific expertise (Networking, Identity, Security, Governance) and validates domain requirements.
- Supervisor/Orchestrator: enforces process, coordinates handoffs, curates memory, and enacts validation gates.
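The role boundaries above can be made machine-checkable with a deny-by-default permission map. The following is a minimal sketch, not part of the template itself: the action names (`write_plan`, `apply_template`, and so on) are hypothetical, since the template does not define a permission vocabulary.

```python
from enum import Enum


class Role(Enum):
    PLANNER = "planner"
    IMPLEMENTER = "implementer"
    REVIEWER = "reviewer"
    RESEARCHER = "researcher"
    DOMAIN_SPECIALIST = "domain_specialist"


# Hypothetical action names, chosen only to illustrate least privilege.
ALLOWED_ACTIONS: dict[Role, frozenset[str]] = {
    Role.PLANNER: frozenset({"read_inventory", "write_plan"}),
    Role.IMPLEMENTER: frozenset({"read_plan", "apply_template", "run_script"}),
    Role.REVIEWER: frozenset({"read_plan", "read_artifacts", "approve", "reject"}),
    Role.RESEARCHER: frozenset({"read_inventory", "read_pricing"}),
    Role.DOMAIN_SPECIALIST: frozenset({"read_plan", "read_artifacts", "comment"}),
}


def is_allowed(role: Role, action: str) -> bool:
    """Deny by default: an action is allowed only if it is listed for the role."""
    return action in ALLOWED_ACTIONS.get(role, frozenset())
```

An orchestrator could call `is_allowed` before dispatching any tool call, so that, for example, a Planner can never apply a template directly.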
Recommended Project Structure
azure-prod-design/
├── agents/
│   ├── planner.md
│   ├── implementer.md
│   ├── reviewer.md
│   ├── researcher.md
│   └── domain-specialist.md
├── infrastructure/
│   └── azure/
│       ├── templates/
│       │   ├── main.bicep
│       │   └── main.parameters.json
│       └── scripts/
├── configs/
├── policies/
├── docs/
│   └── architecture-diagrams/
├── tests/
│   └── e2e/
└── .github/
    └── workflows/
Core Operating Principles
- Single source of truth for all Azure designs and agent decisions.
- Deterministic outputs: the same plan applied to the same inputs yields the same result.
- Security by design: least privilege, secrets vaulting, and auditable changes.
- Clear ownership and escalation paths for all actions.
- Explicit handoffs with validation gates before progression.
Agent Handoff and Collaboration Rules
Rules by role for Azure workflows:
- Planner → Implementer: hand off after plan approval with a unique plan identifier and resource graph.
- Implementer → Reviewer: hand off on completion of resource provisioning and script execution; include logs and artifacts.
- Reviewer → Implementer: require fixes and re-run validations if any security or architecture issues are found.
- Researcher → Planner: when new constraints or costs surface, update the plan and reconsider trade-offs.
- Domain Specialist → all: provide domain-specific reviews before production deployment.
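One way an orchestrator might enforce these handoffs is to validate each one against an explicit list of allowed transitions. This is an illustrative sketch under assumptions: the `Handoff` record and its field names are hypothetical, and the "Domain Specialist → all" review rule is left out for brevity.

```python
from dataclasses import dataclass, field

# Allowed (sender, receiver) pairs, taken from the handoff rules above.
ALLOWED_HANDOFFS = {
    ("planner", "implementer"),
    ("implementer", "reviewer"),
    ("reviewer", "implementer"),
    ("researcher", "planner"),
}


@dataclass
class Handoff:
    sender: str
    receiver: str
    plan_id: str  # unique plan identifier required by the rules
    artifacts: list[str] = field(default_factory=list)


def validate_handoff(h: Handoff) -> list[str]:
    """Return a list of problems; an empty list means the handoff may proceed."""
    problems = []
    if (h.sender, h.receiver) not in ALLOWED_HANDOFFS:
        problems.append(f"handoff {h.sender} -> {h.receiver} is not permitted")
    if not h.plan_id:
        problems.append("missing plan identifier")
    if h.sender == "implementer" and not h.artifacts:
        problems.append("implementer handoffs must include logs and artifacts")
    return problems
```

Returning the full list of problems, rather than failing on the first, gives the sending agent everything it needs to correct the handoff in one pass.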
Tool Governance and Permission Rules
- Azure CLI and ARM/Bicep usage must reference approved templates and parameter sets.
- All secrets must be retrieved from Azure Key Vault; never logged or exposed in output.
- Production changes require an approval gate and a rollback plan.
- Access to production environments is controlled by the orchestrator and RBAC roles.
- All API calls to Azure resources must be audited and traceable.
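The "never logged" rule is easier to keep when it is enforced in the logging layer rather than left to each script. Below is a minimal sketch using only the Python standard library; the class name `SecretRedactingFilter` and its `register` method are hypothetical, and the filter only catches exact matches of values registered with it, so it complements rather than replaces careful handling.

```python
import logging


class SecretRedactingFilter(logging.Filter):
    """Replaces known secret values in log messages before they are emitted."""

    def __init__(self) -> None:
        super().__init__()
        self._secrets: set[str] = set()

    def register(self, value: str) -> None:
        # Call this right after a secret is fetched, before it is used anywhere.
        if value:
            self._secrets.add(value)

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the final message first so secrets passed as %-args are caught too.
        message = record.getMessage()
        for secret in self._secrets:
            message = message.replace(secret, "[REDACTED]")
        record.msg, record.args = message, None
        return True  # keep the record, now carrying the redacted text
```

Attach the filter to every handler (for example `handler.addFilter(redactor)`) and register each secret value as soon as it is retrieved from Key Vault.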
Code Construction Rules
- Templates must be idempotent and support drift detection.
- All changes must be versioned and described in PR messages tied to the AGENTS.md template.
- Environment-specific values must be parameterized and stored in secure configurations.
- Scripts must be deterministic and include validation hooks after execution.
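Drift detection comes down to comparing the desired state in the repository with what is actually deployed. As a rough sketch, assuming both sides have already been flattened into simple property maps (how they are collected is out of scope here, and the function name is hypothetical):

```python
def detect_drift(desired: dict, actual: dict) -> dict[str, tuple]:
    """Compare two flat property maps and return {key: (desired, actual)} for mismatches.

    A key present on only one side is reported with None on the other side.
    """
    drift = {}
    for key in sorted(desired.keys() | actual.keys()):
        want, have = desired.get(key), actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift
```

An empty result means no drift was found. For Bicep or ARM deployments, the Azure CLI's `az deployment group what-if` gives a comparable preview of changes at the resource level and is likely a better fit for real pipelines than a hand-rolled comparison.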
Security and Production Rules
- Use managed identities for resource access; never embed credentials.
- Enforce network security groups, firewall rules, and private endpoints where applicable.
- Implement resource tagging for governance and cost tracking.
- Establish access review cadence and automated alerting for critical changes.
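The tagging rule can be checked automatically before a change reaches review. The sketch below assumes a hypothetical set of required tag keys (`environment`, `owner`, `cost-center`); substitute whatever your tagging policy actually mandates.

```python
# Hypothetical required tag keys; replace with your organization's tagging policy.
REQUIRED_TAGS = ("environment", "owner", "cost-center")


def missing_tags(resource_tags: dict[str, str]) -> list[str]:
    """Return required tag keys that are absent or empty, compared case-insensitively."""
    present = {k.lower() for k, v in resource_tags.items() if v and v.strip()}
    return [tag for tag in REQUIRED_TAGS if tag not in present]
```

A Reviewer agent could run this over every resource in a plan and reject the handoff if any list comes back non-empty.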
Testing Checklist
- Unit tests for each script/module; mock Azure calls where possible.
- Integration tests for ARM/Bicep templates and resource provisioning sequences.
- End-to-end tests for deployment, rollback, and validation gates.
- Security tests for secrets handling and IAM permissions.
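To illustrate the first item, here is a small unit test that mocks the Azure-facing call instead of touching a real subscription. It is a sketch only: `ensure_resource_group` and the `client` wrapper with `exists` and `create` methods are hypothetical stand-ins, not a real Azure SDK interface.

```python
import unittest
from unittest.mock import Mock


def ensure_resource_group(client, name: str, location: str) -> str:
    """Create the resource group only if it is missing; report what happened.

    `client` is a hypothetical thin wrapper exposing `exists(name)` and
    `create(name, location)`; it is not a real Azure SDK class.
    """
    if client.exists(name):
        return "unchanged"
    client.create(name, location)
    return "created"


class EnsureResourceGroupTest(unittest.TestCase):
    def test_creates_when_missing(self):
        client = Mock()
        client.exists.return_value = False
        self.assertEqual(ensure_resource_group(client, "rg-app", "westeurope"), "created")
        client.create.assert_called_once_with("rg-app", "westeurope")

    def test_skips_when_present(self):
        client = Mock()
        client.exists.return_value = True
        self.assertEqual(ensure_resource_group(client, "rg-app", "westeurope"), "unchanged")
        client.create.assert_not_called()
```

The second test doubles as an idempotency check: running the script against an existing resource group must not issue a create call.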
Common Mistakes to Avoid
- Skipping validation gates or human review for production changes.
- Hard-coding secrets or credentials in scripts or templates.
- Unclear handoff boundaries leading to duplicated work or drift.
- Ignoring cost implications or Azure policy constraints in design decisions.
FAQ
What is the purpose of this AGENTS.md Template for Azure?
It provides a repeatable, auditable operating manual for Azure production system design using AI coding agents, enabling multi-agent orchestration and clear handoffs.
How does multi-agent orchestration work in this template?
Agents collaborate via a planner-driven plan, with explicit handoffs and validation gates enforced by the orchestrator to prevent drift and ensure compliance.
Where should this template be stored?
Store it as AGENTS.md in the repository, conventionally at the repository root where coding agents look for it, and keep the design artifacts it governs under infrastructure/azure and docs/architecture. Together they form the single source of truth.
What if a deployment fails?
Follow the rollback procedure outlined in the AGENTS.md, re-run validations, and revert to the last known good configuration before attempting a retry.
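The sequence described in this answer could be sketched as follows. The `apply` and `validate` callables are hypothetical placeholders for whatever deployment and health-check steps your pipeline uses.

```python
from typing import Callable


def deploy_with_rollback(
    apply: Callable[[dict], None],
    validate: Callable[[], bool],
    new_config: dict,
    last_known_good: dict,
) -> str:
    """Apply new_config; on an exception or failed validation, re-apply last_known_good."""
    try:
        apply(new_config)
        if validate():
            return "deployed"
        reason = "validation failed"
    except Exception as exc:  # broad on purpose: any failure triggers rollback
        reason = f"apply raised {exc!r}"
    apply(last_known_good)
    if not validate():
        # The rollback did not restore a healthy state: stop and escalate.
        raise RuntimeError(f"rollback failed after: {reason}")
    return f"rolled back ({reason})"
```

The function never retries on its own. A retry is a new deployment attempt and should pass through the same approval gates as the first.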
How are secrets handled?
Secrets are retrieved from Azure Key Vault at runtime; do not log or store secrets in code or output logs.
When is human review required?
For high-risk changes, security policy exceptions, or any failure in automated validation gates, trigger human review and pause production changes.
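These triggers are simple enough to encode as a gate that the orchestrator evaluates before any production change. A minimal sketch, assuming a hypothetical `ChangeRequest` record and a four-level risk scale that the template itself does not define:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeRequest:
    risk: str  # assumed scale: "low", "medium", "high", or "critical"
    validations_passed: bool
    policy_exception: bool = False


def review_reasons(change: ChangeRequest) -> list[str]:
    """Return the reasons a human must review this change; empty means none apply."""
    reasons = []
    if change.risk in ("high", "critical"):
        reasons.append(f"{change.risk}-risk change")
    if change.policy_exception:
        reasons.append("security policy exception requested")
    if not change.validations_passed:
        reasons.append("automated validation gate failed")
    return reasons
```

A non-empty list means production changes pause, and the list itself can serve as the rationale that the escalation rules require.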