ML Platform Architecture AGENTS.md Template

An AGENTS.md template for ML platform architecture that governs how AI coding agents handle multi-agent orchestration, handoffs, and tool use across data, training, and deployment.

Target User

ML platform teams, engineers, and product leaders building AI-driven machine learning platforms

Use Cases

  • Define governance and operating model for ML platform built with AI coding agents
  • Coordinate multi-agent workflows for data pipelines, model training, deployment
  • Enforce tool governance and human review in production ML workflows

Markdown Template

# AGENTS.md

Project role: ML Platform Architect / Platform Engineering Lead

Agent roster and responsibilities:
- Planner: defines the end-to-end ML platform architecture, selects the orchestration pattern (multi-agent if needed), and sets constraints for data, models, and deployment.
- Implementer: builds platform components (data ingestion, feature store, model registry, training runtime, inference service) and integrates tools.
- Researcher: investigates data sources, data quality requirements, and ML best practices; proposes experiments and evaluation metrics.
- Domain Specialist: provides ML domain expertise (e.g., forecasting, recommender systems, anomaly detection) and validates domain-specific outputs.
- Reviewer: reviews design decisions, code quality, and conformance to governance rules.
- Tester: creates and executes tests for data pipelines, model training, deployment, and monitoring.
- Operator/DevOps: manages deployment, scaling, and observability; handles secrets and credentials.
- Security Auditor: verifies access controls, data privacy, and compliance requirements.
- Overseer/Orchestrator: coordinates tasks, enforces memory and truth rules, logs decisions, and triggers handoffs.

Supervisor or orchestrator behavior:
- The Overseer coordinates planning, execution, and handoffs; enforces agent boundaries; maintains a shared memory store; records decisions with provenance; enforces gating rules for production changes.
- It triggers re-plans when outputs fail validation, requests human review when required by policy, and escalates critical issues.

Handoff rules between agents:
- Planner -> Implementer: hand off a concrete architecture plan and task list.
- Implementer -> Reviewer: pass validated artifacts for quality review.
- Reviewer -> Tester: pass validation results and test cases; if the review fails, return the work to the Implementer.
- Tester -> Overseer: report test results and remediation steps; Overseer decides on deployment readiness.
- Researcher/Domain Specialist -> Planner: feed insights or changes to requirements; if scope shifts, re-plan.

Context, memory, and source-of-truth rules:
- All decisions reference a central knowledge store (e.g., docs repo, model registry, data catalog) acting as the source of truth.
- Context objects move with each handoff containing previous steps, rationale, inputs, outputs, and references.
- Memory is stored in a versioned store and is immutable for auditability.

Tool access and permission rules:
- Tools: Git, CI/CD, Kubernetes, Helm, MLflow, S3/Blob storage, Data Catalog, secret vaults.
- Secrets are accessed via a vault; tokens are short-lived and rotated.
- Agents may only operate within approved namespaces and sandbox environments unless explicitly permitted for production.

Architecture rules:
- Use modular microservices and event-driven signals; ensure idempotency; prefer declarative configurations; enable traceability and rollback.
- All components are versioned and organized in a tree-structured layout to simplify re-deployments.

File structure rules:
- /ml-platform/
  - /agents/
    - planner.md
    - implementer.md
    - researcher.md
    - domain-specialist.md
    - reviewer.md
    - tester.md
    - overseer.md
  - /configs/
  - /src/
  - /data/
  - /models/
  - /docs/

Data, API, or integration rules when relevant:
- Data sources must be cataloged; API contracts must be versioned; all integrations require acceptance criteria and test coverage.
- Inference APIs must be documented with input/output schemas and rate limits.

Validation rules:
- All outputs must be validated against schema, data quality checks, and performance metrics before handoff to production.

Security rules:
- Enforce least privilege, audit trails, encryption in transit and at rest, secrets rotation, and access controls on every service.

Testing rules:
- Write unit tests for components and integration tests across data, model training, and deployment; run end-to-end tests in staging.

Deployment rules:
- Deploy first to staging; require automated validation; blue/green or canary deployment where applicable.

Human review and escalation rules:
- Any non-deterministic results, privacy concerns, or policy violations trigger human review prior to production.
- The escalation path is documented in this file.

Failure handling and rollback rules:
- If a deployment fails, roll back to the previous stable revision; preserve logs and metrics for the postmortem.

Things agents must not do:
- Do not bypass approvals, edit production config directly, or modify governance without traceable changes.
- Do not access data beyond granted permissions, do not leak secrets, and do not perform unsupervised production changes.

Overview

This AGENTS.md template codifies the operating model for ML platform architecture using AI coding agents, covering both individual agents and multi-agent orchestration.

In AI coding agent workflows, this template defines a governance pattern for planning, implementing, reviewing, and deploying ML platform components. It covers how a single agent operates and how multiple agents coordinate through planner, implementer, reviewer, tester, researcher, and domain specialist roles. It also details agent handoffs, memory and source-of-truth handling, tool governance, and escalation to human review when needed.

When to Use This AGENTS.md Template

  • You're building or evolving an ML platform architecture and need a reproducible operating manual for AI coding agents.
  • You require clear agent handoffs, decision boundaries, and escalation paths for multi-agent orchestration.
  • You need to enforce tool governance, secrets, and production safeguards in a repeatable way.
  • You want to document memory management, sources of truth, and validation criteria central to governance of the ML workflow.

Recommended Agent Operating Model

The agent operating model assigns clear roles (Planner, Implementer, Researcher, Domain Specialist, Reviewer, Tester, Overseer) with decision boundaries and escalation paths. The Planner defines architecture constraints and acceptance criteria; Implementers build; Reviewers validate; Testers verify; the Overseer orchestrates handoffs and enforces governance. Escalations go to human review if policy or privacy concerns arise.

Recommended Project Structure

ml-platform/
  ├── agents/
  │   ├── planner.md
  │   ├── implementer.md
  │   ├── researcher.md
  │   ├── domain-specialist.md
  │   ├── reviewer.md
  │   ├── tester.md
  │   └── overseer.md
  ├── configs/
  │   └── governance.yaml
  ├── src/
  │   ├── orchestrator/
  │   └── components/
  ├── data/
  ├── models/
  └── docs/

Core Operating Principles

  • Operate with explicit roles, decision boundaries, and escalation paths.
  • Maintain a single source of truth and immutable context for auditability.
  • Enforce tool governance, secrets, and production safeguards.
  • Prefer idempotent, reproducible actions and clear versioning.
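
The "immutable context for auditability" principle can be illustrated with a small append-only log. This is only a sketch under stated assumptions: the names `ContextRecord` and `ContextLog`, the field names, and the hash-chaining approach are hypothetical choices for illustration, not part of the template.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ContextRecord:
    """One immutable handoff entry (hypothetical schema)."""
    version: int
    from_agent: str
    to_agent: str
    rationale: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    references: Tuple[str, ...]
    parent_hash: Optional[str]

    def digest(self) -> str:
        # Hash the canonical JSON form so later tampering is detectable.
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


class ContextLog:
    """Append-only log: records are added, never edited or removed."""

    def __init__(self) -> None:
        self._records: List[ContextRecord] = []

    def append(self, from_agent: str, to_agent: str, rationale: str,
               inputs: Tuple[str, ...] = (), outputs: Tuple[str, ...] = (),
               references: Tuple[str, ...] = ()) -> ContextRecord:
        # Each record points at the digest of the one before it.
        parent = self._records[-1].digest() if self._records else None
        record = ContextRecord(len(self._records) + 1, from_agent, to_agent,
                               rationale, inputs, outputs, references, parent)
        self._records.append(record)
        return record

    def verify(self) -> bool:
        # Walk the chain and confirm every parent hash still matches.
        expected: Optional[str] = None
        for record in self._records:
            if record.parent_hash != expected:
                return False
            expected = record.digest()
        return True
```

Each handoff appends a new versioned record instead of mutating an old one, so the full decision history stays available for audits.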

Agent Handoff and Collaboration Rules

Clear collaboration rules ensure effective multi-agent orchestration:

  • Planner initiates the plan and defines success criteria; Overseer monitors progress and triggers handoffs.
  • Implementer passes validated artifacts to Reviewer; Reviewer approves or requests changes.
  • Researcher and Domain Specialist provide inputs to Planner; changes may trigger re-plan.
  • Tester conducts tests and reports results; failures route back to Implementer and Planner as needed.
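
The collaboration rules above can be read as a small routing table. The sketch below is one possible encoding; the outcome labels (such as `"changes_requested"`) and the `next_agent` helper are assumptions made for illustration and do not come from the template.

```python
from enum import Enum
from typing import Dict, Tuple


class Agent(Enum):
    PLANNER = "planner"
    IMPLEMENTER = "implementer"
    REVIEWER = "reviewer"
    TESTER = "tester"
    OVERSEER = "overseer"


# (current agent, outcome) -> next agent. Outcome names are hypothetical.
ROUTES: Dict[Tuple[Agent, str], Agent] = {
    (Agent.PLANNER, "plan_ready"): Agent.IMPLEMENTER,
    (Agent.IMPLEMENTER, "artifacts_ready"): Agent.REVIEWER,
    (Agent.REVIEWER, "approved"): Agent.TESTER,
    (Agent.REVIEWER, "changes_requested"): Agent.IMPLEMENTER,
    (Agent.TESTER, "tests_passed"): Agent.OVERSEER,
    (Agent.TESTER, "tests_failed"): Agent.IMPLEMENTER,
    (Agent.TESTER, "scope_changed"): Agent.PLANNER,
}


def next_agent(current: Agent, outcome: str) -> Agent:
    """Return the agent that receives the handoff, or fail loudly."""
    try:
        return ROUTES[(current, outcome)]
    except KeyError:
        # Unknown transitions are rejected rather than guessed at.
        raise ValueError(
            f"no handoff defined for {current.value!r} with outcome {outcome!r}"
        ) from None
```

Keeping the routes in one table makes undefined handoffs fail explicitly, which matches the intent of avoiding informal or undocumented handoffs.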

Tool Governance and Permission Rules

  • Enforce least privilege, secrets management, and environment scoping.
  • All tool usage must be auditable with logs and provenance.
  • Production changes require approved deployment gates and human sign-off if policy requires.
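
As a rough illustration of these rules, the following sketch checks a tool request against per-agent grants, scopes it to an environment, and records every decision. `Grant`, `ToolGate`, and the rule that production always needs a human sign-off are assumptions for this example; a real policy may require sign-off only in some cases.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Grant:
    """Least-privilege grant for one agent (hypothetical shape)."""
    agent: str
    tools: FrozenSet[str]
    environments: FrozenSet[str]


@dataclass
class ToolGate:
    grants: List[Grant]
    # Assumption: policy requires human sign-off for all production use.
    production_requires_signoff: bool = True
    audit_log: List[Dict[str, object]] = field(default_factory=list)

    def authorize(self, agent: str, tool: str, environment: str,
                  human_signoff: bool = False) -> bool:
        grant = next((g for g in self.grants if g.agent == agent), None)
        allowed = (
            grant is not None
            and tool in grant.tools
            and environment in grant.environments
        )
        if (environment == "production" and self.production_requires_signoff
                and not human_signoff):
            allowed = False
        # Every decision is logged, whether it was allowed or denied.
        self.audit_log.append({
            "agent": agent, "tool": tool, "environment": environment,
            "human_signoff": human_signoff, "allowed": allowed,
        })
        return allowed
```

Denied requests are logged alongside approved ones, so the audit trail shows attempted as well as successful tool use.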

Code Construction Rules

Implement code and configurations with explicit interfaces, deterministic behavior, and versioned artifacts. Each module must expose a contract in code and tests.
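
One way to "expose a contract in code and tests" is a typed interface plus a reusable contract check. The sketch below assumes a hypothetical `FeatureTransformer` contract and a `MinMaxScaler` component; neither name comes from the template.

```python
from typing import List, Protocol, Sequence, runtime_checkable


@runtime_checkable
class FeatureTransformer(Protocol):
    """Contract every transformer module is expected to satisfy."""
    version: str

    def transform(self, rows: Sequence[float]) -> List[float]: ...


class MinMaxScaler:
    """Example component: scales values into [0, 1] for a fixed range."""
    version = "1.0.0"

    def __init__(self, low: float, high: float) -> None:
        if high <= low:
            raise ValueError("high must be greater than low")
        self.low = low
        self.high = high

    def transform(self, rows: Sequence[float]) -> List[float]:
        span = self.high - self.low
        return [(x - self.low) / span for x in rows]


def check_contract(component: object) -> None:
    """Contract test: right interface and deterministic output."""
    assert isinstance(component, FeatureTransformer)
    sample = [0.0, 5.0, 10.0]
    # Same input twice must give the same output.
    assert component.transform(sample) == component.transform(sample)
```

The same `check_contract` function can run against every implementation, so the contract lives in code and in tests rather than only in documentation.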

Security and Production Rules

  • Encrypt data in transit and at rest; manage keys securely; rotate credentials.
  • Audit trails, anomaly detection, and access reviews are mandatory for production systems.

Testing Checklist

  • Unit tests for components.
  • Integration tests for data, training, and deployment.
  • Performance tests for latency and throughput.
  • End-to-end tests in staging.

Common Mistakes to Avoid

  • Skipping formal handoffs or failing to document decision rationale.
  • Allowing unsupervised production changes or secret exposure.
  • Ignoring data quality, lineage, and provenance in governance.

FAQ

What is an AGENTS.md Template for ML platform architecture?

An AGENTS.md Template provides a formal, copyable operating manual for AI coding agents managing ML platform architecture, with defined roles, handoffs, and governance for multi-agent orchestration.

Who should use this template?

ML platform teams, platform engineers, data scientists, and engineering leaders who design and operate AI-driven ML platforms needing repeatable agent workflows and governance.

How does multi-agent orchestration work in this template?

A Planner defines architecture and constraints; Implementers build components; Reviewers validate; Testers verify; Overseer coordinates handoffs; Researchers and Domain Specialists provide domain inputs; all decisions are stored in a single source of truth.

What happens on a failed deployment?

The Overseer or Planner triggers a rollback to the previous stable revision, logs decisions and metrics, and routes for human review if policy is triggered.
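
A minimal sketch of that rollback behavior follows, assuming a hypothetical `DeploymentHistory` class and a caller-supplied `health_check` function. It models only the bookkeeping, not any real deployment tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class DeploymentHistory:
    stable_revisions: List[str] = field(default_factory=list)
    events: List[str] = field(default_factory=list)  # preserved for postmortems

    def deploy(self, revision: str,
               health_check: Callable[[str], bool]) -> str:
        """Attempt a deployment and return the revision that ends up live."""
        self.events.append(f"deploy:{revision}")
        if health_check(revision):
            self.stable_revisions.append(revision)
            return revision
        # Failed validation: fall back to the last known stable revision.
        if not self.stable_revisions:
            self.events.append(f"rollback-impossible:{revision}")
            raise RuntimeError("no stable revision to roll back to")
        previous = self.stable_revisions[-1]
        self.events.append(f"rollback:{revision}->{previous}")
        return previous
```

Failed revisions never enter `stable_revisions`, so a later failure cannot roll back to a broken release, and the `events` list keeps the record needed for review.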

How are secrets and permissions managed?

Secrets live in a vault; tokens are short-lived; access is restricted by least privilege and audited.
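
To make "short-lived" and "least privilege" concrete, here is a small sketch of a scoped token with an expiry time. `ShortLivedToken`, `issue_token`, and the 15-minute default lifetime are assumptions for illustration; a real setup would obtain tokens from the vault rather than mint them locally.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class ShortLivedToken:
    value: str
    scope: str         # the single permission this token carries
    expires_at: float  # seconds since the epoch

    def is_valid(self, now: float, required_scope: str) -> bool:
        # Valid only before expiry and only for the exact scope granted.
        return now < self.expires_at and self.scope == required_scope


def issue_token(scope: str, now: float,
                ttl_seconds: float = 900.0) -> ShortLivedToken:
    """Create a random token that expires ttl_seconds after `now`."""
    return ShortLivedToken(secrets.token_urlsafe(32), scope, now + ttl_seconds)
```

Passing `now` explicitly keeps the check deterministic and easy to test; callers would typically supply `time.time()`.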

Where is the project structure defined?

The AGENTS.md Template specifies a minimal, workflow-specific project tree under ml-platform/ that keeps agents, configs, src, data, models, and docs organized for reproducible runs.