AGENTS.md Template: Two-Phase Commit Evaluation

An AGENTS.md template for a two-phase commit evaluation workflow for AI coding agents. It supports secure multi-agent orchestration with explicit handoffs, shared memory, and governance.

AGENTS.md template, two-phase commit, AI coding agents, multi-agent orchestration, agent handoff rules, tool governance, human review, memory and truth, security rules, testing checklist, project structure

Target User

Developers, founders, product teams, engineering leaders

Use Cases

  • Two-phase commit evaluation pattern for AI coding agents
  • Agent handoffs and orchestration in multi-agent workflows
  • Governance, memory, and truth sources across agent phases
  • Auditable, reproducible agent workflows

Markdown Template

# AGENTS.md

Project role: Two-Phase Commit Evaluation Engine for AI coding agents.

Agent roster and responsibilities:
- Planner: proposes actions, constraints, and writes a plan.md; ensures the plan adheres to two-phase commit rules.
- Evaluator: validates the planner's plan, runs cross-checks, and returns an evaluation report.
- Implementer: executes agreed actions in the codebase or tooling.
- Reviewer: reviews changes and approves before merging or deploying.
- Tester: runs unit and integration tests to validate outcomes.
- Domain Specialist: provides domain-specific validation and guidance.

Supervisor or orchestrator behavior:
- The Orchestrator coordinates phase transitions and enforces both memory updates and the source-of-truth rules.
- It triggers handoffs, logs decisions, and escalates when timeouts or conflicts occur.

Handoff rules between agents:
- Planner to Evaluator when a plan.md is ready for evaluation.
- Evaluator to Implementer when evaluation passes and actions are approved.
- Implementer to Reviewer when changes are ready for review.
- Reviewer to Tester when merged changes require validation.
- Domain Specialist can intervene at any phase for domain-specific checks or to halt and request replanning.

Context, memory, and source-of-truth rules:
- Maintain a central memory store (memory/index.json) containing plan, evaluation, actions, and test results.
- Sources of truth include the repository, test suites, and external docs referenced by the plan.
- All outputs must be traceable to the originating agent and timestamped.

Tool access and permission rules:
- Tools are accessed only through approved adapters in tools/ (api_clients, shell).
- Secrets must be stored securely and never hard-coded; use a vault or another access-controlled secret store.
- API calls and edits are gated by the orchestrator and require explicit authorization.

Architecture rules:
- Single orchestrator model coordinating planner, evaluator, implementer, reviewer, tester, and domain specialist roles.
- Changes should be applied via PRs or controlled deploys; avoid direct production edits by agents.

File structure rules:
- Only workflow-relevant directories: agents/, memory/, tools/, workflows/, tests/, docs/, and an AGENTS.md at the root.
- Do not include irrelevant folders or technologies.

Data, API, or integration rules:
- All integrations must expose deterministic outputs; non-deterministic calls must be simulated in tests where possible.
- Maintain backward-compatible interfaces across phases.

Validation rules:
- Each phase must produce verifiable artifacts (plan.md, evaluation reports, test results).
- The orchestrator enforces an explicit acceptance condition before moving phases.

Security rules:
- Do not leak secrets; restrict tool access to the minimum permission set required for the task.
- Audit logs must capture phase transitions and approvals.

Testing rules:
- Include unit tests for planners and evaluators, integration tests for the two-phase flow, and end-to-end tests for common scenarios.

Deployment rules:
- Deploy the orchestrator and agents as a controlled workflow; use feature flags for experimental changes.

Human review and escalation rules:
- Escalate to the engineering lead if an evaluation cannot be resolved within the SLA or if a risk is detected.

Failure handling and rollback rules:
- If a phase fails, revert the Implementer's changes and restore memory from a known-good snapshot; replan if necessary.

Things agents must not do:
- Do not bypass the evaluator; do not perform production edits without a reviewer.
- Do not mutate memory without proper provenance in the source-of-truth artifacts.

Overview

This AGENTS.md template defines a two-phase commit evaluation workflow for AI coding agents. It coordinates multiple agents through explicit handoffs, shared memory, and governance rules so that commits are safe and auditable.

The AGENTS.md template documents the agent roles, data sources, and governance required to run a two-phase commit evaluation across the Planner, Evaluator, Implementer, Reviewer, Tester, and Domain Specialist roles. It supports both single-agent execution and multi-agent orchestration, with clear memory, truth sources, and escalation paths to maintain traceability in code changes and API interactions.

When to Use This AGENTS.md Template

  • When you need deterministic, auditable decisions across a two-phase commit workflow involving planning and validation before changes are applied.
  • When tool access, secrets, and external services require strict governance and approvals.
  • When you require explicit handoffs, centralized memory, and a single source of truth across phases.
  • When tooling touches critical code, data migrations, or external APIs and you need human review as part of the flow.

Recommended Agent Operating Model

The operating model assigns clear boundaries: Planner defines intent and constraints; Evaluator validates feasibility and safety; Implementer performs changes with guardrails; Reviewer and Tester ensure quality; Domain Specialist provides expertise. Escalation paths ensure timely decisions when conflicts arise. Handoff rules enforce strict phase transitions to prevent drift.

Recommended Project Structure

ai-two-phase-commit/
├── agents/
│   ├── planner/
│   │   └── plan.md
│   ├── evaluator/
│   │   └── evaluate.md
│   ├── implementer/
│   │   └── apply_changes.md
│   ├── reviewer/
│   │   └── review.md
│   ├── tester/
│   │   └── test.md
│   └── domain/
│       └── domain_checks.md
├── memory/
│   └── index.json
├── tools/
│   ├── api_clients/
│   └── shell/
├── workflows/
│   └── two_phase_commit.yaml
├── tests/
├── docs/
└── AGENTS.md

Core Operating Principles

  • Operate with explicit phase boundaries and auditable decisions.
  • Keep memory and truth sources centralized and tamper-evident.
  • Require human review for changes that affect production or data integrity.
  • Enforce least-privilege tool access and strict approval gates.
  • Document all handoffs and rationale for traceability.
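
One way to make the memory tamper-evident, as the second principle asks, is a hash chain: each record stores the hash of the record before it, so any later modification breaks verification. The sketch below is an illustration only. The record fields (`agent`, `payload`, `prev`, `hash`), the `GENESIS` constant, and the function names are hypothetical and not part of the template.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first record


def _digest(agent: str, payload: dict, prev: str) -> str:
    """Hash the record contents together with the previous record's hash."""
    body = json.dumps({"agent": agent, "payload": payload, "prev": prev}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_record(chain: list, agent: str, payload: dict) -> dict:
    """Append a record linked to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    record = {
        "agent": agent,
        "payload": payload,
        "prev": prev,
        "hash": _digest(agent, payload, prev),
    }
    chain.append(record)
    return record


def verify_chain(chain: list) -> bool:
    """Return True only if every record's hash and back-link are intact."""
    prev = GENESIS
    for record in chain:
        if record["prev"] != prev:
            return False
        if record["hash"] != _digest(record["agent"], record["payload"], prev):
            return False
        prev = record["hash"]
    return True
```

Calling `verify_chain` before each phase transition would let an orchestrator refuse to proceed if an earlier plan or evaluation entry was edited after the fact.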

Agent Handoff and Collaboration Rules

  • Planner ➜ Evaluator: share plan.md and constraints; evaluator returns evaluation and acceptance criteria.
  • Evaluator ➜ Implementer: if accepted, provide concrete actions; implementer executes changes.
  • Implementer ➜ Reviewer: submit changes for review with test results and docs.
  • Reviewer ➜ Tester: after approval, tester validates with test suites.
  • Domain Specialist: may insert checks at any phase and request replanning if domain risk is detected.
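
The handoff rules above amount to a small transition table. As a minimal sketch, assuming an orchestrator written in Python, they could be enforced as follows. The names `Role`, `ALLOWED_HANDOFFS`, `HandoffError`, and `next_role` are hypothetical, and treating a Domain Specialist halt as a return to the Planner is one possible reading of "request replanning".

```python
from enum import Enum


class Role(Enum):
    PLANNER = "planner"
    EVALUATOR = "evaluator"
    IMPLEMENTER = "implementer"
    REVIEWER = "reviewer"
    TESTER = "tester"


# Allowed forward handoffs, mirroring the list above.
ALLOWED_HANDOFFS = {
    Role.PLANNER: Role.EVALUATOR,
    Role.EVALUATOR: Role.IMPLEMENTER,
    Role.IMPLEMENTER: Role.REVIEWER,
    Role.REVIEWER: Role.TESTER,
}


class HandoffError(Exception):
    """Raised when a handoff would skip or reverse a phase."""


def next_role(current: Role, target: Role, domain_halt: bool = False) -> Role:
    """Return the role that receives control, or raise if the handoff is not allowed.

    domain_halt models the Domain Specialist requesting replanning: control
    returns to the Planner regardless of the current phase (an assumption).
    """
    if domain_halt:
        return Role.PLANNER
    if ALLOWED_HANDOFFS.get(current) is not target:
        raise HandoffError(f"{current.value} may not hand off to {target.value}")
    return target
```

Keeping the table as data rather than scattered conditionals makes it easy to audit against the written rules and to unit-test each transition.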

Tool Governance and Permission Rules

  • Only approved adapters in tools/ may be used; all calls are logged.
  • Secrets are stored securely; never exposed in code or logs.
  • Production system edits require review, approval, and end-to-end validation in a staging environment.
  • Handoff must include a rationale and evidence trail.
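
To illustrate the first two rules, the sketch below routes every tool call through a single gate that rejects unregistered adapters and logs each call with secret-looking arguments redacted. It is an assumption-laden example, not the template's implementation: `ToolGate`, `SECRET_KEYS`, and the logger name `tool_gate` are hypothetical.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("tool_gate")

# Hypothetical argument names treated as secrets; adjust to your own conventions.
SECRET_KEYS = {"token", "api_key", "password"}


class ToolGate:
    """Dispatches calls only to registered adapters and logs every call."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, adapter: Callable[..., Any]) -> None:
        """Approve an adapter under the given name."""
        self._adapters[name] = adapter

    def call(self, agent: str, name: str, **kwargs: Any) -> Any:
        """Invoke an approved adapter on behalf of an agent."""
        if name not in self._adapters:
            raise PermissionError(f"adapter {name!r} is not approved")
        # Redact secret-looking arguments before they reach the log.
        safe = {k: ("***" if k in SECRET_KEYS else v) for k, v in kwargs.items()}
        logger.info("agent=%s adapter=%s args=%s", agent, name, safe)
        return self._adapters[name](**kwargs)
```

Redaction by argument name is only a first line of defense; secrets embedded in other values would still need separate handling.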

Code Construction Rules

  • Follow deterministic patterns; avoid non-deterministic APIs in core flows.
  • Keep changes small, well-scoped, and reversible.
  • Documentation must accompany every change set.
  • All changes must be traceable to a plan.md and evaluation record.

Security and Production Rules

  • Enforce role-based access and secrets management; rotate credentials regularly.
  • Keep audit trails for all phase transitions and human approvals.
  • Do not deploy experimental changes to production without a controlled release plan.

Testing Checklist

  • Unit tests for planner, evaluator, and implementer logic.
  • Integration tests for the two-phase commit flow, including handoffs.
  • End-to-end tests in a staging environment with representative data.

Common Mistakes to Avoid

  • Skipping evaluator validation or bypassing the review gate.
  • Allowing memory drift or unsanctioned changes to truth sources.
  • Unclear handoffs leading to phase bleed or duplicate efforts.

FAQ

What is the purpose of this AGENTS.md Template?

It formalizes a two-phase commit evaluation workflow for AI coding agents, ensuring coordinated planning, validation, and controlled changes.

Which agents participate in the two-phase commit evaluation?

Planner, Evaluator, Implementer, Reviewer, Tester, and Domain Specialist, with a central Orchestrator coordinating phases.

How are handoffs enforced between phases?

Handoffs occur only when acceptance criteria are met, with explicit artifacts (plan.md, evaluation report, tests) accompanying each transition.
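
A minimal sketch of such a gate, assuming artifacts are plain files in a working directory: a transition is allowed only when every required artifact exists and is non-empty. The transition keys and the file names `evaluation_report.md` and `test_results.json` are hypothetical; only `plan.md` is named by the template.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical mapping from a transition to the artifacts it requires.
REQUIRED_ARTIFACTS: Dict[str, List[str]] = {
    "planner->evaluator": ["plan.md"],
    "evaluator->implementer": ["plan.md", "evaluation_report.md"],
    "implementer->reviewer": ["plan.md", "evaluation_report.md", "test_results.json"],
}


def missing_artifacts(transition: str, workdir: Path) -> List[str]:
    """List required artifacts that are absent or empty for a transition."""
    missing = []
    for name in REQUIRED_ARTIFACTS[transition]:
        path = workdir / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


def may_transition(transition: str, workdir: Path) -> bool:
    """A handoff is allowed only when nothing is missing."""
    return not missing_artifacts(transition, workdir)
```

Checking for presence is the weakest useful acceptance condition; a real gate would likely also inspect the evaluation verdict and test outcomes.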

How are data and memory managed across agents?

A central memory store (memory/index.json) tracks plan, evaluation, actions, and results; sources of truth are the repository, tests, and referenced docs.
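
The template does not prescribe a schema for memory/index.json. As one possible layout, the sketch below keeps a list per section and stamps every entry with the originating agent and a UTC timestamp, which covers the traceability rule. The section keys, the entry fields, and `record_entry` are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Assumed section names, taken from the four kinds of data the store tracks.
SECTIONS = ("plan", "evaluation", "actions", "test_results")


def record_entry(index_path: Path, section: str, agent: str, content: dict) -> dict:
    """Append a timestamped, agent-attributed entry to one section of the index."""
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    if index_path.exists():
        index = json.loads(index_path.read_text(encoding="utf-8"))
    else:
        index = {name: [] for name in SECTIONS}
    entry = {
        "agent": agent,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "content": content,
    }
    index[section].append(entry)
    index_path.parent.mkdir(parents=True, exist_ok=True)
    index_path.write_text(json.dumps(index, indent=2), encoding="utf-8")
    return entry
```

Entries are only ever appended, so the file doubles as a simple history of who wrote what and when. Concurrent writers would need locking, which this sketch omits.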

What constitutes a successful commit in this workflow?

A commit is successful when all phases have completed, approvals have been granted, tests have passed, and the change has been deployed to production under a controlled release plan.