Tool-Calling Governance AGENTS.md Template

An AGENTS.md template for governing tool calls made by AI coding agents. It covers single-agent and multi-agent orchestration with explicit handoffs, tool governance, and human review.

Target User

Developers, AI teams, engineering leaders

Use Cases

  • Tool calling governance
  • Multi-agent orchestration
  • Agent handoffs
  • Tool access governance
  • Human-in-the-loop review

Overview

This AGENTS.md template is designed for tool-calling governance and covers both single-agent workflows and multi-agent orchestration. It provides a concrete operating manual that clarifies roles, handoffs, tool governance, and human review, to reduce risk and improve traceability when AI coding agents call external tools.

When to Use This AGENTS.md Template

  • When introducing tool calling patterns that require explicit governance and approval gates.
  • When coordinating multiple agents with defined handoffs and memory rules.
  • When building a repeatable, auditable workflow for AI tooling interactions and API calls.
  • When you need a single source of truth for roles, responsibilities, and escalation paths.

Copyable AGENTS.md Template

Paste the following block into an AGENTS.md file at your project root to establish a formal operating context. It is intentionally long so that it covers the governance, orchestration, and compliance aspects of tool-calling workflows.

# AGENTS.md

Project role
- Tool-Calling Governance Project Lead: oversees policy, risk, and orchestration across agents.
- AI Engineering Lead: ensures correct prompt design, tool interfaces, and integration patterns.
- Security and Compliance Owner: enforces secrets handling, audits, and approvals.

Agent roster and responsibilities
- Planner: defines the plan for tool usage, identifies required tools, and sequences steps.
- Implementer: implements tool calls, integrates tool outputs, and updates state.
- Reviewer: validates outputs, detects drift, and ensures alignment with governance rules.
- Tester: executes unit and integration tests for tool interactions and prompts.
- Researcher: gathers tool capability data, API contracts, and latency expectations.
- Domain Specialist: provides expert input for domain-specific tool usage and constraints.

Supervisor or orchestrator behavior
- The Orchestrator maintains the overall plan, enforces policies, routes handoffs, and records decisions.
- All tool calls must go through the orchestrator with a traceable task id and state machine.
- If a tool call fails, the orchestrator triggers escalation and rollback rules.

Handoff rules between agents
- Handoff occurs only at clearly defined decision points or after tool outputs are produced.
- The Planner initiates handoffs with a documented plan summary and expected results.
- The Implementer passes the completed tool interaction to the Reviewer with traces and sources.
- The Reviewer may loop back to the Planner or Implementer if drift or risk is detected.

Context, memory, and source-of-truth rules
- Context is stored in a memory store keyed by task-id and persists for the workflow duration.
- The Source of Truth is a central contracts/data store that all agents can read but can modify only through controlled updates.
- Do not rely on ephemeral local memory for long-running tool interactions.

Tool access and permission rules
- Tools must be registered in the central catalog with scoped permissions.
- Secrets and tokens must be rotated on fixed schedules and retrieved through a secrets manager.
- Agents may execute only allowed tool types and endpoints; any deviation requires human approval.

Architecture rules
- Use a modular architecture with clear interfaces: Planner, Implementer, Reviewer, Tester, Researcher, Domain Specialist, and Orchestrator.
- All tool outputs must be validated against schemas before being stored or acted upon.
- Prefer idempotent operations and deterministic outputs.

File structure rules
- Place AGENTS.md at the project root.
- Use dedicated directories per role under agents/ and a single orchestrator/ directory for governance rules.
- Keep runbooks, policies, and tool catalog under policies/ or runbooks/ as appropriate.

Data, API, or integration rules when relevant
- Define data contracts for tool responses, including fields for status, data, and errors.
- Validate API responses against schemas before interpretation.
- Log all calls with timestamps, tool, inputs, outputs, and outcomes.

Validation rules
- Validate all prompts and tool outputs against defined criteria before use.
- Confirm outputs meet acceptance criteria and do not drift from policy.
- Reconcile discrepancies with manual review when necessary.

Security rules
- Enforce least privilege on all tool access.
- Rotate secrets, audit access, and require approvals for sensitive actions.
- Do not store plaintext secrets in agent memory, prompts, or logs; fetch them from a secrets vault when they are needed.

Testing rules
- Unit tests for prompt templates and tool integration points.
- Integration tests that simulate real tool calls with mock services.
- End-to-end tests for common workflows with expected outputs.

Deployment rules
- Gate changes through CI/CD with review by the Reviewer role.
- Roll out in small batches with monitoring and automatic rollback on errors.
- Maintain a changelog for governance policy updates.

Human review and escalation rules
- Escalate any high-risk tool interaction to Security Owner and Domain Specialist.
- Use a Human-in-the-Loop check before production tool calls when dealing with critical data.

Failure handling and rollback rules
- On failure, revert to the last known-good state and notify stakeholders.
- Maintain a rollback plan for each tool call and ensure state consistency.

Things Agents must not do
- Do not bypass the orchestrator or governance checks.
- Do not share secrets in prompts or outputs.
- Do not perform unsupervised production changes.
- Do not drift from the approved data contracts and tool contracts.

Recommended Agent Operating Model

The recommended operating model defines agent roles, decision boundaries, and escalation paths for tool calling governance. It ensures clear separation of duties and robust collaboration patterns among planners, implementers, reviewers, testers, researchers, and domain specialists. Handoffs are explicit, and the orchestrator enforces policy, auditing, and traceability.
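The template asks for every task to carry a traceable task id and a state machine, but does not name the states. The following is a minimal sketch of one possible lifecycle; the state names and the transition table are assumptions chosen for illustration, not part of the template.

```python
from enum import Enum


class TaskState(Enum):
    # Hypothetical lifecycle states; adapt the names to your own workflow.
    PLANNED = "planned"
    EXECUTING = "executing"
    IN_REVIEW = "in_review"
    APPROVED = "approved"
    ESCALATED = "escalated"
    ROLLED_BACK = "rolled_back"


# Allowed transitions (an assumption): review can loop back to planning,
# and any failure path goes through escalation before rollback.
TRANSITIONS = {
    TaskState.PLANNED: {TaskState.EXECUTING},
    TaskState.EXECUTING: {TaskState.IN_REVIEW, TaskState.ESCALATED},
    TaskState.IN_REVIEW: {TaskState.APPROVED, TaskState.PLANNED, TaskState.ESCALATED},
    TaskState.ESCALATED: {TaskState.ROLLED_BACK, TaskState.PLANNED},
    TaskState.APPROVED: set(),
    TaskState.ROLLED_BACK: set(),
}


class Task:
    """Tracks one task id and refuses transitions the table does not allow."""

    def __init__(self, task_id: str) -> None:
        self.task_id = task_id
        self.state = TaskState.PLANNED
        self.history = [TaskState.PLANNED]

    def advance(self, new_state: TaskState) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"{self.task_id}: {self.state.value} -> {new_state.value} is not allowed"
            )
        self.state = new_state
        self.history.append(new_state)
```

Keeping the transition table as data makes it easy to review alongside the governance policy, and the `history` list gives the orchestrator a simple record of decisions per task.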

Recommended Project Structure

tool-calling-governance/
  agents/
    planner/
      prompts/
      policies/
    implementer/
    reviewer/
    tester/
    researcher/
    domain-specialist/
  orchestrator/
    governance/
  tools/
    catalog.json
  policies/
  runbooks/
  data/
    contracts/
  configs/
  tests/
  docs/
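The structure above lists tools/catalog.json without describing its contents. As a rough sketch, a catalog could record each tool's endpoint and the roles allowed to call it. The field names, tool names, and URLs below are hypothetical and only illustrate the "registered with scoped permissions" rule.

```python
import json

# Hypothetical contents of tools/catalog.json; every field name is an assumption.
CATALOG_JSON = """
{
  "tools": [
    {"name": "search_docs",
     "endpoint": "https://tools.example.internal/search",
     "allowed_roles": ["researcher", "planner"],
     "requires_human_approval": false},
    {"name": "deploy_service",
     "endpoint": "https://tools.example.internal/deploy",
     "allowed_roles": ["implementer"],
     "requires_human_approval": true}
  ]
}
"""


def load_catalog(raw: str) -> dict:
    """Index catalog entries by tool name, rejecting duplicate registrations."""
    catalog = {}
    for entry in json.loads(raw)["tools"]:
        if entry["name"] in catalog:
            raise ValueError(f"duplicate tool: {entry['name']}")
        catalog[entry["name"]] = entry
    return catalog


def is_allowed(catalog: dict, role: str, tool: str) -> bool:
    """A role may call a tool only if the catalog lists that role explicitly."""
    entry = catalog.get(tool)
    return entry is not None and role in entry["allowed_roles"]


catalog = load_catalog(CATALOG_JSON)
```

Unregistered tools are denied by default, which matches the least-privilege principle stated elsewhere in the template.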

Core Operating Principles

  • Clarity: every decision point and handoff must be documented.
  • Guardrails: enforce tool governance and least privilege for all actions.
  • Traceability: maintain end-to-end audit trails for all tool interactions.
  • Determinism: prefer idempotent actions and predictable outcomes.
  • Human-in-the-loop: escalate critical actions to humans when appropriate.

Agent Handoff and Collaboration Rules

  • Planner to Implementer: pass plan summary, required tools, and initial inputs.
  • Implementer to Reviewer: provide outputs, tool traces, and data contracts for validation.
  • Reviewer to Planner: request clarifications or approve changes to policy or tool usage.
  • Researcher and Domain Specialist: provide context for specialized tool usage and constraints as needed.
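The handoff rules above can be made concrete as a small record type that the orchestrator validates before routing. This is a sketch under assumptions: the `Handoff` fields and the lowercase role names are illustrative, and the Researcher and Domain Specialist are treated as advisory inputs instead of handoff endpoints.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Handoff edges taken from the rules above (role names lowercased by assumption).
ALLOWED_HANDOFFS = {
    ("planner", "implementer"),
    ("implementer", "reviewer"),
    ("reviewer", "planner"),
}


@dataclass(frozen=True)
class Handoff:
    """Immutable record of one handoff, keyed by task id for traceability."""
    task_id: str
    sender: str
    receiver: str
    summary: str
    artifacts: tuple = ()
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def make_handoff(task_id, sender, receiver, summary, artifacts=()):
    """Create a handoff only if the edge is allowed and a summary is provided."""
    if (sender, receiver) not in ALLOWED_HANDOFFS:
        raise ValueError(f"handoff {sender} -> {receiver} is not allowed")
    if not summary.strip():
        raise ValueError("a handoff requires a documented summary")
    return Handoff(task_id, sender, receiver, summary, tuple(artifacts))
```

For example, `make_handoff("task-1", "planner", "implementer", "Fetch schema, then validate", ["plan.md"])` succeeds, while an Implementer-to-Planner handoff raises `ValueError` because that edge is not in the table.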

Tool Governance and Permission Rules

  • Tool calls must go through the orchestrator with validated inputs.
  • Secrets must be retrieved from a vault; never hard-code credentials.
  • All tool outputs must be validated against schemas before use.
  • High-impact tool actions require change approval.
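One way to read these rules together is as a single gate that every tool call passes through. The sketch below assumes a hypothetical in-memory catalog and a response contract with `status`, `data`, and `errors` fields (the fields named in the template's data rules); `governed_call`, `PolicyError`, and the tool names are illustrative, not an existing API.

```python
from typing import Any, Callable, Optional


class PolicyError(Exception):
    """Raised when a call would violate a governance rule."""


# Hypothetical catalog; a real one might be loaded from tools/catalog.json.
CATALOG = {
    "search_docs": {"allowed_roles": {"researcher"}, "high_impact": False},
    "deploy_service": {"allowed_roles": {"implementer"}, "high_impact": True},
}

# Minimal response contract: field name -> required type.
RESPONSE_CONTRACT = {"status": str, "data": dict, "errors": list}


def validate_response(resp: Any) -> dict:
    """Reject any tool output that does not match the response contract."""
    if not isinstance(resp, dict):
        raise PolicyError("tool response must be an object")
    for name, expected in RESPONSE_CONTRACT.items():
        if name not in resp:
            raise PolicyError(f"response missing field: {name}")
        if not isinstance(resp[name], expected):
            raise PolicyError(f"response field {name} has the wrong type")
    return resp


def governed_call(
    role: str,
    tool: str,
    inputs: dict,
    adapter: Callable[[dict], Any],
    approved_by: Optional[str] = None,
) -> dict:
    """Check registration, role scope, and approval, then validate the output."""
    entry = CATALOG.get(tool)
    if entry is None:
        raise PolicyError(f"tool not registered: {tool}")
    if role not in entry["allowed_roles"]:
        raise PolicyError(f"role {role} may not call {tool}")
    if entry["high_impact"] and approved_by is None:
        raise PolicyError(f"{tool} requires human approval")
    if not isinstance(inputs, dict):
        raise PolicyError("inputs must be an object")
    return validate_response(adapter(inputs))
```

Passing the adapter in as a callable keeps the gate independent of any particular tool client and makes it straightforward to substitute a mock in tests.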

Code Construction Rules

  • Prompts must be modular, versioned, and tested against failure modes.
  • Tool integration code must be idempotent and auditable.
  • Promote reuse of shared utilities; avoid hard-coded values.
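Idempotent tool integration is often approached with an idempotency key, so that a retried call returns the stored result instead of repeating a side effect. The following sketch assumes JSON-serializable inputs and an in-memory result cache; `idempotency_key` and `IdempotentExecutor` are hypothetical names.

```python
import hashlib
import json


def idempotency_key(task_id: str, tool: str, inputs: dict) -> str:
    """Derive a stable key from the task id, tool name, and canonicalized inputs."""
    payload = json.dumps(
        {"task_id": task_id, "tool": tool, "inputs": inputs},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class IdempotentExecutor:
    """Runs each distinct (task, tool, inputs) call at most once."""

    def __init__(self) -> None:
        self._results = {}

    def run(self, task_id, tool, inputs, adapter):
        key = idempotency_key(task_id, tool, inputs)
        if key not in self._results:
            # First time this exact call is seen: execute and remember the result.
            self._results[key] = adapter(inputs)
        return self._results[key]
```

Sorting keys before hashing means two input dictionaries with the same content but different key order map to the same key. A production version would likely persist results in the task's memory store instead of process memory.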

Security and Production Rules

  • Enforce least privilege and secret rotation for all tools.
  • Maintain an immutable audit log for tool interactions.
  • Require human confirmation for production-impact actions.
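An audit log kept in application code cannot be truly immutable, but it can be made tamper-evident by chaining entry hashes. The sketch below illustrates that idea only; the entry fields are assumptions, and real immutability would need storage-level guarantees such as write-once storage.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64


class AuditLog:
    """Append-only log in which each entry commits to the previous entry's hash."""

    def __init__(self) -> None:
        self._entries = []

    @staticmethod
    def _digest(entry: dict) -> str:
        # Hash every field except the stored hash itself.
        material = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(
            json.dumps(material, sort_keys=True).encode("utf-8")
        ).hexdigest()

    def append(self, task_id: str, tool: str, inputs: dict, outcome: str) -> dict:
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "task_id": task_id,
            "tool": tool,
            "inputs": inputs,  # must be JSON-serializable and free of secrets
            "outcome": outcome,
            "prev_hash": self._entries[-1]["hash"] if self._entries else GENESIS_HASH,
        }
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Return False if any entry was altered or the chain was reordered."""
        prev = GENESIS_HASH
        for entry in self._entries:
            if entry["prev_hash"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True
```

Running `verify()` periodically, or before an audit export, detects edits to earlier entries because every later hash depends on them.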

Testing Checklist

  • Unit tests for prompts and tool adapters.
  • Integration tests for end-to-end tool calls with mocks.
  • End-to-end tests for common governance workflows.
  • Security tests for secret handling and access controls.
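As an illustration of the first two checklist items, a tool adapter can be replaced with a mock so tests never reach a real service. The wrapper `call_tool` below is a hypothetical function under test; only `unittest` and `unittest.mock` from the Python standard library are used.

```python
import unittest
from unittest.mock import Mock

REQUIRED_FIELDS = {"status", "data", "errors"}


def call_tool(adapter, inputs):
    """Hypothetical wrapper under test: rejects responses missing contract fields."""
    response = adapter(inputs)
    missing = REQUIRED_FIELDS - set(response)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return response


class CallToolTests(unittest.TestCase):
    def test_valid_response_passes_through(self):
        adapter = Mock(return_value={"status": "ok", "data": {}, "errors": []})
        result = call_tool(adapter, {"q": "x"})
        self.assertEqual(result["status"], "ok")
        # The mock records how it was called, so inputs can be asserted too.
        adapter.assert_called_once_with({"q": "x"})

    def test_missing_field_is_rejected(self):
        adapter = Mock(return_value={"status": "ok"})
        with self.assertRaises(ValueError):
            call_tool(adapter, {})
```

Saved in a test module, these cases can be run with `python -m unittest`.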

Common Mistakes to Avoid

  • Skipping governance checks or bypassing the orchestrator.
  • Exposing secrets in prompts or outputs.
  • Letting memory grow without bound, or letting context drift because the memory store is not kept up to date.
  • Inconsistent data contracts or tool schemas across integrations.

FAQ

What is the purpose of this AGENTS.md Template for tool calling governance?

To define operating context, roles, handoffs, and governance rules for AI coding agents that call tools under governance constraints.

Who should use this AGENTS.md Template?

Developers, founders, and engineering leaders implementing tool governance patterns and multi-agent orchestration.

Can this template be used for both single-agent and multi-agent workflows?

Yes; it provides core operating principles and handoff rules adaptable to either single-agent or multi-agent orchestration.

What are the key security considerations in this template?

Least privilege tool access, secret management, audit trails, and human-in-the-loop reviews for critical actions.

How do we handle tool latency or failure within the template's procedures?

Define failure handling and rollback rules, with graceful degradation and escalation to human review when needed.
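A minimal sketch of such a rule follows: retry a failing step a bounded number of times, then return the last known-good state and notify a human. It assumes "last known-good" means the state before the workflow started and that steps are plain functions from state to state; `run_with_rollback` and `notify` are hypothetical names.

```python
import copy


def run_with_rollback(state, steps, notify, max_attempts=2):
    """Apply steps in order; on repeated failure, return the untouched initial state.

    Returns a (state, succeeded) pair. The caller's state object is never mutated.
    """
    known_good = copy.deepcopy(state)
    working = copy.deepcopy(state)
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            try:
                # Give the step its own copy so a partial mutation before a
                # failure cannot leak into the retry or the final result.
                working = step(copy.deepcopy(working))
                break
            except Exception as exc:
                if attempt == max_attempts:
                    notify(
                        f"step {step.__name__} failed after {attempt} attempts: {exc}"
                    )
                    return known_good, False
    return working, True
```

Passing `notify` as a callable lets the same routine page a human reviewer in production and append to a list in tests. Latency limits could be handled the same way by having each step raise on timeout.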