Safe Tool Calling Architecture AGENTS.md Template

Overview

This AGENTS.md template defines a safe tool calling architecture for AI coding agents. It governs both single-agent tool calls and multi-agent orchestration, and serves as a reproducible operating manual for planning, execution, handoffs, memory, and governance. In short: use this template to enforce tool access controls, escalation paths, and auditable decisions across all agents.

When to Use This AGENTS.md Template

When starting a new safe tool calling project or prototype.
When implementing multi-agent orchestration with a clear planner, implementer, reviewer, and tester.
When you need explicit handoff rules, memory sources, and a single source of truth for outputs.
When enforcing tool governance, secrets management, and security constraints.
When you require a reusable operating manual that can be copied into a repository as AGENTS.md.

Copyable AGENTS.md Template

# AGENTS.md

Project role: Safe Tool Calling Architect for AI coding agents and multi-agent orchestration.

Agent roster and responsibilities:
- Planner: Develop a high-level execution plan, identify required tools, define success criteria, and produce a plan document.
- Implementer: Translate the plan into concrete steps, perform tool calls, generate artifacts, and enforce tool restrictions.
- Reviewer: Validate outputs for correctness, security, and alignment with requirements.
- Tester: Execute end-to-end checks, regression tests, and scenario validation.
- Researcher: Gather domain data and verify sources; provide context for tool usage.
- Domain Specialist: Apply domain constraints, formatting, and specialized checks.

Supervisor or orchestrator behavior:
- The Orchestrator coordinates planning, tool gatekeeping, memory management, and escalation to human review when needed.
- All agents must publish a plan, work in a memory-safe context, and respect the source-of-truth rules.

Handoff rules between agents:
- Planner -> Implementer: share Plan Document and required tools.
- Implementer -> Reviewer: share artifacts, results, and any deviations from the plan.
- Reviewer -> Implementer: return feedback and required changes.
- Implementer -> Tester: hand off validated outputs for end-to-end checks.
- Researcher/Domain Specialist can insert context at any stage via the Orchestrator, and must reference sources.

Context, memory, and source-of-truth rules:
- Memory stores outputs and sources of truth with versioned references.
- All claims must cite source data, code, or tool outputs.
- Do not rely on ephemeral memory for long-term decisions.

Tool access and permission rules:
- Tools must be accessed through approved adapters with least-privilege permissions.
- Secrets must be stored securely; do not embed tokens in code or memory.
- Access is granted by the Orchestrator based on the plan and risk assessment.

Architecture rules:
- Central orchestrator mediates all tool calls and data flows.
- Each agent runs in a sandboxed context with explicit boundaries.

File structure rules:
- Use a single AGENTS.md at the project root documenting the operating model.
- Store agent-specific artifacts under agents/ subdirectories.

Data, API, or integration rules when relevant:
- Define API contracts, data schemas, and response formats in a shared schema.
- Validate inputs and outputs against contracts before handing off.

Validation rules:
- All outputs must be traceable to a plan and set of inputs.
- Validate with unit and integration checks before production use.

Security rules:
- Do not expose secrets; rotate credentials; audit tool usage.
- Security reviews required for high-risk tool calls.

Testing rules:
- Include unit tests per agent, integration tests for handoffs, and end-to-end tests of the workflow.

Deployment rules:
- Gate deployments with approval checks; roll back on failure.

Human review and escalation rules:
- Escalate to human review when confidence is below threshold or security risk is detected.

Failure handling and rollback rules:
- Maintain the last known-good plan and outputs; on failure, revert any changes made since that state.

Things Agents must not do:
- Do not bypass the orchestrator or tool governance.
- Do not leak secrets or use tools outside approved adapters.
- Do not continue after a failure without a plan to recover.

Recommended Agent Operating Model

The planner defines the orchestration strategy; the implementer executes tool calls within the plan; the reviewer ensures correctness and safety; the tester validates end-to-end flows; researchers and domain specialists provide context and constraints. Escalation paths are driven by confidence thresholds and risk assessments. If a decision is outside policy, escalate to human review.
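The escalation rule described above can be sketched as a small decision function. This is a minimal illustration, not part of the template: the threshold value `0.8`, the risk labels, and the names `Decision` and `needs_human_review` are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical values: the template does not fix a threshold or a risk scale.
CONFIDENCE_THRESHOLD = 0.8
HIGH_RISK_LEVELS = {"high", "critical"}


@dataclass(frozen=True)
class Decision:
    action: str        # what the agent wants to do
    confidence: float  # agent's self-reported confidence in [0, 1]
    risk: str          # e.g. "low", "medium", "high", "critical"
    in_policy: bool    # whether the action is covered by policy


def needs_human_review(decision: Decision) -> bool:
    """Return True when the decision should be escalated to a human."""
    if not decision.in_policy:
        return True  # outside policy: always escalate
    if decision.risk in HIGH_RISK_LEVELS:
        return True  # security or production risk: always escalate
    return decision.confidence < CONFIDENCE_THRESHOLD
```

Checking policy and risk before confidence means a highly confident agent still cannot skip review for a high-risk action.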

Recommended Project Structure

project-root/
  agents/
    planner/
      src/
    implementer/
      src/
    reviewer/
      src/
    tester/
      src/
    researcher/
      src/
    domain-specialist/
      src/
  orchestrator/
    src/
  memory/
  tools/
  configs/
  data/
  integrations/
  tests/
  deployments/

Core Operating Principles

Single source of truth with explicit memory and source citations.
Deterministic handoffs with auditable artifacts.
Least-privilege tool access and strict validation.
Clear escalation to human review for uncertain or high-risk actions.
Accountability through versioned plans and outputs.

Agent Handoff and Collaboration Rules

Planner communicates plan and success criteria to Implementer; Implementer reports back with artifacts and deviations.
Implementer requires Reviewer approval before final validation unless the plan explicitly permits autonomous execution with risk mitigations.
Researcher and Domain Specialist can inject context by adding sources and constraints to the memory.
Tester enforces end-to-end checks and reports failures to the Orchestrator.
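One way to make these handoffs deterministic is to have the Orchestrator check each transition against an allowlist. The sketch below assumes lowercase role names and a hypothetical `hand_off` function; the `autonomous_ok` flag stands in for a plan that permits autonomous execution.

```python
# Allowed transitions, mirroring the handoff rules in the template.
ALLOWED_HANDOFFS = {
    ("planner", "implementer"),
    ("implementer", "reviewer"),
    ("reviewer", "implementer"),
    ("implementer", "tester"),
}


class HandoffError(Exception):
    """Raised when a handoff violates the rules."""


def hand_off(sender, receiver, artifacts, reviewer_approved=False, autonomous_ok=False):
    """Validate a handoff and return an auditable record of it."""
    if (sender, receiver) not in ALLOWED_HANDOFFS:
        raise HandoffError(f"handoff {sender} -> {receiver} is not allowed")
    if not artifacts:
        raise HandoffError("a handoff must carry at least one artifact")
    # Outputs reach the Tester only after review, unless the plan allows autonomy.
    if (sender, receiver) == ("implementer", "tester"):
        if not (reviewer_approved or autonomous_ok):
            raise HandoffError("reviewer approval is required before testing")
    return {"from": sender, "to": receiver, "artifacts": list(artifacts)}
```

For example, `hand_off("planner", "implementer", ["plan.md"])` returns a record that can be stored for audit, while `hand_off("planner", "tester", ["plan.md"])` raises `HandoffError`.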

Tool Governance and Permission Rules

All tool calls are mediated by the Orchestrator; direct tool calls are prohibited.
Secrets must be retrieved from a vault; never stored in memory or AGENTS.md.
Production tools require approvals and change management processes.
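A minimal sketch of Orchestrator-mediated tool access, assuming a hypothetical `ToolGate` class: tools are registered once, each agent receives explicit grants, and every attempt is written to an audit log. The log stores argument names only, so secret values passed to a tool never end up in it.

```python
from typing import Any, Callable


class ToolGate:
    """Hypothetical gate that mediates every tool call and audits it."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}
        self._grants: dict[str, set[str]] = {}
        self.audit_log: list[dict] = []

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        self._tools[name] = fn

    def grant(self, agent: str, tool: str) -> None:
        if tool not in self._tools:
            raise KeyError(f"unknown tool: {tool}")
        self._grants.setdefault(agent, set()).add(tool)

    def call(self, agent: str, tool: str, **kwargs: Any) -> Any:
        allowed = tool in self._grants.get(agent, set())
        # Record the attempt whether or not it is allowed; names only, no values.
        self.audit_log.append(
            {"agent": agent, "tool": tool, "args": sorted(kwargs), "allowed": allowed}
        )
        if not allowed:
            raise PermissionError(f"{agent} may not call {tool}")
        return self._tools[tool](**kwargs)
```

Agents start with no grants, so access is deny-by-default and each permission has to be added deliberately.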

Code Construction Rules

Code must be modular, well-typed, and comply with shared schemas.
All calls must be validated against contracts before execution.
Follow the agent-based plan unless deviations are approved by the Planner.
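Contract validation before execution can be as simple as checking call arguments against a declared schema. The contract below (a `path` string and a `max_bytes` integer for an imagined file-reading tool) and the `validate_call` helper are illustrative assumptions; a real project might use a schema library instead.

```python
# Hypothetical contract for a file-reading tool: field name -> expected type.
CONTRACT = {"path": str, "max_bytes": int}


def validate_call(args: dict, contract: dict) -> list:
    """Return a list of contract violations; an empty list means the call is valid."""
    errors = []
    for field, expected in contract.items():
        if field not in args:
            errors.append(f"missing field: {field}")
        elif not isinstance(args[field], expected):
            actual = type(args[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    for field in args:
        if field not in contract:
            errors.append(f"unexpected field: {field}")
    return errors
```

Returning every violation at once, rather than raising on the first, gives the Reviewer a complete picture of what has to change before the call can run.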

Security and Production Rules

Tool usage must comply with policy; secrets are rotated and access is audited.
Production workflows require guardrails and runtime monitoring.

Testing Checklist

Unit tests for each agent; integration tests for handoffs; end-to-end tests of the workflow.
Security and access tests; failure-rollback tests; performance checks.

Common Mistakes to Avoid

Skipping human review for high-risk tool calls.
Over-sharing context or leaking secrets in memory or logs.
Handoff ambiguity leading to drift in the plan.

FAQ

What is the purpose of this AGENTS.md Template?

This template provides a copyable operating manual for safe tool calling and multi-agent orchestration.

How are memory and sources tracked?

Memory stores versioned outputs with citations to data sources and tool results as the source of truth.
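As a rough illustration, a versioned memory with mandatory citations might look like the following. The `Memory` and `MemoryEntry` names are hypothetical, and the in-process dictionary stands in for whatever durable store a project actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryEntry:
    key: str
    version: int    # 1-based and increasing per key
    value: str
    sources: tuple  # citations: file paths, tool-call IDs, URLs, and so on


class Memory:
    """Append-only store: each write creates a new version and must cite sources."""

    def __init__(self) -> None:
        self._entries: dict[str, list[MemoryEntry]] = {}

    def write(self, key: str, value: str, sources) -> MemoryEntry:
        if not sources:
            raise ValueError("every entry must cite at least one source")
        history = self._entries.setdefault(key, [])
        entry = MemoryEntry(key, len(history) + 1, value, tuple(sources))
        history.append(entry)
        return entry

    def read(self, key: str, version=None) -> MemoryEntry:
        history = self._entries[key]
        if version is None:
            return history[-1]  # latest version
        if not 1 <= version <= len(history):
            raise KeyError(f"{key} has no version {version}")
        return history[version - 1]
```

Earlier versions are never overwritten, so any output can be traced back to the exact entry and sources it relied on.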

How are agent handoffs enforced?

Handoffs follow defined plan artifacts and gated transitions coordinated by the Orchestrator.

What happens on a tool call failure?

A failure triggers a rollback to the last known-good plan, followed by escalation to human review and re-planning.
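A minimal sketch of that failure path, assuming the plan is a plain dictionary and that `PlanState`, `run_step`, and the `escalate` callback are hypothetical names: a failing step restores the last known-good snapshot and hands the error to the escalation hook.

```python
import copy


class PlanState:
    """Tracks the working plan alongside a last known-good snapshot."""

    def __init__(self, plan: dict) -> None:
        self.current = copy.deepcopy(plan)
        self.last_known_good = copy.deepcopy(plan)

    def mark_good(self) -> None:
        self.last_known_good = copy.deepcopy(self.current)

    def rollback(self) -> None:
        self.current = copy.deepcopy(self.last_known_good)


def run_step(state: PlanState, step, escalate) -> bool:
    """Run one step; on failure, roll back and escalate. Returns True on success."""
    try:
        step(state.current)
    except Exception as exc:  # broad on purpose: any tool failure triggers rollback
        state.rollback()
        escalate(exc)  # e.g. queue the failure for human review and re-planning
        return False
    state.mark_good()
    return True
```

Deep copies keep a partially applied step from corrupting the snapshot, at the cost of copying the plan on every successful step.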

How do I test this workflow?

Run unit, integration, and end-to-end tests across planners, implementers, reviewers, and testers.

Target User

Use Cases