AGENTS.md Template for Batch Inference System Design

Overview

This AGENTS.md template for batch inference system design is an operating manual for single-agent and multi-agent orchestration of batched predictions. It defines roles, memory and context rules, handoffs, tool governance, and auditability for batch workflows that span data ingestion, preprocessing, model inference, and result validation.

Direct answer: This template is designed to make batch inference repeatable and auditable, with clear handoffs between planners, processors, evaluators, and domain specialists, so that multi-agent orchestration can scale and human review happens when needed.

When to Use This AGENTS.md Template

Designing batch inference pipelines that require orchestration across multiple models, data sources, or environments.
Establishing governance for high-throughput predictions with reproducibility and audit trails.
Onboarding teams to a single source of truth for batch tasks, including handoffs and validation steps.

Copyable AGENTS.md Template

Below is a copyable AGENTS.md template block you can paste into your project to establish the operating context for batch inference AI coding agents.

# AGENTS.md

Project: Batch Inference System
Version: 1.0
Date: 2026-05-21

Overview:
This AGENTS.md defines an operating model for batch inference that uses a planner, orchestrator, and domain agents to process batched predictions with strict governance and auditability.

Agent roster and responsibilities:
- Planner: defines the batch plan, datasets, schema, and acceptance criteria.
- DataPreparer: validates input data, handles missing values, and enforces data quality rules.
- InferenceWorker: executes model inferences for batched inputs.
- Aggregator: collects, consolidates, and sorts batch results.
- Evaluator: computes metrics and validates results against acceptance criteria.
- Reviewer: performs human checks for edge cases or model drift.
- Auditor: records decisions, sources of truth, and audit trails.

Supervisor or orchestrator behavior:
- The BatchOrchestrator coordinates task assignments, retries, and escalation.
- Maintains a single source of truth in a central data store and a live activity log.
- Enforces memory/context sharing rules and triggers handoffs when criteria are met.

Handoff rules between agents:
- Planner -> DataPreparer: when batch plan is approved and data sources are ready.
- DataPreparer -> InferenceWorker: when data quality checks pass.
- InferenceWorker -> Aggregator: when batched results are produced.
- Aggregator -> Evaluator: when results are ready for validation.
- Evaluator -> Reviewer: when results need human review.
- Reviewer -> Auditor: after review, for logging and archival.

Context, memory, and source-of-truth rules:
- All decisions are recorded in the central log and stored in the batch store.
- The authoritative context includes batch_id, dataset_version, model_versions, and acceptance criteria.

Tool access and permission rules:
- Agents may call data APIs, run in-repo code, and access model endpoints only within scoped permissions.
- Secrets and credentials are fetched from a secure vault; no plaintext secrets in logs.

Architecture rules:
- Microservice-like components with a centralized orchestrator, stateless workers, and a shared batch store.
- Idempotent operations; all actions are replayable from the batch log.

File structure rules:
- Maintain a single AGENTS.md at the project root for batch-inference workflows.
- Use clear folder names: orchestrator/, planners/, data/, workers/, evaluators/, auditors/, configs/, tests/, docs/

Data, API, or integration rules:
- Input must conform to the BatchRequest schema; outputs must include batch_id and status.

Validation rules:
- All inferences must pass schema validations and quality metrics before handoffs.

Security rules:
- All endpoints require authentication; rotate credentials; no secrets in source code.

Testing rules:
- Include unit tests for each agent role; integration tests for end-to-end batch flow.

Deployment rules:
- Deploy the orchestrator and workers together; gate new models behind feature flags.

Human review and escalation rules:
- Escalate drift or anomalous results to Reviewer; require Auditor intervention for sensitive batches.

Failure handling and rollback rules:
- On failure, retry with backoff; if still failing, abort batch and notify stakeholders.

Things Agents must not do:
- Do not bypass validation or approvals, and do not write PII to logs.
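
Outside the template text itself, the data and context rules above can be illustrated with a small sketch. The template names a BatchRequest schema but does not define its fields, so the field list below is an assumption built from the authoritative context items (batch_id, dataset_version, model_versions, acceptance criteria); `records`, `validate_request`, and `make_response` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class BatchRequest:
    """Assumed shape of a batch request; the template does not define the fields."""
    batch_id: str
    dataset_version: str
    model_versions: Tuple[str, ...]
    acceptance_criteria: Dict[str, float]
    records: Tuple[dict, ...] = ()  # hypothetical payload field


def validate_request(req: BatchRequest) -> List[str]:
    """Return a list of problems; an empty list means the request is valid."""
    problems = []
    if not req.batch_id:
        problems.append("batch_id is empty")
    if not req.dataset_version:
        problems.append("dataset_version is empty")
    if not req.model_versions:
        problems.append("model_versions is empty")
    if not req.acceptance_criteria:
        problems.append("acceptance_criteria is empty")
    return problems


def make_response(req: BatchRequest, status: str) -> dict:
    """Outputs must include batch_id and status, per the template's data rule."""
    return {"batch_id": req.batch_id, "status": status}
```

A DataPreparer could call `validate_request` before any data quality checks and refuse the handoff if the list is non-empty.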

Recommended Agent Operating Model

In this batch inference workflow, the Planner, DataPreparer, InferenceWorker, Aggregator, Evaluator, Reviewer, and Auditor operate with clear decision boundaries. The Orchestrator acts as the supervisor, handling task assignment, retries, and escalation without performing model inferences directly. Handoffs require explicit success signals and share a common memory of the batch context.
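
One way to picture this supervisor role is a loop that calls each agent in the handoff order and stops at the first stage that does not return an explicit success signal. This is only a sketch: `run_batch` and the `handlers` mapping are hypothetical, and skipping the Reviewer when `needs_review` is false is an assumption, since the template does not say what happens when no human review is needed.

```python
from typing import Callable, Dict, List, Tuple

# Stage order taken from the template's handoff rules.
STAGES = [
    "Planner", "DataPreparer", "InferenceWorker",
    "Aggregator", "Evaluator", "Reviewer", "Auditor",
]


def run_batch(
    context: dict,
    handlers: Dict[str, Callable[[dict], bool]],
) -> Tuple[str, List[str]]:
    """Run stages in order; the orchestrator only routes, it never infers."""
    completed: List[str] = []
    for stage in STAGES:
        # Assumption: the Reviewer is skipped unless the context asks for review.
        if stage == "Reviewer" and not context.get("needs_review", False):
            continue
        # Only an explicit True counts as a success signal.
        if handlers[stage](context) is not True:
            return f"escalated:{stage}", completed
        completed.append(stage)
    return "done", completed
```

Every handler receives the same `context` dictionary, which stands in for the shared memory of the batch context.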

Recommended Project Structure

batch-inference/
├── orchestrator/
├── planners/
├── data/
├── workers/
│   ├── preprocessor/
│   ├── inferencer/
│   └── postprocessor/
├── evaluators/
├── auditors/
├── configs/
├── tests/
└── docs/

Core Operating Principles

Single source of truth for batch context and decisions.
Idempotent, replayable actions with full traceability.
Explicit handoffs with required signals and acceptance criteria.
Strict access control and minimal privileges per agent.
Continuous validation and auditability at every stage.
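
The idempotency and replay principle can be sketched with an append-only log keyed by batch and step. `BatchLog`, `run_once`, and `replay` are hypothetical names, and the in-memory list stands in for whatever durable batch store a real project would use.

```python
from typing import Any, Callable, Dict, List, Tuple

Key = Tuple[str, str]  # (batch_id, step)


class BatchLog:
    """In-memory stand-in for a durable, append-only batch log."""

    def __init__(self) -> None:
        self.entries: List[Tuple[Key, Any]] = []
        self._results: Dict[Key, Any] = {}

    def run_once(self, batch_id: str, step: str, action: Callable[[], Any]) -> Any:
        """Run the action at most once per (batch_id, step)."""
        key = (batch_id, step)
        if key in self._results:
            # A retried or duplicated call returns the recorded result.
            return self._results[key]
        result = action()
        self._results[key] = result
        self.entries.append((key, result))
        return result

    def replay(self) -> Dict[Key, Any]:
        """Rebuild the result state from the log without re-running any action."""
        return {key: result for key, result in self.entries}
```

Because a repeated call with the same key returns the stored result, a retry after a crash does not produce a second set of predictions for the same step.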

Agent Handoff and Collaboration Rules

Planner and DataPreparer coordinate to ensure data quality before inference.
InferenceWorker must report results to Aggregator with a unique batch_id.
Evaluator may pause flow for human review; Reviewer completes checks and passes to Auditor.
Domain specialists may override thresholds only with explicit approval from Orchestrator.
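
As a rough illustration of how an Evaluator might decide whether to pause for human review, the sketch below compares computed metrics with the acceptance criteria. It assumes every criterion is a minimum threshold and that a passing batch goes straight to the Auditor; neither is stated in the template, and `evaluate` is a hypothetical helper.

```python
from typing import Dict


def evaluate(metrics: Dict[str, float], acceptance_criteria: Dict[str, float]) -> dict:
    """Compare metrics to minimum thresholds and pick the next agent."""
    failed = sorted(
        name
        for name, minimum in acceptance_criteria.items()
        # A missing metric counts as a failure.
        if metrics.get(name, float("-inf")) < minimum
    )
    return {
        "passed": not failed,
        "failed_metrics": failed,
        # Assumption: failures route to the Reviewer, passes to the Auditor.
        "next_agent": "Reviewer" if failed else "Auditor",
    }
```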

Tool Governance and Permission Rules

Each agent may call only the APIs allowed for its role; secrets never appear in logs.
End-to-end encryption for batch data in transit and at rest.
Approval gates required for production-model launches or schema changes.
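
A minimal sketch of these two rules, per-agent API scoping and secret-free logs, might look like the following. The tool names in `ALLOWED_TOOLS` and the `key=value` pattern used for redaction are illustrative assumptions, not part of the template.

```python
import re

# Hypothetical allowlist: which tools each agent role may call.
ALLOWED_TOOLS = {
    "DataPreparer": {"data_api"},
    "InferenceWorker": {"model_endpoint"},
    "Evaluator": {"data_api", "metrics_api"},
}


def authorize(agent: str, tool: str) -> None:
    """Raise PermissionError unless the tool is on the agent's allowlist."""
    if tool not in ALLOWED_TOOLS.get(agent, set()):
        raise PermissionError(f"{agent} may not call {tool}")


# Assumed secret format: key=value pairs with these key names.
SECRET_PATTERN = re.compile(r"(?i)\b(api_key|token|password)=\S+")


def redact(message: str) -> str:
    """Replace secret values with a placeholder before the message is logged."""
    return SECRET_PATTERN.sub(lambda m: m.group(1) + "=[REDACTED]", message)
```

Pattern-based redaction is only a backstop; the stronger control is never passing vault secrets into log messages in the first place.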

Code Construction Rules

Use modular, testable components; ensure compatibility across models and data formats.
All code must be documented, and edge cases must be covered by tests.

Security and Production Rules

Implement role-based access control; rotate secrets; monitor for anomalous activity.
Use feature flags to deploy new batch inference tasks safely.

Testing Checklist

Unit tests for each agent role; integration tests for the full batch flow.
Load tests simulating peak batch sizes; verify backoff and retry behavior.
End-to-end tests including failure scenarios and rollbacks.

Common Mistakes to Avoid

Skipping validation or rushing risky model upgrades into production.
Leaving handoffs unclear, omitting a source of truth, or keeping no audit trail.
Ignoring drift in batch data schemas or model performance.

FAQ

What is the purpose of this AGENTS.md Template for batch inference system design?

This AGENTS.md Template defines an operating context, roles, rules, and handoffs for batch inference using AI coding agents and multi-agent orchestration.

How do I use the template for multi-agent orchestration?

Follow the roles and handoff rules, populate the AGENTS.md with project-specific data, and maintain a single source of truth in the batch store.

What are the key tool governance rules in this template?

Access is scoped per agent, secrets live in a vault, and logs do not contain plaintext credentials; all API calls are auditable.

How should agent handoffs be structured in batch inference?

Handoffs require a clear signal (success/failure), a data validation status, and a timestamp; the responsible owner must acknowledge the handoff before the next agent proceeds.
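
A handoff with these fields could be represented roughly as follows. The `Handoff` class and its attribute names are hypothetical; only the signal, validation status, timestamp, and acknowledgement come from the answer above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Handoff:
    batch_id: str
    sender: str
    receiver: str
    signal: str                 # "success" or "failure"
    validation_passed: bool     # data validation status
    timestamp: datetime
    acknowledged_by: Optional[str] = None

    def acknowledge(self, owner: str) -> None:
        """Record which responsible owner accepted the handoff."""
        self.acknowledged_by = owner

    def may_proceed(self) -> bool:
        """The next agent proceeds only on success, valid data, and an acknowledgement."""
        return (
            self.signal == "success"
            and self.validation_passed
            and self.acknowledged_by is not None
        )


def new_handoff(batch_id: str, sender: str, receiver: str,
                signal: str, validation_passed: bool) -> Handoff:
    """Create a handoff stamped with the current UTC time."""
    return Handoff(batch_id, sender, receiver, signal,
                   validation_passed, datetime.now(timezone.utc))
```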

What should happen on a batch failure?

Retry with exponential backoff; if the failure persists, abort the batch and trigger human review and escalation.
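
The retry-then-abort behavior can be sketched as below. The attempt count, base delay, and the `notify` callback are illustrative assumptions; a real system would route the notification to whatever escalation channel the project uses.

```python
import time
from typing import Any, Callable


class BatchAborted(Exception):
    """Raised when a batch step still fails after all retries."""


def run_with_backoff(
    action: Callable[[], Any],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    notify: Callable[[str], None] = print,
) -> Any:
    """Retry with delays of base_delay * 2**attempt, then abort and notify."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Exception = RuntimeError("no attempt made")
    for attempt in range(max_attempts):
        try:
            return action()
        except Exception as exc:
            last_error = exc
            # Wait before the next attempt, but not after the final one.
            if attempt < max_attempts - 1:
                sleep(base_delay * 2 ** attempt)
    notify(f"batch aborted after {max_attempts} attempts: {last_error}")
    raise BatchAborted(str(last_error)) from last_error
```

Passing `sleep` and `notify` as parameters keeps the function easy to test without real waiting.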

Target User

Use Cases