AGENTS.md Template: Recommendation System Architecture

Overview

This AGENTS.md template is an operating manual for AI coding agents working on a recommendation system. It covers both single-agent and multi-agent orchestration across data ingestion, feature extraction, candidate generation, ranking, evaluation, serving, and monitoring.

It defines roles, memory, source of truth, tool access, governance, and escalation paths to keep work auditable in production.

When to Use This AGENTS.md Template

To give a multi-agent recommender system workflow a shared operating context.
To formalize agent roles (planner, implementer, evaluator, etc.) and handoff rules between steps.
To document how data, features, models, and results are sourced, stored, and versioned.
To enforce governance around tool usage, secrets, and production changes.
To enable reproducibility, auditing, and human review in critical recommendations.

Copyable AGENTS.md Template

Paste this block into your project as AGENTS.md to establish a single source of truth for the recommender workflow.

# AGENTS.md

Project Role
- Owner: Product Lead
- Sponsor: Platform Team
- Target System: Real-time and batch recommendation pipeline

Agent roster and responsibilities
- DataIngestAgent: collects events and feeds raw data to the system
- FeatureEngineerAgent: computes features for ranking
- CandidateGenAgent: proposes candidate items
- RankingAgent: scores candidates using the ranking model
- EvaluationAgent: validates performance and bias checks
- OrchestratorAgent: coordinates all agents, handles handoffs
- MonitorAgent: tracks quality, latency, and outages

Supervisor or orchestrator behavior
- OrchestratorAgent issues tasks, assigns memory checkpoints, and triggers evaluations
- All decisions are versioned and time-stamped

Handoff rules between agents
- DataIngestAgent -> FeatureEngineerAgent at feature-ready events
- FeatureEngineerAgent -> CandidateGenAgent when features are computed
- CandidateGenAgent -> RankingAgent when candidates exist
- RankingAgent -> EvaluationAgent for validation before serving
- If a handoff fails, OrchestratorAgent replays a step with a fresh memory snapshot

Context, memory, and source-of-truth rules
- Maintain a single source of truth for data, features, candidates, and scores
- Memory persists between steps and is cleared only on an explicit reset
- All outputs must reference the source data with provenance metadata

Tool access and permission rules
- Agents may call data APIs, feature stores, model endpoints, and monitoring tools
- Secrets are injected only via the orchestrator and never hard-coded
- No direct production write access by data scientists

Architecture rules
- Microservice-like components with clear interfaces
- Idempotent operations and deterministic outputs where possible
- All models and feature definitions are versioned

File structure rules
- Maintain a single AGENTS.md at repository root
- Use consistent, role-based names for agents and pipelines (for example, RankingAgent)
- Documentation and tests live under docs/ and tests/

Data, API, or integration rules when relevant
- All data contracts must be explicit
- Use batched data for training; streaming for serving
- Ensure input validation and schema checks at boundaries

Validation rules
- Validate data schema and feature shapes
- Regression tests for ranking logic
- A/B test guardrails and rollback procedures

Security rules
- Secrets stored in a vault; rotate on schedule
- Access controls per agent role
- Audit logs for all tool calls

Testing rules
- Unit tests for each agent
- Integration tests for end-to-end flow
- Shadow deployments for new ranking models

Deployment rules
- Canary deploys for ranking service
- Rollback plan and metrics-based gating
- Production dashboards for latency and accuracy

Human review and escalation rules
- Trigger human review for model drift or ethical concerns
- Escalation path to product owner and governance board

Failure handling and rollback rules
- If a step fails, revert to last known-good state
- Notify stakeholders and record incident

Things agents must not do
- Do not access raw user data beyond approved scope
- Do not modify production configuration without approval
- Do not bypass orchestrator supervision

Recommended Agent Operating Model

The operating model defines who does what, how decisions are made, and how escalation works in a recommender workflow.

Planner agents propose tasks and sequencing across ingest, feature, candidate generation, and ranking
Implementer agents execute tasks with access to necessary tools and APIs
Reviewer agents monitor outputs for quality and bias; request adjustments when needed
Tester agents validate end-to-end correctness and performance
Researcher agents explore alternative signals or features under supervision
Domain specialists review context-sensitive results (e.g., legal or compliance constraints)
Escalation path to a human reviewer if drift, failure, or policy violations occur

Recommended Project Structure

projects/
  recommender-system/
    data/
    features/
    models/
    pipelines/
    agents/
      ingest/
      feature/
      candidate/
      rank/
      eval/
      orchestrator/
      monitor/
    tests/
    docs/
    config/
    secrets/

Core Operating Principles

Clear ownership and accountability for each agent
Deterministic and auditable decisions
Data privacy, security, and access controls baked in
Reproducibility of experiments and results
Transparent handoffs with explicit memory and provenance

Agent Handoff and Collaboration Rules

Define how planner, implementer, reviewer, tester, researcher, and domain specialist agents coordinate.

Planner publishes a task bundle with success criteria and memory snapshot
Implementer completes tasks and passes results with new memory state
Reviewer validates results and flags issues with explicit remediation steps
Tester runs end-to-end tests and signs off before deployment
Researcher suggests alternative signals; domain specialist approves changes
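
The handoff rules above can be sketched in code. The following is a minimal illustration, not a prescribed implementation: the `Handoff` record, its field names, and `validate_handoff` are hypothetical, and the allowed transitions are copied from the pipeline handoffs listed in the template. It assumes each handoff must carry a memory snapshot identifier and at least one reference to its source data.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed sender -> receiver transitions, taken from the template's handoff rules.
ALLOWED_HANDOFFS = {
    "DataIngestAgent": "FeatureEngineerAgent",
    "FeatureEngineerAgent": "CandidateGenAgent",
    "CandidateGenAgent": "RankingAgent",
    "RankingAgent": "EvaluationAgent",
}


@dataclass(frozen=True)
class Handoff:
    """Hypothetical handoff record; field names are illustrative assumptions."""
    sender: str
    receiver: str
    memory_snapshot_id: str  # identifier of the memory state passed along
    source_refs: tuple       # provenance: references to the source data
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def validate_handoff(handoff: Handoff) -> None:
    """Raise ValueError if the handoff skips a step or lacks memory/provenance."""
    expected = ALLOWED_HANDOFFS.get(handoff.sender)
    if expected != handoff.receiver:
        raise ValueError(
            f"{handoff.sender} may not hand off to {handoff.receiver}"
        )
    if not handoff.memory_snapshot_id:
        raise ValueError("handoff is missing a memory snapshot id")
    if not handoff.source_refs:
        raise ValueError("handoff is missing provenance references")
```

An orchestrator could call `validate_handoff` before dispatching the next step and replay the previous step when it raises.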

Tool Governance and Permission Rules

Commands and API calls require approval from the orchestrator
Agents may access only approved secrets and vaults
All tool usage is auditable with time-stamped logs
Production system edits require rollback planning and sign-off
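
As a rough sketch of how these governance rules might be enforced, the snippet below checks each tool call against a per-role allowlist and writes a time-stamped audit entry whether or not the call is approved. The allowlist contents, the tool names such as `feature_store.read`, and the `authorize_tool_call` function are assumptions made for illustration; a real deployment would write to a durable audit log rather than an in-memory list.

```python
import json
from datetime import datetime, timezone

# Hypothetical per-role allowlist; tool names are illustrative only.
TOOL_ALLOWLIST = {
    "FeatureEngineerAgent": {"feature_store.read", "feature_store.write"},
    "RankingAgent": {"feature_store.read", "model_endpoint.invoke"},
    "MonitorAgent": {"monitoring.query"},
}

# In-memory stand-in for a durable, append-only audit log.
AUDIT_LOG = []


def authorize_tool_call(agent: str, tool: str) -> bool:
    """Return True if the agent's role permits the tool; always log the attempt."""
    allowed = tool in TOOL_ALLOWLIST.get(agent, set())
    AUDIT_LOG.append(json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "tool": tool,
        "allowed": allowed,
    }))
    return allowed
```

Unknown agents fall through to an empty permission set, so the default is to deny.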

Code Construction Rules

Ensure idempotence for all operations
Version everything: data contracts, models, pipelines
Validate inputs and outputs at every boundary
Use dependency pinning and deterministic environments
Prefer modular, testable components over monoliths

Security and Production Rules

Encrypt data in transit and at rest; rotate keys
Enforce least privilege and role-based access
Audit all tool calls and changes to production
Implement fail-safe defaults and safe rollback paths

Testing Checklist

Unit tests for each agent
Integration tests for end-to-end workflow
Load and latency testing for ranking service
Security and access control tests
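
To make the unit-test item concrete, here is a small regression test written in plain `assert` style (runnable with pytest or by calling the functions directly). The `rank` function is a toy stand-in invented for this example, not the ranking model itself; it sorts by score and breaks ties by item id so that the output is deterministic.

```python
def rank(candidates, scores):
    """Toy ranker: highest score first, ties broken by item id for determinism."""
    return sorted(candidates, key=lambda item: (-scores[item], item))


def test_golden_order():
    # A fixed input with a known-good ordering guards against regressions.
    scores = {"a": 0.9, "b": 0.9, "c": 0.1}
    assert rank(["b", "a", "c"], scores) == ["a", "b", "c"]


def test_order_independent_of_input_order():
    # The same candidates in a different order must rank identically.
    scores = {"x": 0.2, "y": 0.7, "z": 0.5}
    assert rank(["x", "y", "z"], scores) == rank(["z", "x", "y"], scores)
```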

Common Mistakes to Avoid

Assuming a single agent suffices for complex workflows
Lack of memory management and provenance tracking
Improper handoffs leading to duplicated work or drift
Insufficient testing before production
Over-permissive tool access or secrets exposure

FAQ

What is the purpose of this AGENTS.md Template for a recommendation system?

This AGENTS.md Template provides a formal operating manual for AI coding agents in a recommendation system architecture, enabling multi-agent orchestration with clear rules, responsibilities, and governance.

How are agent handoffs defined in this workflow?

Handoffs occur at explicit boundaries (ingest to feature, feature to candidate, candidate to rank, rank to evaluation) with memory snapshots and provenance attached to each transfer; the orchestrator enforces transitions and rollback paths if needed.

What rules govern tool usage and secrets?

Tools and secrets are accessed through the orchestrator, with least-privilege permissions, vault-backed secrets, and auditable logs for every operation.

How are memory and the source of truth managed?

All outputs reference the source data, and a central memory store persists across steps; data contracts and provenance metadata ensure traceability.
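
A minimal sketch of such a memory store follows, assuming an append-only list of snapshots where each commit must name its source data. The `MemoryStore` class, its method names, and the `source` provenance key are hypothetical; reading an earlier version is how this sketch models a return to a last known-good state.

```python
import copy


class MemoryStore:
    """Hypothetical append-only store of (state, provenance) snapshots."""

    def __init__(self):
        self._snapshots = []

    def commit(self, state: dict, provenance: dict) -> int:
        """Store a snapshot and return its version number."""
        if not provenance.get("source"):
            raise ValueError("provenance must name the source data")
        # Deep-copy so later mutation by the caller cannot alter history.
        self._snapshots.append((copy.deepcopy(state), dict(provenance)))
        return len(self._snapshots) - 1

    def get(self, version: int) -> dict:
        """Return a copy of the state stored at the given version."""
        state, _ = self._snapshots[version]
        return copy.deepcopy(state)

    def provenance(self, version: int) -> dict:
        """Return the provenance metadata recorded with the given version."""
        _, prov = self._snapshots[version]
        return dict(prov)
```

Because snapshots are never overwritten, every output can be traced back to the version and source it was derived from.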

What are common pitfalls when implementing this pattern?

Common pitfalls include missing handoff definitions, drift in memory or state, over-reliance on a single agent, insufficient testing, and unsafe production changes.

Target User

Use Cases