AGENTS.md Template for Stream Processing Architecture
An AGENTS.md template for stream processing architecture that supports single-agent and multi-agent workflows with clear handoffs and governance.
Target User
Developers, architects, and engineering leaders designing real-time streaming pipelines.
Use Cases
- Real-time streaming data ingestion
- Event-driven enrichment and normalization
- Multi-agent orchestration across pipeline stages
- Auditable handoffs and rollback in streaming workflows
Markdown Template
# AGENTS.md
Project role
- Stream Processing Architect and Orchestrator for real-time pipelines
Agent roster and responsibilities
- Planner: designs pipeline stages, defines handoffs, sets windowing and checkpointing strategy
- Implementer: implements stages in chosen tech (e.g., Python, Java/Scala) and wires components to the stream processor
- Reviewer: validates correctness, performance, memory usage, and drift risks
- Tester: builds test data and runs unit/integration tests against streaming components
- Researcher: investigates data quality issues, anomalies, and improvement opportunities
- Domain Specialist: ensures domain-specific rules and SLAs are applied
Supervisor or orchestrator behavior
- The orchestrator sequences stages, triggers tasks, propagates context, and monitors progress across agents
- It logs handoffs and artifacts to a central store with versioning
Handoff rules between agents
- Ingest > Normalize > Enrich > Windowed-Aggregation > Sink
- Each handoff must include: event context shard, version, and artifacts folder
- Handoff artifacts are stored in the central metadata store with timestamp
Context, memory, and source-of-truth rules
- Use a single source-of-truth metadata store for event schemas and pipeline state
- Maintain per-event context with a bounded memory for short-lived state; persist long-lived state to a storage tier
- All decisions must reference the source-of-truth artifacts
Tool access and permission rules
- Access to Kafka topics, Flink/Spark clusters, and S3 is restricted by role
- Secrets are retrieved via a vault; do not hard-code credentials
- Production system changes require approval and testing
Architecture rules
- Aim for exactly-once processing where possible; keep all operations idempotent
- Apply backpressure protection; checkpoint every N records or at a fixed time interval
- Stateless design for planner and orchestrator components where feasible
File structure rules
- All components live under a single streams-architecture project tree
- Namespaces and folders reflect pipeline stages and agent roles
Data, API, or integration rules when relevant
- Define event schemas and data contracts; enforce schema validation at ingest and transform
- Use idempotent APIs and clear retry policies
Validation rules
- Validate against synthetic and real data with drift checks
- Ensure window results are deterministic for replay
Security rules
- Enforce least privilege, retrieve secrets from a vault, and keep audit trails for access and changes
Testing rules
- Include unit tests for each stage and integration tests for end-to-end flows
- Run tests on a staging streaming cluster before production
Deployment rules
- Rollouts must pass all tests and may use blue/green or canary deployment
- Roll back to the previous checkpoint if anomalies are detected
Human review and escalation rules
- Escalate critical failures to a human reviewer and the platform owner
- All escalation decisions are logged and auditable
Failure handling and rollback rules
- On failure, roll back to the last good checkpoint and replay from there
- Notify the orchestrator and domain specialist on anomalies
Things Agents must not do
- Do not bypass the orchestrator or access production resources directly
- Do not modify event schemas without approval
- Do not run unchecked retries or runaway processes
Overview
AGENTS.md Template for Stream Processing Architecture defines an operating manual for real-time data pipelines. It governs single-agent and multi-agent workflows across ingestion, normalization, enrichment, windowed processing, and sinking to storage or analytics. It establishes the operating context, memory rules, and orchestration boundaries so agents can operate predictably and with auditable handoffs.
In short, the template provides concrete roles, handoff rules, memory and source-of-truth strategies, and governance scaffolding for AI coding agents that build stream processing pipelines in a multi-agent orchestration setting.
When to Use This AGENTS.md Template
- When designing a real-time streaming pipeline using Kafka, Flink, or Spark Structured Streaming.
- When multiple agents coordinate steps such as ingest, normalize, enrich, window, and sink.
- When governance, reproducibility, and auditable handoffs are required.
- When you need explicit memory and source-of-truth rules to avoid context drift.
- When you must enforce tool governance and security during pipeline operations.
Recommended Agent Operating Model
The agent roles collaborate in a constrained, well-defined loop, and each role has a clear decision boundary. The Planner outlines the pipeline, the Implementer builds each stage, the Reviewer validates it, the Tester tests it, the Researcher investigates issues, and the Domain Specialist enforces domain-specific rules. The Supervisor (the orchestrator) checks compliance, memory usage, and handoff correctness. Escalation paths lead to human review when risk exceeds agreed thresholds.
Recommended Project Structure
streams-architecture/
  orchestrator/
  agents/
    planner/
    implementer/
    reviewer/
    tester/
    researcher/
    domain-specialist/
  pipelines/
    ingest/
    normalize/
    enrich/
    windowing/
    sink/
  configs/
  data/
  tests/
  docs/
Core Operating Principles
- Each agent has a clearly defined role and decision boundary
- Actions are auditable with explicit handoffs
- Memory is guarded; source-of-truth artifacts are required for decisions
- Tool access is restricted by role, and secrets are managed centrally
- Handoffs are explicit, versioned, and logged
- Multi-agent orchestration is supervised and recoverable
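The "memory is guarded" principle, together with the template rule about bounded short-lived state and a storage tier for long-lived state, could be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the `BoundedContextStore` class, its method names, and the plain dict standing in for the storage tier are all assumptions made for the example.

```python
from collections import OrderedDict
from typing import Any, Dict, Optional


class BoundedContextStore:
    """Keeps at most `capacity` per-event contexts in memory (LRU order).

    Evicted entries are written to `long_lived`, a plain dict standing in
    for a real storage tier (hypothetical; swap in your own store).
    """

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.short_lived: "OrderedDict[str, Dict[str, Any]]" = OrderedDict()
        self.long_lived: Dict[str, Dict[str, Any]] = {}

    def put(self, event_id: str, context: Dict[str, Any]) -> None:
        # Insert or refresh the entry and mark it most recently used.
        self.short_lived[event_id] = context
        self.short_lived.move_to_end(event_id)
        # Spill least recently used entries once the bound is exceeded.
        while len(self.short_lived) > self.capacity:
            old_id, old_context = self.short_lived.popitem(last=False)
            self.long_lived[old_id] = old_context

    def get(self, event_id: str) -> Optional[Dict[str, Any]]:
        if event_id in self.short_lived:
            self.short_lived.move_to_end(event_id)
            return self.short_lived[event_id]
        # Fall back to the storage tier for contexts that were spilled.
        return self.long_lived.get(event_id)
```

With `capacity=2`, putting contexts for three events leaves the two most recent in memory and moves the oldest to the stand-in storage tier, where `get` can still find it.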
Agent Handoff and Collaboration Rules
- Planner to Implementer: provide stage designs, data contracts, and required artifacts
- Implementer to Reviewer: present implementation details, tests, and performance metrics
- Reviewer to Tester: approve test plan and pass criteria
- Tester to Researcher: report failures so the Researcher can investigate root causes
- Domain Specialist to Orchestrator: approve domain-specific constraints and SLAs
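One way an orchestrator might check these handoffs before proceeding is sketched below. The record carries the fields the template asks for (event context shard, version, artifacts folder, timestamp); the `Handoff` class, the `ALLOWED_HANDOFFS` set, and `validate_handoff` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

# Agent pairs taken from the handoff rules above (lowercased for matching).
ALLOWED_HANDOFFS = {
    ("planner", "implementer"),
    ("implementer", "reviewer"),
    ("reviewer", "tester"),
    ("tester", "researcher"),
    ("domain-specialist", "orchestrator"),
}


@dataclass(frozen=True)
class Handoff:
    sender: str
    receiver: str
    context_shard: str
    version: str
    artifacts_folder: str
    # UTC timestamp recorded when the handoff object is created.
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def validate_handoff(handoff: Handoff) -> List[str]:
    """Return a list of problems; an empty list means the handoff is acceptable."""
    problems: List[str] = []
    for name in ("context_shard", "version", "artifacts_folder"):
        if not getattr(handoff, name).strip():
            problems.append(f"missing {name}")
    if (handoff.sender, handoff.receiver) not in ALLOWED_HANDOFFS:
        problems.append(
            f"handoff {handoff.sender} -> {handoff.receiver} is not allowed"
        )
    return problems
```

Returning a list of problems rather than raising on the first one lets the orchestrator log every defect in a single audit entry.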
Tool Governance and Permission Rules
- Access control to streaming clusters, topics, and storage must follow RBAC
- Secrets must be retrieved from a centralized vault; no hard-coded credentials
- All API calls require authenticated tokens with scoped permissions
- Production changes require validation and approval gates
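The RBAC and approval-gate rules above might look like the following deny-by-default check. Everything here is an assumption for illustration: the role names, the resource and action strings, and the convention that production resources carry a `prod-` prefix are not defined by the template.

```python
from typing import Dict, FrozenSet, Tuple

Permission = Tuple[str, str]  # (resource, action)

# Hypothetical role-to-permission map; names are illustrative only.
ROLE_PERMISSIONS: Dict[str, FrozenSet[Permission]] = {
    "implementer": frozenset({("staging-cluster", "deploy"), ("events-topic", "read")}),
    "tester": frozenset({("staging-cluster", "run-tests"), ("events-topic", "read")}),
    "reviewer": frozenset({("events-topic", "read")}),
    "orchestrator": frozenset({("prod-cluster", "deploy")}),
}


def is_allowed(role: str, resource: str, action: str, approved: bool = False) -> bool:
    """Deny by default; production resources additionally need an approval flag."""
    # Assumed convention: production resources are named with a "prod-" prefix.
    if resource.startswith("prod-") and not approved:
        return False
    return (resource, action) in ROLE_PERMISSIONS.get(role, frozenset())
```

Unknown roles resolve to an empty permission set, so a misspelled or unregistered role is rejected rather than silently granted access.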
Code Construction Rules
- Write idempotent transforms; avoid non-deterministic side effects
- Checkpoint frequently and document windowing logic
- Use schemas and validation at every transform
- Document data contracts and error handling in code comments
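As a rough sketch of these construction rules, the transform below validates each event against a data contract and skips events it has already processed, so a replayed event produces no second output. The schema, the field names, and the `IdempotentNormalizer` class are hypothetical; the in-memory set of seen IDs is a simplification that a real stream processor would replace with checkpointed state.

```python
from typing import Any, Dict, Optional, Set

# Hypothetical data contract: required field name -> expected Python type.
EVENT_SCHEMA = {"event_id": str, "user_id": str, "amount": float}


def validate(event: Dict[str, Any]) -> None:
    """Raise ValueError if the event violates the contract."""
    for name, expected in EVENT_SCHEMA.items():
        if name not in event:
            raise ValueError(f"missing field: {name}")
        if not isinstance(event[name], expected):
            raise ValueError(f"field {name} must be {expected.__name__}")


class IdempotentNormalizer:
    """Normalizes events and ignores duplicates by event_id."""

    def __init__(self) -> None:
        # Simplification: unbounded and in-memory; use checkpointed state in practice.
        self._seen: Set[str] = set()

    def process(self, event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        validate(event)
        if event["event_id"] in self._seen:
            # Duplicate delivery (for example after a replay): emit nothing.
            return None
        self._seen.add(event["event_id"])
        # Deterministic, side-effect-free normalization.
        return {
            "event_id": event["event_id"],
            "user_id": event["user_id"].strip().lower(),
            "amount": event["amount"],
        }
```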
Security and Production Rules
- Encrypt data in transit and at rest; audit data access
- Limit blast radius; isolate components in staging before production
- Monitor for anomalies; implement alerting on failures
Testing Checklist
- Unit tests for each stage
- Integration tests for end-to-end flow
- End-to-end tests with replay and checkpoint validation
- Performance tests under backpressure
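A replay check from this list could be approximated with a test like the one below. It assumes a simple event-time tumbling window over `(event_time_seconds, key, value)` tuples; the function names and the sample events are invented for the example and do not come from any particular stream processor.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Event = Tuple[int, str, int]  # (event_time_seconds, key, value)


def tumbling_window_sums(
    events: Iterable[Event], window_seconds: int
) -> Dict[Tuple[int, str], int]:
    """Sum values per (window_start, key), using event time only."""
    sums: Dict[Tuple[int, str], int] = defaultdict(int)
    for event_time, key, value in events:
        # Assign the event to the window that contains its event time.
        window_start = (event_time // window_seconds) * window_seconds
        sums[(window_start, key)] += value
    return dict(sums)


def test_replay_is_deterministic() -> None:
    events = [(1, "a", 5), (61, "a", 7), (30, "b", 2), (59, "a", 1)]
    first = tumbling_window_sums(events, window_seconds=60)
    # A replay may deliver the same events in a different order.
    replay = tumbling_window_sums(list(reversed(events)), window_seconds=60)
    assert first == replay
    assert first == {(0, "a"): 6, (60, "a"): 7, (0, "b"): 2}
```

Because windows are assigned from event time rather than arrival time, and integer addition does not depend on order, the replayed run yields the same result as the original.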
Common Mistakes to Avoid
- Skipping formal handoff artifacts
- Allowing context drift without a source of truth
- Bypassing the orchestrator or skipping tests
- Making production changes that are unnecessary, unsafe, or unapproved
FAQ
What is the purpose of this AGENTS.md Template for stream processing?
This template defines roles, handoffs, memory, governance, and escalation for real-time pipelines to enable predictable multi-agent orchestration.
Who are the core agent roles in this template?
Planner, Implementer, Reviewer, Tester, Researcher, and Domain Specialist, all supervised by the orchestrator.
How are agent handoffs enforced in the workflow?
Handoffs include context artifacts, versioned artifacts, and logs; the orchestrator validates completion before proceeding.
What security and production rules apply to this AGENTS.md?
RBAC, vault-based secrets, restricted production changes, and auditable change management.
How do you validate and test the pipeline before deployment?
Run unit tests for each stage and integration tests for the end-to-end flow, perform end-to-end tests with replay and checkpoint validation, and validate on a staging cluster before production.