Kafka Production Architecture AGENTS.md Template

AGENTS.md Template for Kafka production architecture: a concrete operating manual to govern single-agent and multi-agent workflows, with handoffs, tool governance, and security.

Target User

Developers, SREs, Platform teams, Engineering managers

Use Cases

  • Kafka production architecture governance
  • Single-agent and multi-agent orchestration for Kafka operations
  • Agent handoffs and tool governance in streaming pipelines
  • Security and compliance for Kafka workloads

Markdown Template

# AGENTS.md

Project Role
- Kafka Production Architecture Lead (Platform Engineering) coordinating multi-agent orchestration for Kafka workloads.

Agent roster and responsibilities
- Planner: defines sub-goals (topics, configs, ACLs) and sequence of actions.
- Architect/Implementer: translates plan into cluster/config changes, applies changes to Kafka cluster and ecosystem.
- Verifier/Tester: validates health, config correctness, and topic compliance after changes.
- Researcher: gathers best practices for Kafka efficiency, security, and reliability; proposes improvements.
- Domain Specialist: provides subject-matter guidance on data models, schemas, and topic governance.
- Reviewer: ensures changes meet policy, security, and reliability standards.
- Orchestrator (Supervisor): coordinates all agents, tracks context/memory, and enforces memory/ownership rules.

Supervisor or orchestrator behavior
- Maintains a single source of truth for goals, context, and results.
- Triggers sub-goals, enforces idempotence, and performs rollbacks when validation fails.
- Logs decisions, sources, and outcomes for auditability.

Handoff rules between agents
- Handoffs occur at explicit sub-goals with goal completion signals and clear ownership transfer.
- Each handoff includes: goal, latest context, required inputs, expected outputs, and acceptance criteria.
- No handoffs without a validated checkpoint; failure prompts escalation.

Context, memory, and source-of-truth rules
- Context is stored in a versioned memory store linked to the current production plan.
- Source-of-truth: Kafka cluster state, topic configs, ACLs, and registry entries.
- Agents must reference memory and never rely on stale shared state.

Tool access and permission rules
- Access to Kafka admin CLI and cluster APIs is granted per role with audit, secrets handling, and rotation.
- Secrets must be retrieved from a secure vault; do not hard-code credentials.
- Production deployments require approval gates and scheduled, monitored change windows.

Architecture rules
- Idempotent operations; avoid side effects beyond the intended scope.
- Use consistent naming conventions for topics, groups, and ACLs.
- Set durability configs explicitly: replication.factor, min.insync.replicas, and other topic-level configs (a common baseline is replication.factor=3 with min.insync.replicas=2 and producer acks=all).

File structure rules
- Changes live under infra/kafka/production and agents/ folders.
- Config changes reside in config/, scripts in scripts/, and tests in tests/.

Data, API, or integration rules when relevant
- All topic data flows, schemas, and registry references must be versioned.
- Interaction with monitoring/telemetry uses defined APIs and reads only the allowed endpoints.

Validation rules
- Validate topic existence, partition count, replication, ACLs, and TLS/SASL configuration after changes.
- Run end-to-end smoke tests to confirm producer/consumer paths.

Security rules
- Enforce TLS, SASL, and ACLs for cluster access.
- Rotate credentials, enforce principle of least privilege, and audit all admin actions.

Testing rules
- Unit tests for agent code; integration tests against a test Kafka cluster; end-to-end tests for governance flows.

Deployment rules
- CI/CD for agent templates; production changes go through approval gates and change windows.

Human review and escalation rules
- Any security or high-risk configuration changes require human review by the Security/Platform team.
- Escalate to SRE on production incidents or policy violations.

Failure handling and rollback rules
- If a change fails validation, roll back to the previous known-good state and re-run the plan.
- Maintain an audit trail of rollback actions and outcomes.

Things Agents must not do
- Do not mutate production topics/configs without approval.
- Do not bypass memory/source-of-truth constraints.
- Do not run long-lived tasks without monitoring and escalation triggers.

Overview

This AGENTS.md template defines an agent workflow for Kafka production architecture. It covers single-agent and multi-agent orchestration, including roles, handoffs, tool governance, and security boundaries.

The template is a copyable operating manual that codifies the agent roster, decision boundaries, and collaboration patterns needed to operate a Kafka production environment with AI coding agents. It supports both isolated agent execution and multi-agent orchestration, with explicit handoff, memory, and source-of-truth rules that prevent context drift and architecture drift.

When to Use This AGENTS.md Template

  • When you need a concrete, paste-ready operating manual for Kafka production workflows involving AI coding agents.
  • When multiple agents (planner, implementer, reviewer, tester, researcher, domain specialist) must collaborate on topic/config changes, deployment, and monitoring.
  • When you require strict tool governance, memory management, and auditable handoffs to minimize risk in production.
  • When you want a replicable baseline for incident response, change validation, and rollback procedures.

Recommended Agent Operating Model

The recommended operating model assigns clear decision boundaries and escalation paths for Kafka production workflows. Planner defines sub-goals; Implementer carries out changes; Verifier validates results; Reviewer approves outcomes; Researcher proposes optimizations; Domain Specialist provides subject-matter guidance; Orchestrator enforces governance and memory discipline. Escalations go to SRE/Platform when policy or security risks are detected.
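The flow above can be sketched as a small orchestrator loop. This is a minimal illustration, not a definitive implementation: the `Orchestrator` class, the flat `State` dictionary, and the callback signatures are all hypothetical stand-ins for real agents and real cluster state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

State = Dict[str, str]  # hypothetical: flat map of config key -> value


@dataclass
class Orchestrator:
    """Runs one sub-goal through implement -> verify -> review."""
    state: State
    audit_log: List[str] = field(default_factory=list)

    def run(self, sub_goal: str,
            implement: Callable[[State], State],
            verify: Callable[[State], bool],
            review: Callable[[State], bool]) -> bool:
        known_good = dict(self.state)            # snapshot for rollback
        self.state = implement(dict(self.state))  # Implementer applies the change
        if not verify(self.state):               # Verifier validates the result
            self.state = known_good
            self.audit_log.append(f"{sub_goal}: validation failed, rolled back")
            return False
        if not review(self.state):               # Reviewer approves the outcome
            self.state = known_good
            self.audit_log.append(f"{sub_goal}: rejected, rolled back and escalated")
            return False
        self.audit_log.append(f"{sub_goal}: applied")
        return True


orc = Orchestrator(state={"min.insync.replicas": "2"})
ok = orc.run(
    "lower-isr",
    implement=lambda s: {**s, "min.insync.replicas": "1"},
    verify=lambda s: int(s["min.insync.replicas"]) >= 2,
    review=lambda s: True,
)
print(ok, orc.state)  # False {'min.insync.replicas': '2'}
```

The snapshot is taken before the Implementer runs, so a failed validation or a rejected review restores the previous known-good state and leaves an audit entry.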

Recommended Project Structure

infra/kafka/production/            # Kafka production deployment configurations
infra/kafka/production/brokers/
infra/kafka/production/topics/
infra/kafka/production/configs/
infra/kafka/production/acl/

agents/planner/
agents/architect/
agents/implementer/
agents/verifier/
agents/reviewer/
agents/researcher/
agents/domain-specialist/
agents/orchestrator/

scripts/orchestrator/
scripts/validations/

config/agent-config.yaml
tests/integration/
tests/end-to-end/

README.md

Core Operating Principles

  • Single source of truth for goals, context, and outputs.
  • Idempotent and auditable changes with explicit approval gates.
  • Clear handoffs and ownership transfers between agents.
  • Memory discipline: avoid context drift by storing outputs in a versioned store.
  • Security-first: encrypt secrets, enforce least privilege, and audit actions.
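One way to make the idempotence principle concrete is to compute a change plan as the difference between desired and actual topic configs, so that re-running the plan after it has been applied produces no further changes. The sketch below assumes configs are available as plain dictionaries; `plan_config_changes` and `apply_changes` are hypothetical helper names, and no real cluster is contacted.

```python
from typing import Dict


def plan_config_changes(actual: Dict[str, str],
                        desired: Dict[str, str]) -> Dict[str, str]:
    """Return only the keys whose desired value differs from the actual one."""
    return {k: v for k, v in desired.items() if actual.get(k) != v}


def apply_changes(actual: Dict[str, str],
                  changes: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for a real config update: returns the merged config."""
    updated = dict(actual)
    updated.update(changes)
    return updated


desired = {"retention.ms": "604800000", "min.insync.replicas": "2"}
actual = {"retention.ms": "86400000", "min.insync.replicas": "2"}

first = plan_config_changes(actual, desired)
actual = apply_changes(actual, first)
second = plan_config_changes(actual, desired)

print(first)   # {'retention.ms': '604800000'}
print(second)  # {}
```

An empty second plan is the signal that the change was idempotent; a non-empty one suggests drift or a failed apply.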

Agent Handoff and Collaboration Rules

  • Planner to Implementer: pass sub-goals with inputs, outputs, and acceptance criteria.
  • Implementer to Verifier: submit change results and validation reports.
  • Verifier to Reviewer: request approval only after all validation checks pass.
  • Researcher to Domain Specialist: propose optimizations with risk assessment.
  • Orchestrator: maintain visibility, enforce memory rules, and trigger rollbacks when needed.
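The handoff rules above can be enforced with a small typed record that is validated before ownership moves. This is a sketch under assumptions: the `Handoff` fields mirror the items the template lists (goal, context, inputs, outputs, acceptance criteria, validated checkpoint), but the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Handoff:
    from_agent: str
    to_agent: str
    goal: str
    context_version: str            # version of the shared memory store entry
    required_inputs: Dict[str, str]
    expected_outputs: List[str]
    acceptance_criteria: List[str]
    checkpoint_validated: bool = False


def validate_handoff(h: Handoff) -> List[str]:
    """Return a list of problems; an empty list means the handoff may proceed."""
    problems = []
    if not h.goal.strip():
        problems.append("missing goal")
    if not h.context_version.strip():
        problems.append("missing context version")
    if not h.required_inputs:
        problems.append("missing required inputs")
    if not h.expected_outputs:
        problems.append("missing expected outputs")
    if not h.acceptance_criteria:
        problems.append("missing acceptance criteria")
    if not h.checkpoint_validated:
        problems.append("checkpoint not validated: escalate instead of handing off")
    return problems


handoff = Handoff(
    from_agent="planner",
    to_agent="implementer",
    goal="create orders topic",
    context_version="plan-v3",
    required_inputs={"partitions": "6", "replication.factor": "3"},
    expected_outputs=["topic exists with requested settings"],
    acceptance_criteria=["verifier reports matching partition count"],
    checkpoint_validated=True,
)
print(validate_handoff(handoff))  # []
```

Returning a list of problems rather than raising on the first one gives the Orchestrator a complete record to log when it escalates.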

Tool Governance and Permission Rules

  • All Kafka admin actions are auditable; secrets are retrieved from a vault and rotated periodically.
  • Only approved tools may modify topics, configs, ACLs, or deployments.
  • Production changes require a change window and stakeholder sign-off.
  • Automated safeguards prevent accidental deletion of critical topics.
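The governance rules above can be combined into a single authorization check that runs before any admin action. The sketch below is illustrative only: the role names, action names, and the 02:00 to 04:00 UTC change window are assumptions, and the protected list uses Kafka's internal topics `__consumer_offsets` and `__transaction_state` as examples of topics that should never be deleted.

```python
from datetime import datetime, time, timezone
from typing import Dict, FrozenSet, Tuple

READ_ONLY: FrozenSet[str] = frozenset({"describe"})

# Hypothetical role-to-action mapping.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "planner": READ_ONLY,
    "verifier": READ_ONLY,
    "implementer": frozenset(
        {"describe", "create_topic", "alter_config", "delete_topic"}
    ),
}

PROTECTED_TOPICS = frozenset({"__consumer_offsets", "__transaction_state"})
WINDOW_START, WINDOW_END = time(2, 0), time(4, 0)  # assumed UTC change window


def authorize(role: str, action: str, topic: str,
              now: datetime, signed_off: bool) -> Tuple[bool, str]:
    """Decide whether an admin action may run; `now` must be timezone-aware."""
    if action not in ROLE_PERMISSIONS.get(role, frozenset()):
        return False, f"role '{role}' may not perform '{action}'"
    if action in READ_ONLY:
        return True, "read-only action"
    if action == "delete_topic" and topic in PROTECTED_TOPICS:
        return False, f"'{topic}' is protected from deletion"
    if not signed_off:
        return False, "missing stakeholder sign-off"
    if not (WINDOW_START <= now.astimezone(timezone.utc).time() < WINDOW_END):
        return False, "outside change window"
    return True, "authorized"


in_window = datetime(2024, 1, 10, 2, 30, tzinfo=timezone.utc)
print(authorize("implementer", "alter_config", "prod.orders.events.v1",
                in_window, signed_off=True))
# (True, 'authorized')
```

Every denial carries a reason string, which keeps the audit trail useful when an agent is blocked.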

Code Construction Rules

  • Code must be idempotent and tagged with a stable version identifier.
  • Follow naming conventions for topics, consumer groups, and ACLs.
  • Every change traces back to a sub-goal and to the AGENTS.md rule that permits it.
  • Do not duplicate work or bypass memory/context rules.
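A naming convention is easiest to enforce with a validator that agents call before creating a topic. The sketch below checks two things: that the name is legal for Kafka (ASCII letters, digits, `.`, `_`, `-`, at most 249 characters, and not `.` or `..`), and that it matches a project convention. The convention `<env>.<domain>.<dataset>.v<N>` is purely an assumption for illustration; substitute your own.

```python
import re
from typing import List

# Characters and length Kafka accepts in a topic name.
KAFKA_LEGAL = re.compile(r"[A-Za-z0-9._-]{1,249}")

# Hypothetical project convention: <env>.<domain>.<dataset>.v<N>
CONVENTION = re.compile(
    r"(prod|staging)\.[a-z][a-z0-9-]*\.[a-z][a-z0-9-]*\.v[0-9]+"
)


def check_topic_name(name: str) -> List[str]:
    """Return a list of problems; an empty list means the name is acceptable."""
    if name in (".", "..") or not KAFKA_LEGAL.fullmatch(name):
        return ["not a legal Kafka topic name"]
    if not CONVENTION.fullmatch(name):
        return ["does not match <env>.<domain>.<dataset>.v<N>"]
    return []


print(check_topic_name("prod.payments.settled-orders.v2"))  # []
print(check_topic_name("Prod_Payments"))
# ['does not match <env>.<domain>.<dataset>.v<N>']
print(check_topic_name("bad topic!"))
# ['not a legal Kafka topic name']
```

The same pattern extends to consumer group names and ACL principals with their own regular expressions.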

Security and Production Rules

  • Use TLS/SASL for all cluster communications; enforce ACLs per topic and consumer group.
  • Store credentials in a secrets vault; rotate access tokens regularly.
  • All production changes undergo human review for risk assessment.
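As an illustration of the first two rules, the sketch below builds a client configuration that uses TLS with SASL and reads credentials from the environment rather than from source code. The configuration keys follow librdkafka-style naming (as used by clients built on it); the environment variable names and the choice of SCRAM-SHA-512 are assumptions, and the environment stands in for whatever mechanism your vault uses to inject secrets.

```python
import os
from typing import Dict, Mapping

# Hypothetical variable names, assumed to be injected by a secrets vault.
REQUIRED_ENV = (
    "KAFKA_BOOTSTRAP",
    "KAFKA_SASL_USERNAME",
    "KAFKA_SASL_PASSWORD",
    "KAFKA_CA_PATH",
)


def build_client_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build a TLS + SASL client config; fail fast if any secret is missing."""
    missing = [k for k in REQUIRED_ENV if not env.get(k)]
    if missing:
        raise RuntimeError(f"missing required settings: {', '.join(missing)}")
    return {
        "bootstrap.servers": env["KAFKA_BOOTSTRAP"],
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "SCRAM-SHA-512",
        "sasl.username": env["KAFKA_SASL_USERNAME"],
        "sasl.password": env["KAFKA_SASL_PASSWORD"],
        "ssl.ca.location": env["KAFKA_CA_PATH"],
    }


def redacted(config: Mapping[str, str]) -> Dict[str, str]:
    """Copy of the config that is safe to write to logs."""
    return {k: ("***" if k == "sasl.password" else v) for k, v in config.items()}


example_env = {
    "KAFKA_BOOTSTRAP": "broker-1.example.internal:9093",
    "KAFKA_SASL_USERNAME": "agent-implementer",
    "KAFKA_SASL_PASSWORD": "not-a-real-secret",
    "KAFKA_CA_PATH": "/etc/kafka/ca.pem",
}
print(redacted(build_client_config(example_env))["sasl.password"])  # ***
```

Failing fast on a missing secret prevents an agent from silently falling back to an unauthenticated or plaintext connection.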

Testing Checklist

  • Unit tests for agent logic; mock Kafka interactions where possible.
  • Integration tests against a staging Kafka cluster; verify topic creation, configs, and ACLs.
  • End-to-end tests for the full orchestration flow, including handoffs and rollbacks.
  • Security tests: verify TLS, authentication, authorization, and secret rotation.
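The first checklist item can be illustrated with a unit test that mocks the Kafka admin interface. In this sketch, `ensure_topic` and the admin object's `list_topics()` and `create_topic()` methods are hypothetical; they stand in for whatever wrapper your agents use around a real admin client, and `unittest.mock.Mock` replaces the cluster entirely.

```python
from unittest.mock import Mock


def ensure_topic(admin, name: str, partitions: int, replication_factor: int) -> str:
    """Create the topic only if it does not exist yet (idempotent).

    `admin` is a hypothetical wrapper exposing list_topics() -> set of names
    and create_topic(name, partitions, replication_factor).
    """
    if name in admin.list_topics():
        return "exists"
    admin.create_topic(name, partitions, replication_factor)
    return "created"


def test_creates_missing_topic():
    admin = Mock()
    admin.list_topics.return_value = set()
    assert ensure_topic(admin, "prod.orders.events.v1", 6, 3) == "created"
    admin.create_topic.assert_called_once_with("prod.orders.events.v1", 6, 3)


def test_skips_existing_topic():
    admin = Mock()
    admin.list_topics.return_value = {"prod.orders.events.v1"}
    assert ensure_topic(admin, "prod.orders.events.v1", 6, 3) == "exists"
    admin.create_topic.assert_not_called()


test_creates_missing_topic()
test_skips_existing_topic()
```

The second test doubles as an idempotence check: running the same sub-goal twice must not issue a second create call.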

Common Mistakes to Avoid

  • Skipping memory rules and allowing context drift between agents.
  • Bypassing approval gates or attempting unsafe shortcuts in production.
  • Cutting validation short or skipping rollback procedures after failures.
  • Treating AGENTS.md as a brief note rather than the operating manual that governs agent behavior.

FAQ

What is the purpose of this Kafka production AGENTS.md Template?

It provides a copyable AGENTS.md template that codifies roles, orchestration patterns, and governance for Kafka production workloads, enabling single-agent and multi-agent workflows with clear handoffs and memory rules.

Which agents are defined in this workflow and what are their responsibilities?

The Planner defines sub-goals and their sequence, the Architect/Implementer applies changes, the Verifier/Tester validates them, the Researcher proposes improvements, the Domain Specialist provides subject-matter guidance, the Reviewer approves changes against policy, and the Orchestrator coordinates all agents and enforces governance.

How are handoffs between agents managed?

Handoffs occur at explicit sub-goals with goal completion signals, clear ownership transfer, and acceptance criteria. The orchestrator ensures checkpoints before transitions.

What security and production rules are enforced by this template?

Admin actions are audited, secrets are retrieved from a vault and rotated, cluster access is protected with TLS, SASL, and ACLs, and production changes pass through approval gates and change windows.

Where should the AGENTS.md content be stored and how is memory managed?

Agent context is stored in a versioned memory store linked to the current production plan. The sources of truth are the Kafka cluster state, topic configs, ACLs, and registry entries. Agents avoid stale data by referencing that memory explicitly instead of relying on shared state.