
Cursor Rules Template: LangChain OpenAI RAG with Chromadb

Cursor Rules Template for LangChain OpenAI RAG with Chromadb. Pasteable .cursorrules to bootstrap a secure, testable Python pipeline.


Target User

Developers building Python LangChain OpenAI RAG pipelines with Chromadb

Use Cases

  • LangChain OpenAI RAG with Chromadb integration
  • Prototype and standardize Cursor AI rules for Python pipelines
  • Secure, testable, and shareable Cursor Rules templates for AI copilots


Overview

A Cursor rules configuration for a LangChain OpenAI RAG pipeline that uses Chromadb as the vector store. This template provides a copyable .cursorrules block plus stack-specific guidance for secure, testable development.

To get started, paste the .cursorrules block below into your project root; it enforces architecture, style, and security practices while keeping the RAG workflow productive.

When to Use These Cursor Rules

  • Building a Python LangChain RAG pipeline with OpenAI as the LLM and Chromadb as the vector store.
  • Onboarding new engineers with a shared, enforceable set of rules for architecture, tests, and security.
  • Standardizing a repeatable setup for quick project bootstrap and CI integration.

Copyable .cursorrules Configuration

framework: 'langchain-openai-rag-chromadb'
context: 'Python 3.11+, LangChain, OpenAI API, Chromadb as vector store, Cursor AI rules'

frameworkRoleAndContext:
  role: 'Cursor AI as software engineer assisting with LangChain RAG pipelines'
  context: 'Stack: LangChain, OpenAI, Chromadb; Python 3.11+; CI; tests'

codeStyleAndGuidelines:
  codeStyle: 'Black, isort, Ruff'
  docStyle: 'Google-style docstrings'

architectureAndDirectoryRules:
  directories:
    - 'src/langchain_rag'
    - 'src/vector_store/chroma'
    - 'tests'
    - 'configs'
  moduleLayout: 'separate orchestrator, retriever, and reader components'

authenticationAndSecurityRules:
  envKeys: ['OPENAI_API_KEY']
  secretManagement: 'prefer vault or secret manager'
  avoidHardCodedSecrets: true

databaseAndORMPatterns:
  vectorStore: 'Chromadb'
  metadataStore: 'Postgres via SQLAlchemy or similar ORM'

testingAndLintingWorkflows:
  unitTests: true
  integrationTests: true
  ciCd: true
  lintTools: ['ruff','black','isort']

prohibitedActionsAndAntiPatterns:
  - 'Do not hardcode API keys or prompts'
  - 'Do not bypass safety checks in prompts or retrieval'
  - 'Do not perform unvalidated prompt injection or data exfiltration'

workflow:
  ci:
    - 'pytest'
    - 'ruff'
    - 'black --check'
  prChecks:
    - 'type-check'
    - 'lint'
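The authenticationAndSecurityRules section can be enforced at startup with a fail-fast loader. A minimal sketch, assuming a helper named `require_env` (the name is illustrative, not part of the template):

```python
import os

def require_env(name: str) -> str:
    """Return a required environment variable, failing fast if it is unset."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value

# At startup, pull the key from the environment instead of hardcoding it:
# openai_api_key = require_env("OPENAI_API_KEY")
```

Failing at import time is preferable to a cryptic authentication error deep inside a chain invocation.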

Recommended Project Structure

project-root/
  src/
    langchain_rag/
      pipelines/
        chain.py
        retrieval.py
      prompts/
        system_prompt.txt
    vector_store/
      chroma/
        chroma_client.py
  tests/
  configs/
  requirements.txt
  .env  # git-ignored; holds OPENAI_API_KEY locally

Core Engineering Principles

  • Explicit interfaces and stable boundaries between orchestrator, retriever, and reader.
  • Secure by default: avoid exposing secrets and validate inputs/outputs of LLMs.
  • Testability: unit and integration tests for each component with mocks for the LLM.
  • Deterministic code style and linting enforcement in CI.
  • Observability: structured logging and tracing for requests to OpenAI and Chromadb.
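The observability principle can be satisfied with stdlib logging alone. A minimal sketch of a JSON formatter (the `JsonFormatter` class and logger name are illustrative):

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON line for log aggregation."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("langchain_rag")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Log events, not payloads: record that a call happened, never the raw prompt.
logger.info("openai call completed")
```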

Code Construction Rules

  • Follow the stack-specific file layout; place LangChain chain definitions under src/langchain_rag.
  • Use LangChain's RetrievalQA or a custom chain pattern backed by a Chromadb vector store.
  • All prompts must be validated and sanitized before sending to the LLM.
  • Keep API keys in environment variables and use a secret manager in production.
  • Write tests that cover retriever integrity, prompt handling, and end-to-end flow.
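The prompt-validation rule above can be made concrete with a small pre-flight check. A minimal sketch; the `sanitize_prompt` name and the length cap are illustrative choices, not requirements of the template:

```python
MAX_PROMPT_CHARS = 8000  # illustrative cap; tune to your model's context window

def sanitize_prompt(user_input: str) -> str:
    """Validate user text before it is interpolated into a prompt template."""
    text = user_input.strip()
    if not text:
        raise ValueError("prompt is empty")
    if len(text) > MAX_PROMPT_CHARS:
        raise ValueError("prompt exceeds maximum length")
    # Drop control characters that can confuse templating or logging.
    return "".join(ch for ch in text if ch.isprintable() or ch in "\n\t")
```

Run this on every piece of user-supplied text before it reaches a prompt template or a retriever query.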

Security and Production Rules

  • Never log full prompts or API responses containing secrets.
  • Rotate API keys and remove secrets from error messages.
  • Enforce input/output validation and constrain LLM temperature and max tokens.
  • Use a secured vector store (Chromadb) with access controls in production.

Testing Checklist

  • Unit tests for each module (retriever, orchestrator, prompt builder).
  • Integration tests for the RAG pipeline with mock OpenAI responses.
  • End-to-end tests that simulate user queries and verify results and logging.
  • CI checks for linting, typing, and test results.
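The mock-based integration item above can be sketched with `unittest.mock`. The `answer_query` function is a toy stand-in for a real pipeline (its name and the `get_relevant_documents`/`invoke` method names are illustrative):

```python
from unittest.mock import MagicMock

def answer_query(question: str, retriever, llm) -> str:
    """Toy RAG flow: retrieve context, then ask the LLM with it."""
    docs = retriever.get_relevant_documents(question)
    context = "\n".join(docs)
    return llm.invoke(f"Context:\n{context}\n\nQuestion: {question}")

def test_answer_query_uses_retrieved_context():
    retriever = MagicMock()
    retriever.get_relevant_documents.return_value = ["Chroma stores embeddings."]
    llm = MagicMock()
    llm.invoke.return_value = "Chroma is a vector store."

    answer = answer_query("What is Chroma?", retriever, llm)

    assert answer == "Chroma is a vector store."
    retriever.get_relevant_documents.assert_called_once_with("What is Chroma?")
    assert "Chroma stores embeddings." in llm.invoke.call_args[0][0]
```

The test verifies both the returned answer and that the retrieved context actually reached the LLM prompt, without any network calls.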

Common Mistakes to Avoid

  • Assuming OpenAI responses are always correct; validate and sanity-check outputs.
  • Hard-coding prompts or keys; prefer env vars and secret managers.
  • Ignoring vector store tuning for retrieval quality and latency.

FAQ

What is a Cursor Rules Template for LangChain OpenAI RAG with Chromadb?

A ready-to-paste configuration that guides Cursor AI in building a LangChain-based RAG pipeline using Chromadb, including architecture, security, and testing constraints.

What stack does this Cursor Rules Template cover?

Python, LangChain, OpenAI API, Chromadb as vector store, and Cursor AI rules for safe AI-assisted development.

How do I use the provided .cursorrules block?

Copy the code block into a file named .cursorrules at the project root and adjust paths, keys, and endpoints for your environment.

What testing and CI should I apply with this template?

Unit tests for components, integration tests for retriever and LLM calls, and CI steps including pytest, linting (ruff/black), and type checks.

Which security practices are enforced by this template?

Environment-based API keys, secret management, avoidance of hard-coded credentials, and safe prompting with validation of LLM outputs.