Production RAG Chatbot PDF Embeddings Cursor Rules Template

Overview

Cursor rules configuration for a production RAG chatbot SaaS with PDF upload, embeddings, citations, chat history, and admin analytics. This Cursor rules template targets a Python FastAPI stack using PostgreSQL and a vector database for embeddings, providing safe AI-assisted development with auditable actions.

When to Use These Cursor Rules

When building a scalable RAG chatbot SaaS that ingests PDFs and converts them to searchable embeddings.
When you need persistent chat history and citation-aware responses for end-users.
When you require admin analytics for usage, model performance, and document ingestion metrics.
When you want a structured, copyable ruleset to drop into your project root as .cursorrules.

Copyable .cursorrules Configuration

framework: python-fastapi
context: You are Cursor AI acting as the production-grade assistant for a RAG chatbot SaaS. Your stack includes: PDF upload, embeddings via a vector DB (Qdrant), citations, chat history, admin analytics, and secure REST APIs. Do not disclose internal secrets or bypass authentication.
codeStyle: pep8, black, isort
architecture: monorepo with app/api, app/models, app/services, app/ingest, app/analytics, and config/
authentication: OAuth2 JWT with short-lived access tokens; refresh tokens stored securely
security: TLS mandatory; secrets from env vars; input validation; pdf upload size limit 50MB; sanitize file names
database: PostgreSQL using SQLAlchemy ORM; Alembic migrations; strict typing; use async sessions
embeddings: vector DB (Qdrant) for embeddings; store embeddings alongside document IDs; enable caching for frequent queries
pdfIngestion: streaming upload; validate file type; chunked processing; store PDFs in object storage; index text with document metadata
searchAndRetrieval: embedding-based retrieval with cosine similarity; support citations by linking to source documents
testing: unit tests with pytest; integration tests for API endpoints; end-to-end tests for ingestion and retrieval
linting: ruff; mypy; pre-commit hooks; CI lint stage
prohibitedActions: 
  - Do not execute arbitrary code from user-uploaded PDFs.
  - Do not bypass authentication or log raw secrets.
  - Do not perform client-side rendering of citations; always fetch provenance from server logs.
  - Do not store raw PDFs beyond necessary retention period; delete after indexing when possible.

Recommended Project Structure

/
├── app/
│   ├── api/                 # FastAPI routers
│   │   └── v1/
│   │       ├── chat.py
│   │       └── pdf_upload.py
│   ├── models/              # SQLAlchemy ORM models
│   │   ├── user.py
│   │   ├── chat.py
│   │   ├── message.py
│   │   ├── document.py
│   │   ├── embedding.py
│   │   └── citation.py
│   ├── services/            # Business logic and adapters
│   │   ├── embeddings.py
│   │   ├── retrieval.py
│   │   └── analytics.py
│   ├── ingest/              # PDF ingestion workers
│   │   └── pdf_worker.py
│   ├── analytics/           # Admin analytics services
│   │   └── admin_dashboard.py
│   ├── config/              # App configuration
│   │   ├── settings.py
│   │   └── dependencies.py
│   └── main.py                # FastAPI entrypoint
├── tests/                   # pytest tests
│   ├── api/
│   └── end_to_end/
└── .env.sample

Core Engineering Principles

Design for scalable vector search with a dedicated vector store and efficient batching.
Prefer explicit data ownership: clear schemas for documents, embeddings, and citations.
Seal boundaries between ingestion, retrieval, and analytics via well-defined interfaces.
Ensure observability: structured logs, metrics, and tracing for critical paths.
Prefer safe AI prompts with constraints and guardrails; separate model prompts from user data.

Code Construction Rules

Use SQLAlchemy models with explicit relationships for Documents, Embeddings, and Citations.
Index PDF-derived text into the vector store synchronously for initial ingestion; async reindexing is allowed via background workers.
Validate PDF content and metadata before indexing; reject unsupported formats gracefully.
Implement idempotent ingestion to avoid duplicate embeddings for the same document.
Favor dependency injection and config-based secrets to keep code testable and portable.
Follow PEP 8 and Black formatting; enforce type hints across public APIs.
Do not bypass authentication; never log tokens or secrets; sanitize user input.
Do not rely on client-side rendering of citations; fetch provenance from server-side sources only.

Security and Production Rules

Always run with TLS; configure TLS certificates in deployment.
Store secrets in environment variables; do not commit .env files.
Implement role-based access control for admin analytics endpoints.
Limit PDF upload size to 50MB and scan for malware; quarantine suspicious files.
Enable rate limiting and request validation to protect endpoints from abuse.
Keep embeddings and derived data access logged and auditable.

Testing Checklist

Unit tests for models and services with synthetic data.
Integration tests for API endpoints covering upload, ingestion, and retrieval flows.
End-to-end tests simulating user sessions and admin analytics usage.
CI runs including lint, type-check, tests, and security scans.
Performance tests for vector search latency under typical load.

Common Mistakes to Avoid

Forgetting to validate and sanitize PDFs before ingestion.
Over-indexing documents; failing to deduplicate embeddings.
Exposing raw embeddings or PDFs through insecure endpoints.
Combining chat history with personal data without proper retention controls.

Related Cursor rules templates

Explore adjacent Cursor rules templates for similar stacks, workflows, and production constraints.

FAQ

What stack does this Cursor Rules Template target?

A production-grade RAG chatbot SaaS using Python FastAPI, PostgreSQL with SQLAlchemy, a vector database (Qdrant) for embeddings, and a secure PDF ingestion workflow.

How are PDFs uploaded and indexed securely?

Uploads are streamed to storage, validated, text extracted, and indexes are stored as embeddings in the vector DB; citations link to original documents.

How are citations surfaced in responses?

Citations are tied to source documents in the index and retrieved alongside the answer, with provenance included in responses.

How is admin analytics secured?

Admin analytics endpoints enforce RBAC, audit logs, and retention policies; data access is restricted to authorized roles.

What tests are required for CI?

Run unit and integration tests with pytest, enforce type checks with mypy, and run linting with ruff. Include end-to-end tests for ingestion, retrieval, and analytics dashboards as part of the CI pipeline.

Target User

Use Cases