Aside from adding llms.txt and Knowledge Graph (KG) files to your documentation website, you can also build a Retrieval-Augmented Generation (RAG) pipeline on top of your docs.
RAG is a technique where a language model retrieves relevant external documents (e.g., from your docs or a database) and uses them as context to generate more accurate, grounded answers.
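To make the retrieve-then-generate loop concrete, here is a minimal sketch in Python. It assumes the `sentence-transformers` package for embeddings; `answer_with_llm` and the tiny corpus are placeholders for whatever LLM client and docs you actually use.

```python
# Minimal RAG loop: embed docs once, retrieve top-k chunks by cosine
# similarity, then hand them to an LLM as grounding context.
# Assumes: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = [  # hypothetical doc chunks
    "The /auth/login endpoint accepts a username and password.",
    "Rate limits are 100 requests per minute per API key.",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    q_vec = model.encode([query], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec  # cosine similarity (vectors normalized)
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQ: {query}"
    return answer_with_llm(prompt)  # placeholder: your LLM client call
```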
## Preprocessing & Chunking
- Semantic chunking: Instead of naive chunking by paragraphs or headers, chunk based on semantic coherence (e.g., full code examples + explanations).
- Chunk metadata: Add structured metadata to each chunk (e.g., `component: auth-api`, `language: python`, `version: v3.1`) to enable more effective filtering during retrieval.
### Example

```markdown
<!-- chunk:start:auth-api:login -->
## Authentication API - Login

The `/auth/login` endpoint allows users to...
<!-- chunk:end -->
```

These markers help parsers split docs precisely for chunking/indexing during RAG ingestion.
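As a sketch of the ingestion side, here is a small parser for the marker convention shown above; the regex and the `Chunk` fields are illustrative, not a standard format.

```python
# Split a docs file into chunks using the HTML-comment markers above,
# keeping the component/topic from the start marker as metadata.
import re
from dataclasses import dataclass

CHUNK_RE = re.compile(
    r"<!--\s*chunk:start:(?P<component>[\w-]+):(?P<topic>[\w-]+)\s*-->"
    r"(?P<body>.*?)<!--\s*chunk:end\s*-->",
    re.DOTALL,
)

@dataclass
class Chunk:
    component: str
    topic: str
    text: str

def parse_chunks(markdown_text: str) -> list[Chunk]:
    return [
        Chunk(m["component"], m["topic"], m["body"].strip())
        for m in CHUNK_RE.finditer(markdown_text)
    ]
```

Each `Chunk` can then be embedded and stored alongside its metadata, so retrieval can filter by `component` or `topic` before (or after) the vector search.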
## Embedding Strategies
- Hybrid search: Combine vector search with keyword-based retrieval (e.g., BM25) and metadata filtering by tags/components (see the fusion sketch after this list).
- Multi-granularity indexing: Index both fine-grained chunks (function-level) and higher-level concepts (module-level) to answer both “how” and “why” questions (see the second sketch below).
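A minimal fusion sketch for hybrid search, assuming the `rank_bm25` package for keyword scores and precomputed, normalized embedding vectors; the 50/50 weighting is just a starting point to tune.

```python
# Hybrid retrieval: min-max-normalize BM25 and vector scores so they
# are comparable, then blend them with a tunable weight alpha.
# Assumes: pip install rank-bm25 numpy
import numpy as np
from rank_bm25 import BM25Okapi

def minmax(x: np.ndarray) -> np.ndarray:
    span = x.max() - x.min()
    return (x - x.min()) / span if span > 0 else np.zeros_like(x)

def hybrid_search(query, query_vec, docs, doc_vecs, alpha=0.5, k=5):
    bm25 = BM25Okapi([d.lower().split() for d in docs])
    kw_scores = minmax(bm25.get_scores(query.lower().split()))
    vec_scores = minmax(doc_vecs @ query_vec)  # cosine if normalized
    blended = alpha * vec_scores + (1 - alpha) * kw_scores
    return [docs[i] for i in np.argsort(blended)[::-1][:k]]
```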
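For multi-granularity indexing, one simple approach is to keep both levels side by side and tag each chunk with a hypothetical `granularity` field; the query-routing heuristic below is illustrative only.

```python
# Maintain function-level and module-level chunks in parallel and
# pick a granularity per query with a naive routing heuristic.
from typing import Literal

Granularity = Literal["function", "module"]
index: dict[Granularity, list[str]] = {"function": [], "module": []}

def add_chunk(text: str, granularity: Granularity) -> None:
    index[granularity].append(text)

def route(query: str) -> Granularity:
    # Procedural "how do I..." questions -> fine-grained chunks;
    # broader "why/what" questions -> module-level summaries.
    return "function" if query.lower().startswith("how") else "module"
```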