Retrieval-Augmented Generation
What is RAG?
Retrieval-Augmented Generation (RAG) pairs a pre-trained language model with retrieval over an external knowledge base. At query time, it retrieves the documents most relevant to the question and passes them to the model as context, so responses can be more accurate and more current than the model's training data alone would allow.
Why RAG?
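- Reduced Hallucination: Grounds responses in retrieved facts rather than the model's memory alone
- Up-to-date Information: The knowledge base can be updated without retraining the model
- Domain Specialization: Can use domain-specific knowledge bases
- Transparency: Retrieved sources can be cited alongside the answer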
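The transparency point can be made concrete with a small sketch: number each retrieved passage in the prompt so the model can refer to sources by index. The `Passage` type, the `build_cited_prompt` helper, and the prompt wording are all hypothetical illustrations and are not tied to any library.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    """A retrieved chunk plus where it came from (hypothetical type)."""
    text: str
    source: str


def build_cited_prompt(question: str, passages: list[Passage]) -> str:
    """Number each passage so the answer can cite it as [1], [2], ..."""
    numbered = [
        f"[{i}] ({p.source}) {p.text}"
        for i, p in enumerate(passages, start=1)
    ]
    context = "\n".join(numbered)
    return (
        "Answer using only the numbered sources below and cite them "
        "by number.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Because each passage carries its source label into the prompt, a cited index in the answer can be mapped back to the originating document.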
RAG Implementation
from langchain.vectorstores import FAISS
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI

# Note: newer LangChain releases move these imports to the
# langchain_community and langchain_openai packages.


class RAGSystem:
    def __init__(self, documents):
        # Embed the documents and index them for similarity search
        self.embeddings = OpenAIEmbeddings()
        self.vectorstore = FAISS.from_documents(documents, self.embeddings)
        self.llm = OpenAI(temperature=0)

    def query(self, question, k=3):
        # Retrieve the k most relevant documents
        retriever = self.vectorstore.as_retriever(search_kwargs={"k": k})
        relevant_docs = retriever.get_relevant_documents(question)

        # Build context from the retrieved text
        context = "\n\n".join(doc.page_content for doc in relevant_docs)

        # Generate response
        prompt = (
            "Answer the question based on the context below.\n\n"
            f"Context: {context}\n\n"
            f"Question: {question}\n\n"
            "Answer:"
        )
        response = self.llm(prompt)

        # Return the sources too, so the caller can cite them
        return response, relevant_docs
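To see what the retrieval step is doing conceptually, here is a dependency-free sketch of top-k similarity search. It swaps the learned embeddings and the FAISS index for bag-of-words count vectors and a brute-force cosine-similarity scan, so it illustrates the idea only. The helper names `embed`, `cosine`, and `top_k` are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(question: str, documents: list[str], k: int = 3) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


docs = [
    "FAISS is a library for vector similarity search",
    "Bananas are rich in potassium",
    "Embeddings map text to a vector",
]
print(top_k("how does vector search work", docs, k=2))
```

A real vector store replaces the linear scan with an index, and a real embedding model captures meaning beyond exact word overlap, but the `k` parameter plays the same role as in `search_kwargs={"k": k}`.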
Advanced RAG Techniques
Chunking Strategies
# Different ways to chunk documents
from langchain.text_splitter import (
    RecursiveCharacterTextSplitter,
    CharacterTextSplitter,
    MarkdownHeaderTextSplitter,
)

# Strategy 1: Fixed-size chunks
fixed_splitter = CharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separator="\n",
)

# Strategy 2: Recursive character splitting
recursive_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200,
    separators=["\n\n", "\n", ". ", " ", ""],
)

# Strategy 3: Markdown-aware splitting
headers_to_split = [
    ("#", "Header 1"),
    ("##", "Header 2"),
    ("###", "Header 3"),
]
md_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split)
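To make `chunk_size` and `chunk_overlap` concrete, the following sketch implements plain fixed-size chunking by character count. It is not how the LangChain splitters work internally (they split on separators first and then merge the pieces), and `chunk_text` is a hypothetical helper used only for illustration.

```python
def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)  # ['abcd', 'cdef', 'efgh', 'ghij']
```

The overlap means a sentence cut at a chunk boundary still appears intact in a neighboring chunk, at the cost of storing some text twice.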
Evaluation Metrics
| Metric | Description |
|---|---|
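| Recall@K | Fraction of all relevant documents that appear in the top K results |
| MRR | Mean Reciprocal Rank: average over queries of 1/rank of the first relevant result |
| NDCG | Normalized Discounted Cumulative Gain: position-discounted relevance of the ranking, divided by that of the ideal ranking |
| Faithfulness | How well the answer is grounded in the retrieved context |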
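The three retrieval metrics in the table can be sketched in a few lines of plain Python. The function names and the use of string document IDs are assumptions made for illustration. The NDCG variant here uses linear gains, while some definitions use `2**gain - 1` instead. Faithfulness is left out because it is typically scored by a human or a model-based judge rather than by a closed-form formula.

```python
import math


def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents found in the top k results."""
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved: list[str], relevant: set[str]) -> float:
    """1 / rank of the first relevant result, or 0 if none is retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: list[tuple[list[str], set[str]]]) -> float:
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(retrieved: list[str], gains: dict[str, float], k: int) -> float:
    """NDCG@k with graded relevance `gains` (missing IDs count as 0)."""
    def dcg(values: list[float]) -> float:
        # Position i (0-based) is discounted by log2(i + 2)
        return sum(g / math.log2(i + 2) for i, g in enumerate(values))

    actual = dcg([gains.get(doc_id, 0.0) for doc_id in retrieved[:k]])
    ideal = dcg(sorted(gains.values(), reverse=True)[:k])
    return actual / ideal if ideal > 0 else 0.0
```

For example, with `retrieved = ["d3", "d1", "d7"]` and `relevant = {"d1", "d2"}`, one of the two relevant documents is retrieved and it sits at rank 2, so both Recall@3 and the reciprocal rank are 0.5.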
Summary
RAG combines retrieval and generation to produce responses grounded in retrieved sources. It is a practical fit for AI systems that need current or domain-specific knowledge the base model was not trained on.
Next: We'll explore vector databases and embeddings.