Agentic RAG — Autonomous Research Agents
Agentic RAG goes beyond simple retrieve-and-generate. It uses LLM agents that can plan multi-step research strategies, use tools, decompose queries, and iteratively refine their search until they find comprehensive answers.
- Query Decomposition — Break complex questions into searchable sub-queries
- Multi-Step Reasoning — Retrieve, reason, retrieve again, refine
- Tool Use — Leverage search engines, databases, calculators, and APIs
The best researcher does not just search — they plan, investigate, and synthesize.
Agentic RAG Systems
Traditional RAG performs a single retrieval step. Agentic RAG enables the LLM to act as a research agent — decomposing complex questions, planning retrieval strategies, using multiple tools, and iteratively refining its search until it has comprehensive information.
Definition: Agentic RAG
Agentic RAG is a retrieval-augmented generation paradigm where the LLM acts as an autonomous agent that can: (1) decompose complex queries into sub-queries, (2) plan multi-step retrieval strategies, (3) use external tools (search, databases, APIs), (4) evaluate and refine results iteratively, and (5) synthesize information from multiple sources.
Query Decomposition
Breaking Down Complex Queries
Definition: Query Decomposition
Query decomposition breaks a complex question into simpler sub-questions that can each be answered by a single retrieval. The answers to sub-questions are then combined to answer the original complex question.
def decompose_query(query, llm):
    """Decompose a complex query into sub-queries."""
    prompt = f"""Break this complex question into simpler sub-questions that can each be answered independently.
Original question: {query}
Sub-questions (one per line):"""
    response = llm.generate(prompt)
    # Keep one sub-query per non-empty line of the response
    sub_queries = [q.strip() for q in response.split("\n") if q.strip()]
    return sub_queries
Example
Original: "Compare the environmental impact of electric vehicles vs hydrogen fuel cell vehicles, considering manufacturing, operation, and disposal."
Sub-queries:
1. What is the environmental impact of electric vehicle manufacturing?
2. What is the environmental impact of hydrogen fuel cell vehicle manufacturing?
3. What are the operational emissions of electric vehicles?
4. What are the operational emissions of hydrogen fuel cell vehicles?
5. What is the end-of-life disposal impact of electric vehicles?
6. What is the end-of-life disposal impact of hydrogen fuel cell vehicles?
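Models often return sub-questions with numbering or bullets, as in the example above, and sometimes repeat themselves. The following is a minimal sketch of cleaning that output, assuming the response is plain text with one sub-question per line. `parse_sub_queries` is a hypothetical helper introduced here for illustration, not part of any library.

```python
import re

# Matches list markers such as "1.", "2)", "-", "*" or "•" followed by whitespace.
_LIST_MARKER = re.compile(r"^\s*(?:\d+[.)]|[-*•])\s+")


def parse_sub_queries(response: str) -> list[str]:
    """Turn a raw LLM response into a clean, de-duplicated list of sub-queries."""
    seen = set()
    sub_queries = []
    for line in response.splitlines():
        cleaned = _LIST_MARKER.sub("", line).strip()
        key = cleaned.lower()  # compare case-insensitively when de-duplicating
        if cleaned and key not in seen:
            seen.add(key)
            sub_queries.append(cleaned)
    return sub_queries


raw = """1. What are the operational emissions of electric vehicles?
2. What are the operational emissions of hydrogen fuel cell vehicles?

- What are the operational emissions of electric vehicles?"""

for sub_query in parse_sub_queries(raw):
    print(sub_query)
```

The marker pattern requires whitespace after the number, so a sub-question that begins with a figure such as "3.5 billion" is left intact.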
Multi-Step Retrieval Agent
Agent Architecture
import json


class AgenticRAG:
    def __init__(self, llm, retriever, tools):
        self.llm = llm
        self.retriever = retriever
        self.tools = tools
        self.memory = []

    def research(self, query, max_steps=5):
        """Perform multi-step research on a query."""
        # Step 1: Plan research strategy
        plan = self.plan_research(query)

        # Step 2: Execute research steps, capped at max_steps
        context = []
        for step in plan["steps"][:max_steps]:
            if step["action"] == "retrieve":
                # Assumes the retriever returns dicts with a "content" key
                docs = self.retriever.retrieve(step["query"], top_k=3)
                context.extend(docs)
            elif step["action"] == "use_tool":
                result = self.tools[step["tool"]].execute(step["input"])
                context.append({"type": "tool_output", "content": result})
            elif step["action"] == "reason":
                reasoning = self.llm.generate(
                    f"Based on the following information, {step['instruction']}:\n"
                    + "\n".join([c["content"] for c in context])
                )
                context.append({"type": "reasoning", "content": reasoning})

        # Step 3: Synthesize final answer
        answer = self.synthesize(query, context)
        return answer

    def synthesize(self, query, context):
        """Minimal sketch: answer the query from the gathered context."""
        evidence = "\n".join(c["content"] for c in context)
        return self.llm.generate(f"Question: {query}\nContext:\n{evidence}\nAnswer:")

    def plan_research(self, query):
        """Plan a research strategy for the query."""
        prompt = f"""Create a research plan for answering this question:
Question: {query}
Available tools: {list(self.tools.keys())}
Research plan (JSON format):
{{
  "sub_questions": ["sub question 1", "sub question 2"],
  "steps": [
    {{"action": "retrieve", "query": "search query 1"}},
    {{"action": "use_tool", "tool": "tool_name", "input": "input"}},
    {{"action": "reason", "instruction": "what to analyze"}}
  ]
}}"""
        response = self.llm.generate(prompt)
        # json.loads raises if the model returns anything other than bare JSON
        return json.loads(response)
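Calling `json.loads` directly on model output fails whenever the model adds commentary around the JSON or names an action the agent does not handle. One defensive approach is sketched below, assuming the plan schema shown in the prompt above. `parse_plan` and `ALLOWED_ACTIONS` are hypothetical names used only for this illustration.

```python
import json

# The three actions that the research loop above knows how to execute.
ALLOWED_ACTIONS = {"retrieve", "use_tool", "reason"}


def parse_plan(response: str, max_steps: int = 5) -> dict:
    """Extract and validate a research plan from raw LLM output."""
    # Take the outermost {...} span so text before or after the JSON is ignored.
    start, end = response.find("{"), response.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in LLM response")
    plan = json.loads(response[start:end + 1])

    steps = plan.get("steps", [])
    if not isinstance(steps, list):
        raise ValueError("'steps' must be a list")
    # Drop steps with unknown actions, then cap the plan length.
    valid = [s for s in steps if isinstance(s, dict) and s.get("action") in ALLOWED_ACTIONS]
    plan["steps"] = valid[:max_steps]
    return plan


raw = (
    "Here is the plan:\n"
    '{"steps": [{"action": "retrieve", "query": "EV battery recycling"},'
    ' {"action": "browse", "url": "n/a"}]}\n'
    "Let me know if you want changes."
)
print(parse_plan(raw)["steps"])
```

Silently dropping unknown actions is one design choice; raising an error and asking the model to re-plan is an equally reasonable alternative.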
Tool-Enhanced Retrieval
Available Tools
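class ToolRegistry:
    def __init__(self, llm, retriever, web_search_api, database):
        # Dependencies are injected so the tool methods below can use them
        self.llm = llm
        self.retriever = retriever
        self.web_search_api = web_search_api
        self.database = database
        self.tools = {}

    def register(self, name, tool):
        self.tools[name] = tool

    def search_web(self, query):
        """Search the web for information."""
        results = self.web_search_api.search(query, top_k=5)
        return "\n".join([f"{r['title']}: {r['snippet']}" for r in results])

    def search_database(self, query):
        """Search a structured database."""
        # LLM-generated SQL is untrusted: run it with read-only credentials
        sql = self.llm.generate(f"Convert this to SQL: {query}")
        results = self.database.execute(sql)
        return str(results)

    def calculate(self, expression):
        """Perform calculations."""
        # WARNING: eval() executes arbitrary code; never pass it untrusted input
        result = eval(expression)  # Use safe evaluation in production
        return str(result)

    def lookup_definition(self, term):
        """Look up a definition or explanation."""
        results = self.retriever.retrieve(term, top_k=1)
        return results[0] if results else None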
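The `calculate` tool above relies on `eval`, which runs arbitrary Python. One possible form of the "safe evaluation" mentioned in its comment is sketched below, using the standard-library `ast` module to allow only numeric literals and basic arithmetic. `safe_calculate` is a hypothetical helper, and the operator whitelist is an assumption about what the calculator tool needs.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
# Exponentiation is left out on purpose, since inputs like 9**9**9 can exhaust CPU and memory.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def _is_number(node) -> bool:
    """True for int/float literals (bool is excluded even though it subclasses int)."""
    return (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    )


def safe_calculate(expression: str):
    """Evaluate an arithmetic expression without calling eval()."""
    def _eval(node):
        if _is_number(node):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression: {expression!r}")

    tree = ast.parse(expression, mode="eval")  # raises SyntaxError on malformed input
    return _eval(tree.body)


print(safe_calculate("2 * (3 + 4)"))  # prints 14
```

Function calls, attribute access, and names all fall through to the `ValueError` branch, so an input such as `__import__('os')` is rejected rather than executed.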
Tool Selection
Definition: Tool Selection
Tool selection is the process of determining which tool to use for a given sub-task. The agent evaluates the query and available tools to choose the most appropriate one.
import json


def select_tool(query, available_tools, llm):
    """Select the best tool for a given query."""
    prompt = f"""Which tool would be best for answering this query?
Query: {query}
Available tools: {json.dumps({name: tool.description for name, tool in available_tools.items()})}
Tool name:"""
    selected = llm.generate(prompt).strip()
    if selected in available_tools:
        return available_tools[selected]
    # Fall back to web search when the model names an unknown tool
    return available_tools["search_web"]
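The agent code calls `tool.execute(...)` and `select_tool` reads `tool.description`, but the text does not define that interface. A minimal sketch follows, assuming each tool wraps a plain callable that takes and returns a string. The `Tool` dataclass and `resolve_tool` are hypothetical names, and the two lambda tools are stand-ins for real implementations.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    """Hypothetical wrapper exposing the attributes the agent code relies on."""
    name: str
    description: str
    func: Callable[[str], str]

    def execute(self, tool_input: str) -> str:
        # Return errors as text so the agent can reason about a failed tool call.
        try:
            return self.func(tool_input)
        except Exception as exc:
            return f"Tool '{self.name}' failed: {exc}"


def resolve_tool(selected: str, tools: dict, default: str) -> Tool:
    """Map free-form LLM output to a registered tool, falling back to `default`."""
    # Models often add quotes, backticks, or a trailing period around the name.
    cleaned = selected.strip().strip("\"'`.").lower()
    for name, tool in tools.items():
        if name.lower() == cleaned:
            return tool
    return tools[default]


tools = {
    "search_web": Tool("search_web", "Search the web", lambda q: f"results for {q}"),
    "calculate": Tool("calculate", "Double a number", lambda e: str(float(e) * 2)),
}

print(resolve_tool(' "Calculate". ', tools, default="search_web").name)
print(tools["calculate"].execute("21"))
print(tools["calculate"].execute("not a number"))
```

Normalizing the model's reply before the lookup avoids falling back to web search just because the tool name came back quoted or capitalized.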
Iterative Refinement
Self-Critique Loop
Definition: Self-Critique
Self-critique evaluates the current answer against the query and retrieved context, identifies gaps or weaknesses, and triggers additional retrieval or reasoning to address them.
def iterative_refinement(query, initial_answer, retriever, llm, max_iterations=3):
    """Iteratively refine an answer through self-critique."""
    current_answer = initial_answer
    for _ in range(max_iterations):
        # Critique the current answer
        critique = critique_answer(query, current_answer, llm)
        if critique["quality"] > 0.9:
            break  # Answer is good enough
        # Identify gaps
        gaps = critique["gaps"]
        if not gaps:
            break  # Nothing specific to retrieve, so another pass would not change the answer
        # Retrieve additional information for gaps
        for gap in gaps:
            additional_docs = retriever.retrieve(gap, top_k=2)
            current_answer = integrate_information(
                current_answer, additional_docs, llm
            )
    return current_answer


def critique_answer(query, answer, llm):
    """Critique an answer and identify gaps."""
    prompt = f"""Critique this answer to the question:
Question: {query}
Answer: {answer}
Evaluate:
1. Is the answer complete? (covers all aspects of the question)
2. Is the answer accurate? (supported by evidence)
3. Are there gaps or missing information?
Quality score (0-1):
Gaps (list specific missing information):"""
    response = llm.generate(prompt)
    # parse_critique must return {"quality": float, "gaps": list of str}
    return parse_critique(response)
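The refinement loop depends on two helpers the text leaves undefined: `parse_critique` and `integrate_information`. The sketch below shows one way they might look. It assumes the model echoes the "Quality score" and "Gaps" labels from the critique prompt, that retrieved documents are dicts with a `"content"` key, and that `llm` exposes `generate(prompt)` as elsewhere in this section. The prompt wording in `integrate_information` is illustrative only.

```python
import re


def parse_critique(response: str) -> dict:
    """Parse a critique into {"quality": float, "gaps": list of str}."""
    # Quality: first number after the "Quality score" label. Default to 0.0 so a
    # malformed critique is never mistaken for a high-quality answer.
    match = re.search(r"quality score[^:\n]*:\s*([0-9]*\.?[0-9]+)", response, re.IGNORECASE)
    quality = float(match.group(1)) if match else 0.0
    quality = min(max(quality, 0.0), 1.0)  # clamp to the 0-1 range

    # Gaps: every non-empty line after the "Gaps" label, with list markers removed.
    gaps = []
    parts = re.split(r"^\s*gaps[^:\n]*:", response, maxsplit=1,
                     flags=re.IGNORECASE | re.MULTILINE)
    if len(parts) == 2:
        for line in parts[1].splitlines():
            cleaned = re.sub(r"^\s*(?:\d+[.)]|[-*])\s+", "", line).strip()
            if cleaned and cleaned.lower() not in {"none", "n/a"}:
                gaps.append(cleaned)
    return {"quality": quality, "gaps": gaps}


def integrate_information(current_answer: str, additional_docs: list, llm) -> str:
    """Ask the LLM to revise the answer using newly retrieved documents."""
    if not additional_docs:
        return current_answer  # nothing new to integrate
    evidence = "\n".join(f"- {doc['content']}" for doc in additional_docs)
    prompt = (
        "Revise the answer below so that it incorporates the new evidence. "
        "Keep claims that are already supported.\n"
        f"Current answer: {current_answer}\n"
        f"New evidence:\n{evidence}\n"
        "Revised answer:"
    )
    return llm.generate(prompt).strip()


sample = """Quality score (0-1): 0.6
Gaps (list specific missing information):
- Battery recycling rates
- Hydrogen production emissions"""
print(parse_critique(sample))
```

Free-text parsing like this is brittle. Asking the model for JSON and validating it is a common alternative when the model follows format instructions reliably.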
Multi-Source Synthesis
Combining Information from Multiple Retrievals
def synthesize_multi_source(query, sources, llm):
    """Synthesize information from multiple sources into a coherent answer."""
    # Summarize each source with respect to the query
    source_summaries = []
    for i, source in enumerate(sources):
        summary = llm.generate(f"Summarize the following information relevant to '{query}':\n{source['content']}")
        source_summaries.append(f"Source {i+1} ({source['type']}): {summary}")

    # Combine the per-source summaries into one synthesis prompt
    prompt = f"""Synthesize the following information from multiple sources into a comprehensive answer.
Question: {query}
Sources:
{chr(10).join(source_summaries)}
Provide a comprehensive, well-structured answer that integrates information from all sources:"""
    answer = llm.generate(prompt)
    return answer
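Because `synthesize_multi_source` makes one LLM call per source, repeated passages returned by overlapping sub-queries add cost without adding information. A small pre-processing sketch is shown here, assuming sources are dicts with `"type"` and `"content"` keys as in the function above. `dedupe_sources` is a hypothetical helper, and it only catches exact matches after normalizing case and whitespace.

```python
def dedupe_sources(sources: list[dict]) -> list[dict]:
    """Drop sources whose normalized content was already seen, keeping first occurrences."""
    seen = set()
    unique = []
    for source in sources:
        # Lowercase and collapse runs of whitespace before comparing.
        key = " ".join(source["content"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(source)
    return unique


sources = [
    {"type": "document", "content": "Passage A about battery manufacturing."},
    {"type": "tool_output", "content": "passage A  about battery manufacturing."},
    {"type": "document", "content": "Passage B about fuel cell disposal."},
]
print(len(dedupe_sources(sources)))  # prints 2
```

Near-duplicates with different wording would need a similarity measure, which is beyond this sketch.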
Agentic RAG vs Standard RAG
| Feature | Standard RAG | Agentic RAG |
|---|---|---|
| Query handling | Single query | Decomposed sub-queries |
| Retrieval | Single step | Multi-step, iterative |
| Tools | Retriever only | Search, databases, APIs, calculators |
| Reasoning | None | Explicit reasoning steps |
| Self-critique | None | Evaluates and refines |
| Complexity | Simple | Complex research tasks |
| Latency | Low (on the order of 200 ms) | High (on the order of 2-10 s) |
| Best for | Simple factual questions | Complex analytical questions |
Agentic RAG is most valuable for complex research tasks that require synthesizing information from multiple sources, performing calculations, or reasoning through multi-step problems. For simple factual questions, standard RAG is more efficient.
Practice Exercises
- Query Decomposition: Implement a query decomposition system that breaks complex questions into 3-5 sub-queries. Evaluate whether the decomposed queries cover all aspects of the original question.
- Tool Selection: Build a tool registry with web search, database lookup, calculator, and definition lookup. Implement tool selection logic and evaluate accuracy.
- Iterative Refinement: Implement a self-critique loop that identifies gaps in answers and triggers additional retrieval. How many iterations are typically needed?
- Multi-Source Synthesis: Test your agentic RAG system on questions requiring information from 3+ sources. Compare the quality to single-retrieval RAG.
Key Takeaways
Summary: Agentic RAG Systems
- Query decomposition breaks complex questions into searchable sub-queries
- Multi-step retrieval performs iterative search and reasoning
- Tool use leverages search engines, databases, calculators, and APIs
- Self-critique evaluates answers and identifies gaps
- Iterative refinement progressively improves answer quality
- Multi-source synthesis combines information from diverse sources
- Planning enables strategic research rather than blind search
- Tradeoff: higher quality at the cost of higher latency
What to Learn Next
-> LLM Agent Frameworks: Building autonomous agents with LLMs.
-> Tool Use and Function Calling: Teaching LLMs to use external tools.
-> Self-RAG and Adaptive Retrieval: When to retrieve and when to rely on knowledge.
-> RAG System Design: Advanced RAG architecture and design patterns.
-> Multi-Agent Systems: Coordinating multiple agents for complex tasks.
-> Planning and Reasoning in Agents: How agents plan and execute multi-step tasks.