LLM Agent Frameworks — Building Autonomous AI Agents

LLM agents extend language models beyond text generation by enabling tool use, multi-step planning, and persistent memory.

ReAct Pattern — Interleaves reasoning traces with actions in an observe-think-act-reflect loop
Tool Integration — Function calling and Toolformer enable structured interaction with external systems
Memory Systems — Short-term working memory and long-term persistent memory across sessions

"Start with simple ReAct patterns and add complexity incrementally — overly complex architectures are hard to debug."

LLM Agent Frameworks

LLM agents extend language models beyond text generation by enabling them to interact with external tools, plan multi-step tasks, and maintain memory across interactions. This tutorial covers the foundations and practical implementation of agentic systems.

What are LLM Agents?

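An LLM agent runs a language model inside a control loop: the model decides what to do, the surrounding code carries it out, and the result is fed back in. The agent loop consists of four steps: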

Observe: Receive input from the environment
Think: Reason about the current state and goals
Act: Take an action using available tools
Reflect: Evaluate the outcome and update internal state
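
The four steps above can be sketched as a plain Python loop. This is a minimal illustration rather than a full agent: `run_agent_loop`, `observe`, `think`, and `add` are hypothetical names, and the "think" step is a hard-coded rule standing in for an LLM call.

```python
from typing import Callable, Dict, List


def run_agent_loop(
    observe: Callable[[], str],
    think: Callable[[str, List[Dict]], Dict],
    tools: Dict[str, Callable[..., str]],
    max_steps: int = 5,
) -> List[Dict]:
    """Run observe -> think -> act -> reflect until `think` asks to stop."""
    state: List[Dict] = []  # internal state, updated in the reflect step
    for _ in range(max_steps):
        observation = observe()                                # Observe
        decision = think(observation, state)                   # Think
        if decision["tool"] is None:                           # nothing left to do
            break
        result = tools[decision["tool"]](**decision["args"])   # Act
        state.append({                                         # Reflect: record the outcome
            "observation": observation,
            "tool": decision["tool"],
            "result": result,
        })
    return state


# Hypothetical stand-ins: a fixed input and a rule-based "think" step
def observe() -> str:
    return "What is 2 + 3?"


def think(observation: str, state: List[Dict]) -> Dict:
    if state:  # a result has already been recorded, so stop
        return {"tool": None, "args": {}}
    return {"tool": "add", "args": {"a": 2, "b": 3}}


def add(a: int, b: int) -> str:
    return str(a + b)


trace = run_agent_loop(observe, think, {"add": add})
print(trace[0]["result"])  # prints 5
```

In a real agent the `think` callable would build a prompt from the observation and state and parse the model's reply.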

ReAct (Reasoning + Acting)

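ReAct interleaves reasoning traces with actions: at each step the model writes a thought, names a tool and its input, and receives the tool's output as an observation before the next step. The class below assumes `llm` exposes a `generate(prompt)` method that returns text: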

from typing import List, Dict, Callable, Optional
import json

class ReActAgent:
    def __init__(
        self,
        llm,
        tools: Dict[str, Callable],
        max_steps: int = 10,
        verbose: bool = True
    ):
        self.llm = llm
        self.tools = tools
        self.max_steps = max_steps
        self.verbose = verbose
    
    def run(self, task: str) -> str:
        """Execute task using ReAct framework."""
        history = []
        current_state = f"Task: {task}\n\n"
        
        for step in range(self.max_steps):
            # Generate next action
            prompt = self._build_prompt(current_state, history)
            response = self.llm.generate(prompt)
            
            # Parse action from response
            action = self._parse_action(response)
            
            if action["type"] == "finish":
                return action["output"]
            
            # Execute action
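            if action["tool"] in self.tools:
                try:
                    observation = self.tools[action["tool"]](**action["args"])
                except Exception as exc:
                    # Feed tool failures back to the model instead of crashing the loop
                    observation = f"Error: {exc}"
            else:
                observation = f"Error: Unknown tool '{action['tool']}'"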
            
            # Update history
            history.append({
                "thought": action.get("thought", ""),
                "action": action["tool"],
                "action_input": action["args"],
                "observation": observation
            })
            
            if self.verbose:
                print(f"Step {step + 1}:")
                print(f"  Thought: {action.get('thought', 'N/A')}")
                print(f"  Action: {action['tool']}")
                print(f"  Observation: {observation}\n")
            
            # Observations reach the next prompt through `history`; appending them
            # to current_state as well would repeat each one in the prompt.
        
        return "Max steps reached without completion"
    
    def _build_prompt(self, state: str, history: List[Dict]) -> str:
        tools_desc = "\n".join([
            f"- {name}: {func.__doc__ or 'No description'}"
            for name, func in self.tools.items()
        ])
        
        history_text = ""
        for h in history:
            history_text += f"Thought: {h['thought']}\n"
            history_text += f"Action: {h['action']}\n"
            history_text += f"Action Input: {json.dumps(h['action_input'])}\n"
            history_text += f"Observation: {h['observation']}\n\n"
        
        return f"""You are a helpful assistant that solves tasks step by step.

Available tools:
{tools_desc}

{state}

{history_text}What should you do next? Use this format:

Thought: [your reasoning about what to do next]
Action: [tool name]
Action Input: [JSON arguments for the tool]

Or if you have the final answer:
Thought: [final reasoning]
Final Answer: [your answer to the task]"""
    
    def _parse_action(self, response: str) -> Dict:
        """Parse action from LLM response."""
        lines = response.strip().split("\n")
        # Defaults keep run() from raising KeyError when the reply omits a field
        result = {"type": "action", "tool": "", "args": {}}
        
        for line in lines:
            if line.startswith("Thought:"):
                result["thought"] = line[len("Thought:"):].strip()
            elif line.startswith("Action:"):
                result["tool"] = line[len("Action:"):].strip()
            elif line.startswith("Action Input:"):
                try:
                    result["args"] = json.loads(line[len("Action Input:"):].strip())
                except json.JSONDecodeError:
                    result["args"] = {"input": line[len("Action Input:"):].strip()}
            elif line.startswith("Final Answer:"):
                result["type"] = "finish"
                result["output"] = line[len("Final Answer:"):].strip()
        
        return result
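
The line-by-line parser above keeps only the first line of a final answer. As an alternative sketch, a regex-based parser can capture multi-line answers and always return the same keys. `parse_react_response` is a hypothetical helper, and it assumes the reply follows the Thought / Action / Action Input / Final Answer format requested in the prompt.

```python
import json
import re
from typing import Dict


def parse_react_response(response: str) -> Dict:
    """Parse one Thought/Action/Action Input or Thought/Final Answer reply."""
    result: Dict = {"type": "action", "thought": "", "tool": "", "args": {}}

    thought = re.search(r"^Thought:[ \t]*(.*)$", response, re.MULTILINE)
    if thought:
        result["thought"] = thought.group(1).strip()

    # A final answer may span several lines, so capture through the end of the text
    final = re.search(r"^Final Answer:\s*(.*)", response, re.MULTILINE | re.DOTALL)
    if final:
        result["type"] = "finish"
        result["output"] = final.group(1).strip()
        return result

    tool = re.search(r"^Action:[ \t]*(.*)$", response, re.MULTILINE)
    if tool:
        result["tool"] = tool.group(1).strip()

    raw_args = re.search(r"^Action Input:[ \t]*(.*)$", response, re.MULTILINE)
    if raw_args:
        text = raw_args.group(1).strip()
        try:
            parsed = json.loads(text)
            # Tools are called with keyword arguments, so wrap non-dict JSON values
            result["args"] = parsed if isinstance(parsed, dict) else {"input": parsed}
        except json.JSONDecodeError:
            result["args"] = {"input": text}
    return result


step = parse_react_response(
    'Thought: need math\nAction: calculate\nAction Input: {"expression": "2+2"}'
)
done = parse_react_response("Thought: done\nFinal Answer: line one\nline two")
print(step["tool"], step["args"])
print(done["output"])
```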

Function Calling

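Many chat-model APIs support structured function calling: the model receives JSON Schema descriptions of the available functions and replies with a function name and JSON-encoded arguments instead of free text. The class below assumes `llm.chat(messages, functions=...)` returns a message dict with an optional `function_call` entry: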

import json
from typing import Any, Dict, List

class FunctionCallingAgent:
    def __init__(self, llm, functions: List[Dict[str, Any]]):
        self.llm = llm
        self.functions = {f["name"]: f for f in functions}
    
    def run(self, task: str, max_iterations: int = 5) -> str:
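        messages = [{"role": "user", "content": task}]
        # Send only the JSON-serializable schema; the local callables stay out of the request
        schemas = [
            {k: v for k, v in f.items() if k != "function"}
            for f in self.functions.values()
        ]
        
        for _ in range(max_iterations):
            response = self.llm.chat(messages, functions=schemas)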
            
            if response.get("function_call"):
                func_name = response["function_call"]["name"]
                func_args = json.loads(response["function_call"]["arguments"])
                
                # Execute function
                result = self._execute_function(func_name, func_args)
                
                messages.append(response)
                messages.append({
                    "role": "function",
                    "name": func_name,
                    "content": json.dumps(result)
                })
            else:
                return response["content"]
        
        return "Max iterations reached"
    
    def _execute_function(self, name: str, args: Dict) -> Any:
        if name in self.functions:
            func = self.functions[name]["function"]
            return func(**args)
        return {"error": f"Unknown function: {name}"}

# Define functions for the agent
functions = [
    {
        "name": "search_web",
        "description": "Search the web for information",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search query"}
            },
            "required": ["query"]
        },
        "function": lambda query: {"results": [f"Result for: {query}"]}
    },
    {
        "name": "calculate",
        "description": "Perform mathematical calculation",
        "parameters": {
            "type": "object",
            "properties": {
                "expression": {"type": "string", "description": "Math expression"}
            },
            "required": ["expression"]
        },
        # WARNING: eval on model-generated input can run arbitrary code; demo only
        "function": lambda expression: {"result": eval(expression)}
    }
]
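
Because the `calculate` tool passes model-generated text to `eval`, it can execute arbitrary code. One possible replacement, sketched here under the assumption that only basic arithmetic is needed, walks the parsed syntax tree and rejects anything that is not a number or one of four operators. `safe_calculate` is a hypothetical name, not part of any library.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelisted operators; exponentiation is left out to avoid huge computations
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> Number:
    """Evaluate +, -, *, / on numeric literals and reject everything else."""

    def _eval(node: ast.AST) -> Number:
        # Numeric literals only (bool is excluded because type() must match exactly)
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("Unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(safe_calculate("2 + 3 * 4"))  # prints 14
```

Names, attribute access, and function calls such as `__import__('os')` all raise `ValueError`, so the tool can return an error observation instead of executing them.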

Toolformer

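Toolformer fine-tunes a language model on text augmented with self-generated tool calls, keeping only the calls whose results make the following tokens easier to predict. The sketch below shows the data-preparation step; underscore-prefixed helpers that are not defined here are placeholders: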

class ToolformerTrainer:
    def __init__(self, base_model, tokenizer):
        self.model = base_model
        self.tokenizer = tokenizer
    
    def prepare_training_data(self, texts: List[str], tools: List[Dict]):
        """Insert tool calls into training data."""
        training_examples = []
        
        for text in texts:
            # Identify potential tool call positions
            positions = self._find_tool_positions(text)
            
            for pos in positions:
                # Generate tool call
                tool_call = self._generate_tool_call(
                    text[:pos], text[pos:], tools
                )
                
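                if tool_call and self._is_useful(tool_call, text[:pos], text[pos:]):
                    # Create training example with tool call
                    example = self._insert_tool_call(text, pos, tool_call)
                    training_examples.append(example)
        
        return training_examples
    
    def _is_useful(self, tool_call: Dict, prefix: str, continuation: str,
                   threshold: float = 0.0) -> bool:
        """Keep a call only if its result makes the continuation easier to predict."""
        # Loss on the continuation given the plain prefix (no tool call).
        # _continuation_loss is a placeholder helper, like the other _ methods.
        loss_without = self._continuation_loss(prefix, continuation)
        
        # Loss on the same continuation when the call and its result precede it
        augmented = prefix + tool_call["call"] + tool_call["result"]
        loss_with = self._continuation_loss(augmented, continuation)
        
        # Comparing the same continuation tokens avoids rewarding mere extra text
        return loss_without - loss_with > threshold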
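
The usefulness check can be made concrete with toy numbers. The sketch below follows the filtering rule described in the Toolformer paper: compare the weighted loss on the tokens after the call position with the call and its result, against the better of "no call" and "call without result", and keep the call only if the gain reaches a threshold. The function names, weights, log-probabilities, and threshold are illustrative assumptions.

```python
from typing import Sequence


def weighted_loss(token_logprobs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted negative log-likelihood of the tokens after the call position."""
    return -sum(w * lp for w, lp in zip(weights, token_logprobs))


def keep_tool_call(
    loss_no_call: float,
    loss_call_only: float,
    loss_call_with_result: float,
    threshold: float = 1.0,
) -> bool:
    """Keep the call if its result lowers the loss by at least `threshold`."""
    baseline = min(loss_no_call, loss_call_only)
    return baseline - loss_call_with_result >= threshold


# Toy log-probabilities for three continuation tokens under each condition
weights = [1.0, 0.5, 0.25]  # nearer tokens count more (illustrative values)
loss_none = weighted_loss([-2.0, -1.0, -1.0], weights)    # about 2.75
loss_call = weighted_loss([-2.0, -1.0, -0.8], weights)    # about 2.70
loss_result = weighted_loss([-0.5, -1.0, -1.0], weights)  # about 1.25

print(keep_tool_call(loss_none, loss_call, loss_result))  # prints True
```

Raising the threshold makes the filter stricter: with `threshold=2.0` the same call would be discarded.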

Memory Systems

Short-term Memory

Working memory for current task context:

from collections import deque

class ShortTermMemory:
    def __init__(self, max_size: int = 20):
        self.buffer = deque(maxlen=max_size)
        self.summary = ""
    
    def add(self, observation: str, action: str, result: str):
        self.buffer.append({
            "observation": observation,
            "action": action,
            "result": result
        })
    
    def get_context(self) -> str:
        context = self.summary + "\n\nRecent interactions:\n"
        for item in self.buffer:
            # Include the result so later steps can see what each action returned
            context += f"- {item['observation']} -> {item['action']} -> {item['result']}\n"
        return context
    
    def summarize(self, llm):
        """Summarize buffer contents to save context."""
        if len(self.buffer) > 10:
            content = "\n".join([
                f"Obs: {b['observation']}, Act: {b['action']}, Res: {b['result']}"
                for b in self.buffer
            ])
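            # Carry the previous summary forward so earlier context is not lost
            self.summary = llm.generate(
                f"Previous summary:\n{self.summary}\n\n"
                f"Summarize these interactions:\n{content}"
            )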
            self.buffer.clear()
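
One detail of the buffer above is easy to miss: a `deque` created with `maxlen` silently discards its oldest entry once it is full. A small sketch of that behavior, using a buffer of three items for readability:

```python
from collections import deque

# A bounded buffer: appending to a full deque drops the item at the opposite end
buffer = deque(maxlen=3)
for step in range(1, 6):
    buffer.append(f"step {step}")

print(list(buffer))  # prints ['step 3', 'step 4', 'step 5']
```

If older interactions matter, summarization needs to run before the buffer fills, since evicted entries never reach the summary.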

Long-term Memory

Persistent memory across sessions:

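import uuid
from typing import Dict, List, Optional

import chromadb

class LongTermMemory:
    def __init__(self, collection_name: str = "agent_memory",
                 path: str = "./agent_memory_db"):  # path is an example location
        # PersistentClient writes to disk; chromadb.Client() is in-memory and is lost on exit
        self.client = chromadb.PersistentClient(path=path)
        # get_or_create_collection avoids an error when the collection already exists
        self.collection = self.client.get_or_create_collection(collection_name)
    
    def store(self, content: str, metadata: Optional[Dict] = None):
        """Store information in long-term memory."""
        self.collection.add(
            documents=[content],
            # Some Chroma versions reject empty metadata dicts, so pass None instead
            metadatas=[metadata] if metadata else None,
            # Random IDs stay unique after deletions, unlike count()-based IDs
            ids=[f"mem_{uuid.uuid4().hex}"]
        )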
    
    def retrieve(self, query: str, k: int = 5) -> List[str]:
        """Retrieve relevant memories."""
        results = self.collection.query(
            query_texts=[query],
            n_results=k
        )
        return results["documents"][0] if results["documents"] else []
    
    def search_and_retrieve(self, context: str, query: str) -> str:
        """Search memory and format for context."""
        memories = self.retrieve(f"{context} {query}")
        return "\n".join([f"- {m}" for m in memories])
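
To show what similarity-based retrieval does without installing a vector database, here is a pure-Python stand-in. It ranks memories by cosine similarity over word counts, which is a much cruder signal than the learned embeddings a vector store uses; the helper names and sample memories are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def _vector(text: str) -> Counter:
    """Bag-of-words vector: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing words
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(memories: List[str], query: str, k: int = 2) -> List[str]:
    """Return the k stored memories most similar to the query."""
    q = _vector(query)
    ranked = sorted(memories, key=lambda m: _cosine(_vector(m), q), reverse=True)
    return ranked[:k]


memories = [
    "The user prefers metric units",
    "The user lives in Boston",
    "Last task: summarize a paper on mixture of experts",
]
top = retrieve(memories, "Which city does the user live in?", k=1)
print(top)  # prints ['The user lives in Boston']
```

Word overlap misses paraphrases ("live" and "lives" do not match here), which is the main reason real systems use embedding models.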

Agent Frameworks

LangChain

from langchain.agents import initialize_agent, Tool
from langchain.llms import OpenAI

def calculator(expression: str) -> str:
    """Calculate mathematical expressions."""
    return str(eval(expression))

def search(query: str) -> str:
    """Search the web."""
    return f"Search results for: {query}"

tools = [
    Tool(name="Calculator", func=calculator, description="Calculate math"),
    Tool(name="Search", func=search, description="Search the web")
]

llm = OpenAI(temperature=0)
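# "zero-shot-react-description" is the ReAct agent type in the legacy initialize_agent API
agent = initialize_agent(tools, llm, agent="zero-shot-react-description", verbose=True)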
result = agent.run("What is 2 + 2 and what is the capital of France?")

LlamaIndex

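# Import paths follow the pre-0.10 package layout; newer releases move these
# under llama_index.core and llama_index.llms.openai
from llama_index.agent import ReActAgent
from llama_index.tools import FunctionTool
from llama_index.llms import OpenAI

def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Weather in {city}: 72°F, sunny"

# FunctionTool wraps the callable; ToolMetadata only describes a tool and takes no fn argument
tools = [
    FunctionTool.from_defaults(
        fn=get_weather,
        name="get_weather",
        description="Get current weather for a city"
    )
]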

llm = OpenAI(model="gpt-4")
agent = ReActAgent.from_tools(tools, llm=llm, verbose=True)
response = agent.chat("What's the weather in Boston?")

CrewAI

from crewai import Agent, Task, Crew
from crewai_tools import SerperDevTool

search_tool = SerperDevTool()

researcher = Agent(
    role="Researcher",
    goal="Find accurate information",
    backstory="Expert researcher with attention to detail",
    tools=[search_tool],
    verbose=True
)

writer = Agent(
    role="Writer",
    goal="Write compelling content",
    backstory="Experienced writer with clear communication",
    verbose=True
)

research_task = Task(
    description="Research the latest AI developments",
    agent=researcher,
    expected_output="Comprehensive research summary"
)

writing_task = Task(
    description="Write a blog post based on research",
    agent=writer,
    expected_output="1000-word blog post"
)

crew = Crew(
    agents=[researcher, writer],
    tasks=[research_task, writing_task],
    verbose=True
)

result = crew.kickoff()

Practical Agent Implementation

class AutonomousAgent:
    def __init__(self, llm, tools: List[Callable], memory: ShortTermMemory):
        self.llm = llm
        self.tools = {t.__name__: t for t in tools}
        self.memory = memory
        self.goals = []
        self.plan = []
    
    def set_goal(self, goal: str):
        self.goals.append(goal)
        self.plan = self._create_plan(goal)
    
    def _create_plan(self, goal: str) -> List[str]:
        """Create multi-step plan for achieving goal."""
        prompt = f"""Create a step-by-step plan to achieve this goal:
{goal}

Available tools: {list(self.tools.keys())}

Plan (list steps as numbered items):"""
        
        response = self.llm.generate(prompt)
        # Keep every line that starts with a step number such as "1." or "12."
        steps = [
            s.strip() for s in response.split("\n")
            if s.strip().split(".", 1)[0].isdigit()
        ]
        return steps
    
    def execute_step(self) -> bool:
        """Execute next step in plan."""
        if not self.plan:
            return False
        
        current_step = self.plan[0]
        context = self.memory.get_context()
        
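        prompt = f"""Current task: {current_step}

Context from previous steps:
{context}

Available tools: {list(self.tools.keys())}

Reply with a single JSON object of the form {{"tool": "<tool name>", "args": {{...}}}}."""
        
        response = self.llm.generate(prompt)
        action = self._parse_action(response)
        
        if action["tool"] in self.tools:
            result = self.tools[action["tool"]](**action["args"])
            self.memory.add(current_step, action["tool"], str(result))
            self.plan.pop(0)
            return True
        
        # Record the failure so the retry sees it in context; the step stays in the plan
        self.memory.add(current_step, action["tool"], "Error: unknown tool")
        return False
    
    def _parse_action(self, response: str) -> Dict:
        """Parse the JSON reply; fall back to an empty action if it is malformed."""
        try:
            parsed = json.loads(response)
        except json.JSONDecodeError:
            return {"tool": "", "args": {}}
        if not isinstance(parsed, dict):
            return {"tool": "", "args": {}}
        args = parsed.get("args")
        return {"tool": str(parsed.get("tool", "")),
                "args": args if isinstance(args, dict) else {}}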
    
    def run(self, max_steps: int = 20):
        """Execute full plan."""
        for _ in range(max_steps):
            if not self.plan:
                break
            self.execute_step()
        
        return self.memory.get_context()
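
Plan parsing is the most fragile part of the agent above, because the model may add preamble text or use a different numbering style. A slightly more tolerant parser can be sketched with a regular expression. `parse_plan` is a hypothetical helper; unlike `_create_plan`, it strips the step numbers and also accepts `1)` style markers.

```python
import re
from typing import List


def parse_plan(response: str) -> List[str]:
    """Extract numbered steps such as '1. ...' or '2) ...' from an LLM reply."""
    steps = []
    for line in response.splitlines():
        # Optional indent, a step number, '.' or ')', then the step text
        match = re.match(r"\s*(\d+)[.)]\s+(.*\S)", line)
        if match:
            steps.append(match.group(2))
    return steps


response = (
    "Here is the plan:\n"
    "1. Search for the population of France\n"
    "2) Search for the population of Spain\n"
    "10. Add the two numbers\n"
    "- not a numbered step"
)
plan = parse_plan(response)
print(plan)
```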

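Summary

LLM agents combine a language model with tools, planning, and memory. ReAct alternates free-text reasoning with tool calls, function calling makes those calls structured, and Toolformer trains the model to insert them itself. Short-term buffers keep the current task in context, while a persistent store carries knowledge across sessions. Frameworks such as LangChain, LlamaIndex, and CrewAI package these pieces; start with a simple ReAct loop and add complexity incrementally.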

Practice Exercises

ReAct Agent: Implement a ReAct agent with calculator and search tools. Test it on multi-step reasoning tasks.
Function Calling: Build a function-calling agent with 5 tools. Compare its performance with ReAct.
Memory System: Implement both short-term and long-term memory. Test how memory affects agent performance on long tasks.
Multi-Agent: Create a two-agent system where one agent researches and another writes. Compare with single-agent performance.
Tool Learning: Implement a simplified Toolformer that learns to insert tool calls. Measure perplexity improvement.

What to Learn Next

-> Building Production LLM Applications: Scaling agent frameworks to production with monitoring and reliability.

-> RAG System Design: RAG is a foundational pattern for agent knowledge retrieval.

-> Retrieval Augmented Generation: The core retrieval techniques that power agent memory systems.

-> Prompt Engineering: Effective prompts are essential for reliable agent reasoning and action.

-> In-Context Learning: Agents rely on in-context learning for few-shot task adaptation.

-> Chain-of-Thought Reasoning: CoT reasoning underpins the thinking step in agent loops.
