LLM Agents

Tool Use and Function Calling — LLMs That Act in the World

Tool use transforms LLMs from text generators into action-taking agents. When developers describe functions with schemas, an LLM can request API calls, database queries, robot commands, and interactions with other external systems.

Function Schemas — Define tools with structured descriptions
Parameter Extraction — LLMs extract parameters from natural language
Error Handling — Graceful recovery from tool execution failures

An LLM without tools is a brain without hands.

Tool Use and Function Calling

Tool use enables LLMs to go beyond text generation by interacting with external systems. The LLM receives a description of available tools, decides when and how to use them, and interprets the results to continue its reasoning.

Definition: Tool Use

Tool use (or function calling) is the ability of an LLM to generate structured function calls that invoke external tools (APIs, databases, calculators, etc.). The LLM analyzes the user's request, determines which tool is appropriate, extracts the necessary parameters, and generates a function call in the expected format. The model itself does not run anything: the application executes the call and returns the result to the model.
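
To make the definition concrete, the sketch below shows what a generated function call might look like and how an application could dispatch it. The payload is hand-written to stand in for real model output, and `calculator_add`, `FUNCTIONS`, and `dispatch` are hypothetical names introduced only for illustration.

```python
import json

# Hypothetical tool implementation; a real one might call an external API.
def calculator_add(a: float, b: float) -> float:
    return a + b

# Registry mapping tool names to Python callables.
FUNCTIONS = {"calculator_add": calculator_add}

# What the model emits: a tool name plus a JSON-encoded argument string.
# This payload is hand-written here to stand in for real model output.
tool_call = {
    "name": "calculator_add",
    "arguments": '{"a": 2, "b": 40}',
}

def dispatch(call: dict) -> dict:
    """Look up the named tool, decode its arguments, and run it."""
    func = FUNCTIONS.get(call["name"])
    if func is None:
        return {"error": f"Unknown function: {call['name']}"}
    try:
        args = json.loads(call["arguments"])
    except json.JSONDecodeError as exc:
        return {"error": f"Invalid JSON arguments: {exc}"}
    return {"result": func(**args)}

print(dispatch(tool_call))  # {'result': 42}
```

The arguments arrive as a JSON string rather than a dict, so decoding them (and handling malformed JSON) is the application's job.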

Function Schema Design

OpenAI Function Calling Format

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g., 'San Francisco, CA'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search a product database",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "food"],
                        "description": "Product category"
                    },
                    "max_price": {
                        "type": "number",
                        "description": "Maximum price filter"
                    }
                },
                "required": ["query"]
            }
        }
    }
]
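
Because the model fills in arguments from natural language, it can help to check them against the schema before running the tool. The following is a minimal sketch of such a check, not a full JSON Schema validator: it covers only `required`, basic `type`, and `enum`. The names `WEATHER_PARAMS`, `JSON_TYPES`, and `validate_arguments` are assumptions made for this example.

```python
from typing import Any

# Parameter schema copied from the get_weather tool above.
WEATHER_PARAMS = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}

# Maps JSON Schema type names to Python types (only the subset used here).
JSON_TYPES = {"string": str, "number": (int, float), "boolean": bool}

def validate_arguments(schema: dict, args: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the arguments pass."""
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required parameter: {name}")
    for name, value in args.items():
        if name not in props:
            errors.append(f"unexpected parameter: {name}")
            continue
        spec = props[name]
        expected = JSON_TYPES.get(spec.get("type"))
        if expected and not isinstance(value, expected):
            errors.append(f"{name}: expected {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name}: must be one of {spec['enum']}")
    return errors

print(validate_arguments(WEATHER_PARAMS, {"location": "Paris", "unit": "celsius"}))  # []
print(validate_arguments(WEATHER_PARAMS, {"unit": "kelvin"}))
```

Returning the problems as strings makes them easy to send back to the model as a tool error, so it can correct the call.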

Tool Description Best Practices

Good tool descriptions are critical for LLM tool selection. Include: (1) what the tool does, (2) when to use it, (3) what parameters mean, (4) example inputs and outputs.

# Good description
{
    "name": "get_stock_price",
    "description": "Retrieves the current stock price and basic metrics for a given ticker symbol. Use this when the user asks about stock prices, market value, or financial metrics of a company. Returns current price, daily change, volume, and market cap.",
    "parameters": {
        "type": "object",
        "properties": {
            "ticker": {
                "type": "string",
                "description": "Stock ticker symbol (e.g., 'AAPL' for Apple, 'GOOGL' for Alphabet). Must be a valid US exchange ticker."
            }
        },
        "required": ["ticker"]
    }
}

# Bad description (too vague)
{
    "name": "get_data",
    "description": "Gets some data",
    "parameters": {
        "input": {"type": "string"}
    }
}

Function Calling Implementation

import json
from openai import OpenAI

class ToolExecutor:
    def __init__(self, tools, functions, client=None):
        self.tools = tools
        self.functions = functions  # name -> callable mapping
        # OpenAI() reads the API key from the OPENAI_API_KEY environment variable
        self.client = client or OpenAI()

    def execute(self, function_name, arguments):
        """Execute a function call and wrap the outcome in a dict."""
        if function_name not in self.functions:
            return {"error": f"Unknown function: {function_name}"}

        try:
            result = self.functions[function_name](**arguments)
            return {"result": result}
        except Exception as e:
            # Return the error instead of raising so the model can react to it
            return {"error": str(e)}

    def run_with_tools(self, user_message, model="gpt-4", max_steps=10):
        """Run a conversation with tool use."""
        messages = [{"role": "user", "content": user_message}]

        for _ in range(max_steps):  # Bound the loop so tool use cannot run forever
            response = self.client.chat.completions.create(
                model=model,
                messages=messages,
                tools=self.tools,
                tool_choice="auto"
            )

            message = response.choices[0].message

            # No tool calls means the model generated a final response
            if not message.tool_calls:
                return message.content

            # Add assistant message with tool calls
            messages.append(message)

            # Execute each tool call
            for tool_call in message.tool_calls:
                function_name = tool_call.function.name
                try:
                    arguments = json.loads(tool_call.function.arguments)
                    result = self.execute(function_name, arguments)
                except json.JSONDecodeError as e:
                    # The model can emit malformed JSON; report it as a tool error
                    result = {"error": f"Invalid JSON arguments: {e}"}

                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": json.dumps(result)
                })

        return "Maximum tool use steps reached."

Multi-Step Tool Use

Chain of Tool Calls

Definition: Tool Chaining

Tool chaining involves calling multiple tools in sequence, where the output of one tool feeds into the next. This enables complex workflows that no single tool can accomplish alone.

import json

def multi_step_tool_use(user_query, llm, tools, max_steps=5):
    """Execute a multi-step tool use workflow.

    Assumes `llm.chat` returns a message with `content` and `tool_calls`, and
    that `execute_tool` (defined elsewhere) runs one call and returns its result.
    """
    messages = [{"role": "user", "content": user_query}]
    tool_history = []

    for step in range(max_steps):  # Cap the number of tool use steps
        response = llm.chat(messages, tools=tools)

        if not response.tool_calls:
            return response.content, tool_history

        messages.append(response)

        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            tool_history.append({
                "step": step,
                "tool": tool_call.function.name,
                "input": tool_call.function.arguments,
                "output": result
            })
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })

    return "Maximum tool use steps reached.", tool_history
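
A chain like this can be exercised without a real model. The sketch below replays a fixed two-step plan in which the second tool consumes the first tool's output. Everything here is hypothetical scaffolding for illustration: `ScriptedLLM` stands in for a model, and `find_product`, `get_price`, `make_call`, and `run_chain` are made-up names with made-up data.

```python
import json
from types import SimpleNamespace

# Hypothetical tools: the second consumes the first one's output.
def find_product(name):
    return {"product_id": "p-17", "name": name}

def get_price(product_id):
    return {"product_id": product_id, "price": 19.99}

TOOLS = {"find_product": find_product, "get_price": get_price}

def make_call(call_id, name, arguments):
    """Build an object shaped like an OpenAI-style tool call."""
    fn = SimpleNamespace(name=name, arguments=json.dumps(arguments))
    return SimpleNamespace(id=call_id, function=fn)

class ScriptedLLM:
    """Stand-in for a real model: replays a fixed plan, reading prior tool output."""
    def chat(self, messages, tools=None):
        tool_msgs = [m for m in messages if isinstance(m, dict) and m.get("role") == "tool"]
        if len(tool_msgs) == 0:
            call = make_call("c1", "find_product", {"name": "usb cable"})
            return SimpleNamespace(content=None, tool_calls=[call])
        if len(tool_msgs) == 1:
            # Feed the first tool's output into the second tool's input.
            product = json.loads(tool_msgs[0]["content"])
            call = make_call("c2", "get_price", {"product_id": product["product_id"]})
            return SimpleNamespace(content=None, tool_calls=[call])
        price = json.loads(tool_msgs[1]["content"])["price"]
        return SimpleNamespace(content=f"The price is {price}.", tool_calls=None)

def run_chain(query, llm, max_steps=5):
    """Loop until the model stops requesting tools or the step cap is hit."""
    messages = [{"role": "user", "content": query}]
    for _ in range(max_steps):
        response = llm.chat(messages)
        if not response.tool_calls:
            return response.content
        messages.append(response)
        for call in response.tool_calls:
            result = TOOLS[call.function.name](**json.loads(call.function.arguments))
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": json.dumps(result)})
    return "Maximum tool use steps reached."

print(run_chain("How much is a usb cable?", ScriptedLLM()))  # The price is 19.99.
```

A scripted stand-in like this is also a convenient way to unit test the loop logic, including the step cap, without network calls.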

Parallel Tool Calls

Definition: Parallel Tool Calls

When several tool calls are independent of one another, the LLM can generate them all in one response, and the application can execute them concurrently instead of one after another, which reduces latency.

import asyncio
import json

async def parallel_tool_calls(tool_calls):
    """Execute multiple tool calls in parallel.

    Assumes `execute_tool_async` is a coroutine function defined elsewhere.
    """
    tasks = []
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        tasks.append(execute_tool_async(function_name, arguments))

    # gather() preserves order: results[i] belongs to tool_calls[i]
    results = await asyncio.gather(*tasks)
    return results
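
The latency benefit is easy to observe with a simulated tool. In this sketch, `execute_tool_async` is a hypothetical stand-in that sleeps for 0.2 seconds to mimic a network call, and the `(name, arguments)` tuples are a simplification of real tool call objects. Three calls run concurrently should finish in roughly the time of one.

```python
import asyncio
import json
import time

async def execute_tool_async(function_name, arguments):
    """Hypothetical tool runner: sleeps to simulate network latency."""
    await asyncio.sleep(0.2)
    return {"tool": function_name, "echo": arguments}

async def run_parallel(calls):
    # gather() runs the coroutines concurrently and returns results in input order.
    # return_exceptions=True keeps one failure from discarding the other results.
    return await asyncio.gather(
        *(execute_tool_async(name, json.loads(args)) for name, args in calls),
        return_exceptions=True,
    )

calls = [
    ("search_database", '{"query": "laptop"}'),
    ("search_database", '{"query": "phone"}'),
    ("get_weather", '{"location": "Paris"}'),
]

start = time.perf_counter()
results = asyncio.run(run_parallel(calls))
elapsed = time.perf_counter() - start
print(f"{len(results)} results in {elapsed:.2f}s")
```

With `return_exceptions=True`, a failed call appears in the results as an exception object, so each result should be checked before it is serialized back to the model.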

Error Handling

Retry Strategies

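import json
import time

# Assumes the application defines `functions` (name -> callable) and the
# exception types RateLimitError and InvalidInputError.
def execute_with_retry(tool_call, max_retries=3):
    """Execute a tool call with retry logic."""
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)

    for attempt in range(max_retries):
        try:
            result = functions[function_name](**arguments)
            return {"result": result}
        except RateLimitError:
            # Skip the pointless sleep after the final attempt
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...
        except InvalidInputError as e:
            # Retrying the same bad input cannot succeed, so fail immediately
            return {"error": f"Invalid input: {e}"}
        except Exception as e:
            if attempt == max_retries - 1:
                return {"error": f"Tool execution failed: {e}"}

    return {"error": "Max retries exceeded"}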
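
The backoff behavior can be checked in isolation. The sketch below is a self-contained variant under stated assumptions: `RateLimitError` and `InvalidInputError` are defined locally as hypothetical exception types, `call_with_backoff` is an illustrative name, and the `sleep` parameter is injectable so the demo records delays instead of actually waiting.

```python
import time

class RateLimitError(Exception):
    """Hypothetical transient error: retrying later may succeed."""

class InvalidInputError(Exception):
    """Hypothetical permanent error: retrying the same input cannot succeed."""

def call_with_backoff(func, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Retry transient failures with exponential backoff; fail fast on bad input."""
    for attempt in range(max_retries):
        try:
            return {"result": func()}
        except InvalidInputError as exc:
            return {"error": f"Invalid input: {exc}"}
        except RateLimitError:
            if attempt < max_retries - 1:
                sleep(base_delay * 2 ** attempt)  # delay doubles on each retry
    return {"error": "Max retries exceeded"}

# A flaky tool that is rate limited twice before succeeding.
attempts = {"count": 0}

def flaky_tool():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimitError("slow down")
    return "ok"

delays = []  # Record delays instead of sleeping so the demo runs instantly.
outcome = call_with_backoff(flaky_tool, sleep=delays.append)
print(outcome, delays)  # {'result': 'ok'} [1.0, 2.0]
```

Separating transient errors (worth retrying) from permanent ones (worth reporting immediately) keeps the agent from wasting time on calls that can never succeed.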

Error Feedback to LLM

Definition: Error-Aware Tool Use

When a tool call fails, the error message should be fed back to the LLM so it can adjust its approach — retry with different parameters, try a different tool, or explain the limitation to the user.

import json

def handle_tool_error(error, tool_call, messages, llm, tools=None):
    """Handle tool execution errors by informing the LLM."""
    error_message = {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps({
            "error": str(error),
            "suggestion": "Try rephrasing the query or using a different tool."
        })
    }
    messages.append(error_message)

    # Pass the tools again so the LLM can retry or pick a different tool
    response = llm.chat(messages, tools=tools)
    return response
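
A fixed suggestion string gives the model little to work with. One possible refinement, sketched here, is to tailor the hint to the kind of failure. The mapping from exception types to suggestions is an assumption chosen for illustration, and `build_error_message` is a hypothetical helper name.

```python
import json

def build_error_message(tool_call_id, error):
    """Turn an exception into a tool message the model can act on."""
    if isinstance(error, (TypeError, ValueError, KeyError)):
        suggestion = "Check the parameter names and values, then retry."
    elif isinstance(error, TimeoutError):
        suggestion = "The tool timed out; retry later or use a different tool."
    else:
        suggestion = "Explain the limitation to the user."
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": json.dumps({
            "error": f"{type(error).__name__}: {error}",
            "suggestion": suggestion,
        }),
    }

try:
    int("twelve")  # stands in for a tool that rejects its input
except ValueError as exc:
    message = build_error_message("call_1", exc)

print(message["content"])
```

Including the exception type in the payload helps the model distinguish a bad argument from an unavailable service.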

Practice Exercises

Tool Design: Design function schemas for 5 tools: web search, database query, calculator, unit converter, and email sender. Include parameter descriptions and examples.
Multi-Step Workflow: Implement a tool chain that searches for a product, compares prices across stores, and generates a purchase recommendation.
Error Handling: Test your tool use system with invalid inputs. How does the LLM respond to errors? Does it retry with correct parameters?
Parallel Execution: Implement parallel tool calls for a query that requires searching multiple databases simultaneously.

Key Takeaways

Summary: Tool Use and Function Calling

Function schemas define tools with structured descriptions for LLMs
Parameter extraction enables LLMs to fill in function arguments from natural language
Tool chaining connects multiple tool calls for complex workflows
Parallel tool calls execute independent calls simultaneously
Error handling feeds failures back to the LLM for recovery
Good descriptions are critical for correct tool selection
A maximum step count prevents infinite tool use loops
Tool use transforms LLMs from text generators into action-taking agents

What to Learn Next

-> LLM Agent Frameworks: Building autonomous agents with LLMs.

-> Multi-Agent Systems: Coordinating multiple agents for complex tasks.

-> Agentic RAG Systems: Agent-based approaches to retrieval.

-> Planning and Reasoning in Agents: How agents plan and execute multi-step tasks.

-> Building Production LLM Applications: End-to-end production systems.

-> Prompt Engineering: Getting the most out of language models.
