Tool Use and Function Calling

Tool Use and Function Calling — LLMs That Act in the World

Tool use turns an LLM from a text generator into an agent that can take actions. Once developers describe functions with schemas, the model can request calls to APIs, databases, robots, or any other external system, and the application executes those calls on its behalf.

  • Function Schemas — Define tools with structured descriptions
  • Parameter Extraction — LLMs extract parameters from natural language
  • Error Handling — Graceful recovery from tool execution failures

An LLM without tools is a brain without hands.

Tool Use and Function Calling

Tool use enables LLMs to go beyond text generation by interacting with external systems. The LLM receives a description of available tools, decides when and how to use them, and interprets the results to continue its reasoning.

Definition: Tool Use

Tool use (or function calling) is the ability of an LLM to generate structured function calls that invoke external tools (APIs, databases, calculators, etc.). The LLM analyzes the user's request, determines which tool is appropriate, extracts the necessary parameters, and generates a function call in the expected format.
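
To make the definition concrete, here is a minimal sketch of what a generated call can look like in the OpenAI-style format used later in this lesson, and how an application might unpack it. The payload is hand-written for illustration (the `id` and the argument values are invented), and `parse_tool_call` is a hypothetical helper, not part of any SDK.

```python
import json

# Hand-written example of a tool call as it might appear in a model response.
# The id and argument values are invented for illustration.
raw_tool_call = {
    "id": "call_example_1",
    "type": "function",
    "function": {
        "name": "get_weather",
        # Arguments arrive as a JSON-encoded string, not as a dict.
        "arguments": '{"location": "San Francisco, CA", "unit": "celsius"}',
    },
}


def parse_tool_call(tool_call):
    """Return (function name, arguments dict) from an OpenAI-style tool call dict."""
    name = tool_call["function"]["name"]
    arguments = json.loads(tool_call["function"]["arguments"])
    return name, arguments


name, arguments = parse_tool_call(raw_tool_call)
print(name, arguments)
```

The model only produces this structured request. Running `get_weather` and returning its output is the application's job.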

Function Schema Design

OpenAI Function Calling Format

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g., 'San Francisco, CA'"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"],
                        "description": "Temperature unit"
                    }
                },
                "required": ["location"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Search a product database",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    },
                    "category": {
                        "type": "string",
                        "enum": ["electronics", "clothing", "food"],
                        "description": "Product category"
                    },
                    "max_price": {
                        "type": "number",
                        "description": "Maximum price filter"
                    }
                },
                "required": ["query"]
            }
        }
    }
]
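
Because the model fills in arguments from natural language, it can help to check them against the schema before running a tool. The sketch below is a deliberately small, standard-library-only check covering required keys, unknown keys, basic types, and `enum` values. `validate_arguments` is a hypothetical helper; a production system would more likely rely on a full JSON Schema validator.

```python
# Parameter schema copied from the search_database tool above
search_database_parameters = {
    "type": "object",
    "properties": {
        "query": {"type": "string", "description": "Search query"},
        "category": {
            "type": "string",
            "enum": ["electronics", "clothing", "food"],
            "description": "Product category",
        },
        "max_price": {"type": "number", "description": "Maximum price filter"},
    },
    "required": ["query"],
}

TYPE_MAP = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


def validate_arguments(schema, arguments):
    """Return a list of problems; an empty list means the arguments pass."""
    problems = []
    properties = schema.get("properties", {})

    for key in schema.get("required", []):
        if key not in arguments:
            problems.append(f"missing required parameter: {key}")

    for key, value in arguments.items():
        if key not in properties:
            problems.append(f"unknown parameter: {key}")
            continue
        spec = properties[key]
        expected = TYPE_MAP.get(spec.get("type"))
        wrong_type = expected is not None and not isinstance(value, expected)
        # bool is a subclass of int in Python, so reject it for numeric types
        if spec.get("type") in ("number", "integer") and isinstance(value, bool):
            wrong_type = True
        if wrong_type:
            problems.append(f"{key} should be of type {spec['type']}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{key} must be one of {spec['enum']}")

    return problems


print(validate_arguments(search_database_parameters, {"query": "laptop", "max_price": 500}))
print(validate_arguments(search_database_parameters, {"category": "toys"}))
```

Returning the list of problems as the tool result gives the model something specific to correct on its next attempt.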

Tool Description Best Practices

Good tool descriptions are critical for LLM tool selection. Include: (1) what the tool does, (2) when to use it, (3) what parameters mean, (4) example inputs and outputs.

# Good description
{
    "name": "get_stock_price",
    "description": "Retrieves the current stock price and basic metrics for a given ticker symbol. Use this when the user asks about stock prices, market value, or financial metrics of a company. Returns current price, daily change, volume, and market cap.",
    "parameters": {
        "type": "object",
        "properties": {
            "ticker": {
                "type": "string",
                "description": "Stock ticker symbol (e.g., 'AAPL' for Apple, 'GOOGL' for Alphabet). Must be a valid US exchange ticker."
            }
        },
        "required": ["ticker"]
    }
}

# Bad description (too vague)
{
    "name": "get_data",
    "description": "Gets some data",
    "parameters": {
        "type": "object",
        "properties": {
            "input": {"type": "string"}
        }
    }
}
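
One way to catch vague definitions like the second one before they reach the model is a small lint pass over the tool list. The sketch below is a rough heuristic, not an established rule: `lint_tool` is a hypothetical helper, and the 40-character threshold is an arbitrary assumption chosen for illustration.

```python
def lint_tool(tool, min_description_length=40):
    """Return warnings for a tool definition. The thresholds are arbitrary heuristics."""
    warnings = []
    name = tool.get("name", "<unnamed>")

    if len(tool.get("description", "")) < min_description_length:
        warnings.append(f"{name}: description is very short")

    parameters = tool.get("parameters", {})
    # Accept either a full JSON Schema object or a flat name -> spec mapping
    properties = parameters.get("properties", parameters)
    for param_name, spec in properties.items():
        if not isinstance(spec, dict) or not spec.get("description"):
            warnings.append(f"{name}.{param_name}: missing description")

    return warnings


vague_tool = {
    "name": "get_data",
    "description": "Gets some data",
    "parameters": {"type": "object", "properties": {"input": {"type": "string"}}},
}

for warning in lint_tool(vague_tool):
    print(warning)
```

A check like this cannot judge whether a description is accurate, only whether it is present and not trivially short.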

Function Calling Implementation

import json
from openai import OpenAI  # requires the openai package, version 1.0 or later

class ToolExecutor:
    def __init__(self, tools, functions):
        self.tools = tools
        self.functions = functions  # name -> callable mapping
        self.client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def execute(self, function_name, arguments):
        """Execute a function call and wrap the outcome in a dict."""
        if function_name not in self.functions:
            return {"error": f"Unknown function: {function_name}"}

        try:
            result = self.functions[function_name](**arguments)
            return {"result": result}
        except Exception as e:
            # Return the error instead of raising so the model can see it
            return {"error": str(e)}

    def run_with_tools(self, user_message, model="gpt-4", max_steps=10):
        """Run a conversation with tool use."""
        messages = [{"role": "user", "content": user_message}]

        # Cap the number of round trips so the loop cannot run forever
        for _ in range(max_steps):
            response = self.client.chat.completions.create(
                model=model,
                messages=messages,
                tools=self.tools,
                tool_choice="auto"
            )

            message = response.choices[0].message

            # No tool calls means the model generated a final response
            if not message.tool_calls:
                return message.content

            # Add assistant message with tool calls
            messages.append(message)

            # Execute each tool call
            for tool_call in message.tool_calls:
                function_name = tool_call.function.name
                try:
                    arguments = json.loads(tool_call.function.arguments)
                    result = self.execute(function_name, arguments)
                except json.JSONDecodeError as e:
                    # The model can produce malformed JSON; report it back
                    result = {"error": f"Invalid JSON arguments: {e}"}

                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": json.dumps(result)
                })

        return "Maximum tool use steps reached."
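
The `functions` argument of `ToolExecutor` is a plain name-to-callable mapping. One possible way to build it is a small registry decorator, sketched below. `register_tool`, `TOOL_FUNCTIONS`, and `dispatch` are hypothetical names, and both tools are stubs that return canned data instead of calling a real service.

```python
TOOL_FUNCTIONS = {}  # name -> callable mapping, filled in by the decorator


def register_tool(func):
    """Register a function under its own name and return it unchanged."""
    TOOL_FUNCTIONS[func.__name__] = func
    return func


@register_tool
def get_weather(location, unit="celsius"):
    # Stub with canned data; a real tool would call a weather API
    return {"location": location, "temperature": 18, "unit": unit}


@register_tool
def search_database(query, category=None, max_price=None):
    # Stub; a real tool would query the product database
    return {"query": query, "matches": []}


def dispatch(function_name, arguments):
    """Same logic as ToolExecutor.execute, repeated so this sketch runs standalone."""
    if function_name not in TOOL_FUNCTIONS:
        return {"error": f"Unknown function: {function_name}"}
    try:
        return {"result": TOOL_FUNCTIONS[function_name](**arguments)}
    except Exception as e:
        return {"error": str(e)}


print(dispatch("get_weather", {"location": "Paris"}))
print(dispatch("get_weather", {}))          # missing argument is reported, not raised
print(dispatch("send_email", {"to": "x"}))  # unknown tool is reported, not raised
```

With a registry like this, the executor could be constructed as `ToolExecutor(tools, TOOL_FUNCTIONS)`, which keeps the function names in the schemas and the callables in one place.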

Multi-Step Tool Use

Chain of Tool Calls

Definition: Tool Chaining

Tool chaining involves calling multiple tools in sequence, where the output of one tool feeds into the next. This enables complex workflows that no single tool can accomplish alone.

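import json

def multi_step_tool_use(user_query, llm, tools):
    """Execute a multi-step tool use workflow.

    `llm` is any client exposing chat(messages, tools=...), and
    execute_tool(tool_call) is assumed to be defined elsewhere.
    """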
    messages = [{"role": "user", "content": user_query}]
    tool_history = []
    
    for step in range(5):  # Max 5 tool use steps
        response = llm.chat(messages, tools=tools)
        
        if not response.tool_calls:
            return response.content, tool_history
        
        messages.append(response)
        
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            tool_history.append({
                "step": step,
                "tool": tool_call.function.name,
                "input": tool_call.function.arguments,
                "output": result
            })
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result)
            })
    
    return "Maximum tool use steps reached.", tool_history
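
A loop like this can be exercised without a network connection by swapping the model for a scripted stand-in. The sketch below is one way to do that. Everything in it is hypothetical: `ScriptedLLM` replays canned responses, the two tools return invented data, and `run_chain` repeats the loop above so the example runs on its own.

```python
import json
from types import SimpleNamespace


def make_tool_call(call_id, name, arguments):
    """Build an object shaped like an OpenAI-style tool call."""
    function = SimpleNamespace(name=name, arguments=json.dumps(arguments))
    return SimpleNamespace(id=call_id, function=function)


class ScriptedLLM:
    """Stand-in for a real model: replays a fixed list of responses in order."""

    def __init__(self, responses):
        self._responses = list(responses)

    def chat(self, messages, tools=None):
        return self._responses.pop(0)


# Stub tools with invented data
TOOLS = {
    "find_product": lambda query: {"product_id": "p-1", "name": query},
    "get_price": lambda product_id: {"product_id": product_id, "price": 19.99},
}


def execute_tool(tool_call):
    arguments = json.loads(tool_call.function.arguments)
    return TOOLS[tool_call.function.name](**arguments)


def run_chain(user_query, llm, max_steps=5):
    messages = [{"role": "user", "content": user_query}]
    history = []
    for step in range(max_steps):
        response = llm.chat(messages)
        if not response.tool_calls:
            return response.content, history
        messages.append(response)
        for tool_call in response.tool_calls:
            result = execute_tool(tool_call)
            history.append((step, tool_call.function.name, result))
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": json.dumps(result),
            })
    return "Maximum tool use steps reached.", history


# The second call reuses the product_id returned by the first tool. A real
# model would read that value from the tool message; here it is scripted.
script = [
    SimpleNamespace(content=None, tool_calls=[make_tool_call("c1", "find_product", {"query": "usb cable"})]),
    SimpleNamespace(content=None, tool_calls=[make_tool_call("c2", "get_price", {"product_id": "p-1"})]),
    SimpleNamespace(content="The usb cable costs 19.99.", tool_calls=None),
]

answer, history = run_chain("How much is a usb cable?", ScriptedLLM(script))
print(answer)
for entry in history:
    print(entry)
```

Scripted runs like this test only the orchestration code (message bookkeeping, step limits, history). They say nothing about how well a real model picks tools.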

Parallel Tool Calls

Definition: Parallel Tool Calls

With parallel tool calls, the LLM emits several independent function calls in a single response and the application executes them concurrently, which reduces latency compared with running them one after another.

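import asyncio
import json

async def parallel_tool_calls(tool_calls):
    """Execute multiple tool calls in parallel.

    execute_tool_async(name, arguments) is assumed to be a coroutine
    defined elsewhere. Results come back in the same order as tool_calls.
    """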
    tasks = []
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        arguments = json.loads(tool_call.function.arguments)
        tasks.append(execute_tool_async(function_name, arguments))
    
    results = await asyncio.gather(*tasks)
    return results
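
For a version that runs end to end, the sketch below uses a fake async tool that only sleeps, pairs each result with its `tool_call_id`, and passes `return_exceptions=True` so one failing call does not discard the others. `fake_search`, `run_tool_calls`, and the call ids are hypothetical, and the 0.2 second delay is arbitrary.

```python
import asyncio
import json
import time
from types import SimpleNamespace


async def fake_search(source, delay):
    """Pretend to query a data source; the sleep stands in for network latency."""
    await asyncio.sleep(delay)
    return {"source": source, "hits": 3}


async def run_tool_calls(tool_calls):
    """Run all calls concurrently and build one tool message per call."""
    tasks = [fake_search(**json.loads(tc.function.arguments)) for tc in tool_calls]
    # return_exceptions=True returns failures as values instead of raising
    results = await asyncio.gather(*tasks, return_exceptions=True)

    messages = []
    # gather preserves input order, so zip pairs each result with its call
    for tc, result in zip(tool_calls, results):
        if isinstance(result, Exception):
            payload = {"error": str(result)}
        else:
            payload = {"result": result}
        messages.append({
            "role": "tool",
            "tool_call_id": tc.id,
            "content": json.dumps(payload),
        })
    return messages


calls = [
    SimpleNamespace(
        id=f"call_{i}",
        function=SimpleNamespace(
            name="fake_search",
            arguments=json.dumps({"source": source, "delay": 0.2}),
        ),
    )
    for i, source in enumerate(["catalog", "reviews", "inventory"])
]

start = time.perf_counter()
tool_messages = asyncio.run(run_tool_calls(calls))
elapsed = time.perf_counter() - start

print(f"{len(tool_messages)} results in {elapsed:.2f}s")
```

Run sequentially, the three calls would take at least 0.6 seconds. Run concurrently, the total should be close to the slowest single call.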

Error Handling

Retry Strategies

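import json
import time

class RateLimitError(Exception):
    """Placeholder for a transient rate-limit failure raised by a tool."""

class InvalidInputError(Exception):
    """Placeholder for a non-retryable bad-argument failure raised by a tool."""

def execute_with_retry(tool_call, functions, max_retries=3):
    """Execute a tool call with retry logic.

    `functions` maps tool names to callables.
    """
    function_name = tool_call.function.name
    arguments = json.loads(tool_call.function.arguments)

    for attempt in range(max_retries):
        try:
            result = functions[function_name](**arguments)
            return {"result": result}
        except RateLimitError:
            # Exponential backoff, skipped after the final attempt
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)
        except InvalidInputError as e:
            # Bad input will not succeed on retry, so fail immediately
            return {"error": f"Invalid input: {e}"}
        except Exception as e:
            if attempt == max_retries - 1:
                return {"error": f"Tool execution failed: {e}"}

    return {"error": "Max retries exceeded"}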
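
The `2 ** attempt` backoff above waits 1, 2, then 4 seconds. The sketch below computes that schedule explicitly and adds two common refinements: a cap on the delay and optional random jitter, so that many clients retrying at once do not all wake at the same moment. `backoff_delays` and its parameters are hypothetical, and the cap of 30 seconds is an arbitrary default.

```python
import random


def backoff_delays(max_retries, base=1.0, cap=30.0, jitter=False, rng=None):
    """Return the wait in seconds before each retry attempt."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_retries):
        # Double the delay each attempt, but never exceed the cap
        delay = min(cap, base * 2 ** attempt)
        if jitter:
            # Pick a random wait between 0 and the computed delay
            delay = rng.uniform(0, delay)
        delays.append(delay)
    return delays


print(backoff_delays(3))
print(backoff_delays(6, cap=10.0))
print(backoff_delays(4, jitter=True, rng=random.Random(0)))
```

Passing a seeded `random.Random` keeps the jittered schedule reproducible in tests.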

Error Feedback to LLM

Definition: Error-Aware Tool Use

When a tool call fails, the error message should be fed back to the LLM so it can adjust its approach — retry with different parameters, try a different tool, or explain the limitation to the user.

import json

def handle_tool_error(error, tool_call, messages, llm, tools=None):
    """Handle tool execution errors by informing the LLM."""
    error_message = {
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps({
            "error": str(error),
            "suggestion": "Try rephrasing the query or using a different tool."
        })
    }
    messages.append(error_message)

    # Let the LLM decide how to proceed; passing the tool list lets it
    # retry with new parameters or pick a different tool
    response = llm.chat(messages, tools=tools)
    return response

Practice Exercises

  1. Tool Design: Design function schemas for 5 tools: web search, database query, calculator, unit converter, and email sender. Include parameter descriptions and examples.

  2. Multi-Step Workflow: Implement a tool chain that searches for a product, compares prices across stores, and generates a purchase recommendation.

  3. Error Handling: Test your tool use system with invalid inputs. How does the LLM respond to errors? Does it retry with correct parameters?

  4. Parallel Execution: Implement parallel tool calls for a query that requires searching multiple databases simultaneously.

Key Takeaways

Summary: Tool Use and Function Calling

  • Function schemas define tools with structured descriptions for LLMs
  • Parameter extraction enables LLMs to fill in function arguments from natural language
  • Tool chaining connects multiple tool calls for complex workflows
  • Parallel tool calls execute independent calls simultaneously
  • Error handling feeds failures back to the LLM for recovery
  • Good descriptions are critical for correct tool selection
  • A maximum step count prevents infinite tool use loops
  • Tool use transforms LLMs from text generators into action-taking agents

What to Learn Next

-> LLM Agent Frameworks: Building autonomous agents with LLMs.

-> Multi-Agent Systems: Coordinating multiple agents for complex tasks.

-> Agentic RAG Systems: Agent-based approaches to retrieval.

-> Planning and Reasoning in Agents: How agents plan and execute multi-step tasks.

-> Building Production LLM Applications: End-to-end production systems.

-> Prompt Engineering: Getting the most out of language models.
