Tool Use in LLMs: Function Calling, MCP, and the Agentic Future

Beyond Text Generation

A language model that can only generate text is limited. Real-world tasks often require actions: searching the web, querying a database, calling an API, executing code, reading a file. Tool use — the ability to invoke external functions and use their results — is what transforms a language model from a conversational toy into a capable AI agent.

Function Calling: The Foundation

OpenAI introduced "function calling" in June 2023, and it quickly became an industry standard (now also called "tool use" in Anthropic's API). The mechanism is straightforward: the developer provides a list of functions the model can call, each with a JSON schema describing parameters. The model outputs a structured JSON call instead of natural language when it determines a tool is needed. The developer executes the tool, returns the result, and the model continues.

tools = [{
"name": "search_web",
"description": "Search the web for current information",
"input_schema": {
"type": "object",
"properties": {
  "query": {"type": "string", "description": "Search query"},
  "n_results": {"type": "integer", "default": 5}
},
"required": ["query"]
}
}]

Function calling enables models to ground their responses in real-time information, perform computations, access databases, and interact with external services.

Parallel Tool Use

Modern models (Claude 3+, GPT-4o) support parallel tool calls: the model can decide to call multiple tools simultaneously in a single turn, reducing the number of round trips needed to complete a task. For tasks that require multiple independent lookups — "get the weather in three cities simultaneously" — parallel tool use can reduce latency by 3× or more.

Model Context Protocol (MCP)

Function calling solves the individual tool invocation problem but doesn't address the broader question: how do AI applications discover, connect to, and securely use tools and data sources at scale? The Model Context Protocol (MCP), developed by Anthropic and released as an open standard in late 2024, addresses this.

MCP defines a standard client-server protocol for connecting AI applications to "MCP servers" — lightweight services that expose tools, resources (readable content), and prompts through a standardized interface. An MCP server for GitHub exposes tools for reading and writing code; an MCP server for a database exposes query and update operations. An LLM application connects to multiple MCP servers and uses their tools transparently.

The key advantages of MCP over ad-hoc function definitions:

Discoverability: Tools are described in a machine-readable format that can be used to automatically configure the AI client.
Reusability: An MCP server for a database works with any MCP-compatible AI client — you build it once.
Security: MCP's authorization model (OAuth 2.1-based) enables secure access control without building custom auth for each integration.
Ecosystem: The MCP ecosystem had over 500 public MCP servers by mid-2025, covering most major services.

The Agentic Loop

Tool use turns a single-turn model call into a multi-step agentic loop:

User provides a task
Model decides which tool(s) to call
Developer executes tool(s), returns results
Model incorporates results and decides: is the task complete? If not, go to 2.
Model generates final response

This loop is the foundation of autonomous AI agents. The sophistication of the agent is determined by the quality of the model's planning (when to use which tool), the breadth of available tools, and the model's ability to synthesize tool outputs into coherent actions.

Reliability Challenges

Tool use introduces reliability challenges not present in plain text generation. The model must correctly parse tool schemas, provide valid arguments, and handle tool failures gracefully. Common failure modes: calling non-existent tools (hallucinating tool names), providing arguments that fail schema validation, and getting stuck in loops when tools return unexpected results.

Best practices for robust tool use: validate all tool calls before execution; return structured error messages (not just exceptions) when tools fail; implement maximum iteration limits to prevent infinite loops; log all tool calls for debugging.