A model's structured output naming a tool and its arguments — requiring a harness to execute it.
A tool call is the structured output produced by an AI model when it decides to invoke a tool — naming which tool to run and providing arguments for it. It is text emitted by the model in response to a prompt, not an action performed by the system. The model produces the call; the harness reads it, executes the named tool with the given arguments, and returns the result as a tool result. Without harness execution, a tool call does nothing.
The tool call takes the form of a structured annotation within the model's output stream — a JSON-like object or function-calling structure that the harness can parse reliably. It is bounded by the context window like any other token sequence, and generating it consumes the same attention budget as prose output.
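The parse-and-dispatch flow can be sketched as a minimal harness loop. This is an illustrative sketch, not any specific provider's API: the `name`/`arguments` JSON shape, the `TOOLS` registry, and the `add` tool are all hypothetical.

```python
import json

# Hypothetical registry of tools this harness can execute.
# The names and signatures are illustrative assumptions.
TOOLS = {
    "add": lambda args: args["a"] + args["b"],
}

def handle_model_output(output: str) -> dict:
    """Parse a model's structured output and, if it names a known tool,
    execute that tool and package the outcome as a tool result.
    The model only emitted text; this function performs the action."""
    call = json.loads(output)            # the JSON-like tool-call structure
    tool = TOOLS[call["name"]]           # look up the named tool
    result = tool(call["arguments"])     # the harness, not the model, executes it
    return {"tool": call["name"], "result": result}

# Text the model emitted. Nothing happens until the harness parses and runs it.
model_output = '{"name": "add", "arguments": {"a": 2, "b": 3}}'
print(handle_model_output(model_output))
```

Until `handle_model_output` runs, the string is inert: the call exists, but the action does not.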
The key confusion in practice is distinguishing between a model describing an action and a model actually calling a tool. A model may say "I ran the tests" in natural language without ever producing a tool call, in which case the harness never executed anything and no tests were run. Verifying that a tool call was actually emitted — and that the harness actually executed it — requires inspecting the full transcript.
Open questions include how to make tool call boundaries more legible to end users who may not read transcripts, and whether models can be made to emit tool calls more reliably when they claim to have taken an action, rather than merely describing one.