A structured request to retrieve information or interact with an AI model.
In machine learning and AI systems, a query is a structured input used to retrieve information, trigger computation, or elicit a response from a model or data system. Queries can take many forms — natural language questions posed to a large language model, vector embeddings submitted to a similarity search index, or SQL statements sent to a relational database. What unifies these uses is the idea of a caller specifying what they want, with the system responsible for finding or generating a matching response.
The concept is especially prominent in the attention mechanism at the core of transformer architectures. In this context, a query is a vector computed by a learned linear projection from the representation of one token or position in a sequence. It is compared against a set of key vectors, typically via scaled dot products passed through a softmax, to determine how much weight to give each corresponding value. This query-key-value (QKV) framework is central to how transformers process language, images, and other data represented as sequences of tokens, enabling models to dynamically focus on relevant parts of their input regardless of distance in the sequence.
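As a rough illustration of the query-key-value comparison, the sketch below computes scaled dot-product attention for a single query vector in plain Python. The function names (`softmax`, `attend`) and the toy vectors are made up for this example. It also assumes the query, keys, and values are already given, whereas a real transformer would obtain them by applying learned projection matrices to token representations.

```python
import math


def softmax(scores):
    """Convert raw scores into non-negative weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the attention weights over the keys and the weighted
    sum of the corresponding value vectors.
    """
    d_k = len(query)
    # Similarity of the query to each key, scaled by sqrt(d_k).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    dim = len(values[0])
    # Blend the value vectors according to the attention weights.
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return weights, output


# Toy data, invented for illustration only.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

weights, output = attend(query, keys, values)
print(weights)  # largest weight goes to the key most aligned with the query
print(output)
```

In this toy setup the first key points in the same direction as the query, so it receives the largest weight and the output is pulled toward the first value vector.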
In information retrieval and retrieval-augmented generation (RAG) systems, queries serve as the bridge between a user's intent and a knowledge base. A user's natural language question is typically encoded into a dense vector embedding, then matched against a corpus of documents embedded in the same vector space, often using approximate nearest-neighbor search. The quality of the query, meaning how well it captures the user's actual intent, strongly influences the relevance of retrieved content and, downstream, the quality of generated answers.
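The retrieval step can be sketched in a few lines of Python. Everything here is a simplifying assumption rather than a description of any particular system: `embed` is a toy stand-in that counts words from a small fixed vocabulary (a real system would call a trained embedding model), and `retrieve` does an exact brute-force cosine-similarity scan where a production system would more likely use an approximate nearest-neighbor index.

```python
import math

# Hypothetical vocabulary used only by the toy embedding below.
VOCAB = ["query", "attention", "transformer", "database", "sql", "vector"]


def embed(text):
    """Toy stand-in for an embedding model: word counts over VOCAB."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_text, documents, top_k=2):
    """Return the top_k (score, document) pairs most similar to the query."""
    q = embed(query_text)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    # Exact search: sort every document by similarity, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


documents = [
    "a transformer computes attention from query key and value vectors",
    "a sql query reads rows from a relational database",
    "a vector index stores one embedding per document",
]

results = retrieve("how does attention use a query in a transformer", documents)
for score, doc in results:
    print(f"{score:.2f}  {doc}")
```

In a RAG pipeline, the retrieved documents would then be passed to the generator as context. The sketch also shows why query quality matters: a query that shares no vocabulary with the relevant document scores zero against it under this toy embedding.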
Queries matter because they define the interface between human intent and machine capability. Poorly formed queries yield irrelevant or misleading results, while well-designed query mechanisms, whether built on prompt engineering, embedding models, or structured query languages, can substantially improve the usefulness of a system. As AI systems become more capable of interpreting ambiguous or complex requests, the design of query interfaces has become a significant area of research and engineering, spanning topics from semantic search to conversational AI.