A model learns new tasks from prompt examples alone, without any weight updates.
In-context learning (ICL) is a capability of large language models (LLMs) in which the model adapts its behavior to a new task by conditioning on a handful of input-output examples embedded directly in the prompt, rather than through any update to its parameters. At inference time, the model reads the provided examples, infers the pattern or task structure they imply, and applies that understanding to a novel query — all without gradient descent or fine-tuning. The examples serve as implicit instructions, and the model's ability to exploit them emerges from the statistical regularities absorbed during large-scale pretraining.
The mechanics of ICL are still an active area of research; one prominent hypothesis holds that transformer attention can implicitly implement something akin to gradient-based learning within a single forward pass, though this remains contested. When given demonstrations, the model effectively performs a kind of "meta-learning" — recognizing task structure from the examples and generalizing accordingly. The number and quality of demonstrations matter considerably: zero-shot prompting provides only a task description, one-shot provides a single worked example, and few-shot provides several, with performance generally improving as more relevant examples are added.
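The zero-, one-, and few-shot regimes differ only in how many demonstrations are placed before the query. A minimal sketch of prompt assembly, assuming a simple `Input:`/`Output:` template (the template and the toy translation task are illustrative, not a fixed standard):

```python
def build_prompt(task_description, examples, query):
    """Assemble an ICL prompt: task description, then demonstrations,
    then the novel query left open for the model to complete."""
    lines = [task_description]
    for x, y in examples:  # each demonstration is an input-output pair
        lines.append(f"Input: {x}\nOutput: {y}")
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

demos = [("sea otter", "loutre de mer"), ("cheese", "fromage")]
task = "Translate English to French."

zero_shot = build_prompt(task, [], "peppermint")        # description only
one_shot = build_prompt(task, demos[:1], "peppermint")  # one demonstration
few_shot = build_prompt(task, demos, "peppermint")      # several demonstrations
```

The resulting string is sent to the model as ordinary input; no parameters change, and the model's completion after the final `Output:` is taken as its answer.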
ICL became practically significant with the release of GPT-3 in 2020, which demonstrated that a sufficiently large pretrained model could perform competitively on diverse benchmarks — translation, arithmetic, question answering — using only prompt-level conditioning. This was a striking departure from the prevailing paradigm of task-specific fine-tuning, and it catalyzed enormous interest in prompt engineering, chain-of-thought prompting, and retrieval-augmented generation as complementary techniques.
The importance of ICL lies in its flexibility and accessibility: practitioners can adapt a single frozen model to new tasks without expensive retraining, making deployment faster and more cost-effective. However, ICL has notable limitations — it is sensitive to example ordering and phrasing, constrained by context window length, and can be unreliable on tasks that require precise reasoning or domain knowledge not well-represented in pretraining data. Understanding when and why ICL succeeds or fails remains one of the central questions in modern LLM research.
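The ordering sensitivity noted above is easy to see concretely: k demonstrations admit k! distinct prompt orderings, each a different input string that can yield a different accuracy. A small sketch, assuming the same illustrative `Input:`/`Output:` template as hypothetical formatting:

```python
from itertools import permutations

demos = [
    ("sea otter", "loutre de mer"),
    ("cheese", "fromage"),
    ("plush giraffe", "girafe en peluche"),
]

def format_demos(ordered):
    """Render one ordering of the demonstrations as a prompt fragment."""
    return "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in ordered)

# Three demonstrations produce 3! = 6 distinct prompt strings; in practice
# one would evaluate the model on each to measure ordering sensitivity.
variants = [format_demos(p) for p in permutations(demos)]
```

Each element of `variants` contains identical information, yet models can score differently across them, which is why ordering is treated as a limitation rather than a cosmetic detail.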