Training ML models to generalize accurately from only a handful of labeled examples.
Few-shot learning is a machine learning paradigm that enables models to recognize patterns and make accurate predictions from only a small number of labeled training examples — typically between one and five per class. This stands in sharp contrast to conventional supervised learning, which demands thousands or millions of examples to achieve reliable performance. The core challenge is bridging the gap between the richness of human learning, where a child can identify a new animal from a single picture, and the data hunger of standard deep learning systems.
The dominant approaches to few-shot learning fall into three broad families. Meta-learning (or "learning to learn") trains a model across many related tasks so it develops an inductive bias that allows rapid adaptation to new tasks with minimal data — MAML (Model-Agnostic Meta-Learning) is a canonical example. Metric-based methods such as Siamese Networks, Matching Networks, and Prototypical Networks learn an embedding space where examples from the same class cluster together, enabling classification by nearest-neighbor comparison. Transfer learning approaches fine-tune large pretrained models on small target datasets, exploiting representations already learned from massive corpora.
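To make the metric-based family concrete, here is a minimal Prototypical Networks-style sketch in PyTorch: each class prototype is the mean embedding of that class's support examples, and queries are scored by negative squared distance to the prototypes. The `embed` network, tensor shapes, and function name are illustrative stand-ins, not a reference implementation.

```python
import torch
import torch.nn.functional as F

def prototypical_predict(embed, support_x, support_y, query_x, n_classes):
    """Classify queries by distance to class prototypes (mean support embeddings)."""
    z_support = embed(support_x)             # (n_support, d)
    z_query = embed(query_x)                 # (n_query, d)
    # Prototype for each class = mean embedding of its labeled support examples.
    prototypes = torch.stack([
        z_support[support_y == c].mean(dim=0) for c in range(n_classes)
    ])                                       # (n_classes, d)
    # Squared Euclidean distance from each query to each prototype.
    dists = torch.cdist(z_query, prototypes) ** 2   # (n_query, n_classes)
    # Nearest prototype receives the highest log-probability.
    return F.log_softmax(-dists, dim=1)

# Toy usage: a random linear embedding, 3 classes, 5 support shots each.
embed = torch.nn.Linear(64, 16)
support_x, query_x = torch.randn(15, 64), torch.randn(4, 64)
support_y = torch.arange(3).repeat_interleave(5)
print(prototypical_predict(embed, support_x, support_y, query_x, n_classes=3).argmax(dim=1))
```

In a real system the embedding network would be trained episodically over many sampled few-shot tasks so that the prototype geometry generalizes to unseen classes.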
Few-shot learning gained significant traction in the mid-2010s alongside advances in meta-learning and the proliferation of large pretrained models. The introduction of benchmark datasets like Omniglot and miniImageNet gave researchers standardized evaluation grounds, accelerating progress. More recently, large language models such as GPT-3 demonstrated remarkable few-shot capabilities through in-context learning, adapting to new tasks from just a few prompt examples without any weight updates at all. That result reshaped how the community thinks about the concept.
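In-context learning is easy to see in miniature: the labeled examples live entirely in the input text, and the model is expected to continue the pattern. The task and examples below are hypothetical, sketched only to show the prompt shape.

```python
# Hypothetical few-shot prompt for sentiment classification via in-context learning.
# No gradient updates occur; the "training set" is just part of the input string.
prompt = (
    "Classify the sentiment of each review as Positive or Negative.\n\n"
    "Review: The plot dragged and the acting was wooden.\n"
    "Sentiment: Negative\n\n"
    "Review: A warm, funny film I would happily watch again.\n"
    "Sentiment: Positive\n\n"
    "Review: Two hours of my life I will never get back.\n"
    "Sentiment:"  # the model is expected to continue with the correct label
)
print(prompt)
```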
The practical importance of few-shot learning is substantial. In medicine, a rare disease may have only dozens of confirmed cases available for training a classifier. In linguistics, thousands of languages lack sufficient digital text for standard training. In personalization, individual user data is inherently sparse. By enabling models to generalize from limited signal, few-shot learning extends the reach of AI into domains where large-scale data collection is economically, ethically, or logistically infeasible.