An AI system's ability to apply knowledge learned in one domain to another.
Transfer capability refers to the capacity of a machine learning model to leverage representations, patterns, or knowledge acquired during training on one task and apply them effectively to a different but related task. Rather than learning from scratch each time a new problem is encountered, a model with strong transfer capability can reuse what it has already learned, dramatically reducing the need for large labeled datasets and extensive computational resources in the target domain.
At a mechanistic level, transfer capability emerges because many tasks share underlying structure. In deep neural networks, early layers tend to learn general features — edges in images, syntactic patterns in text — while later layers capture task-specific abstractions. When a model is pre-trained on a large, rich dataset and then fine-tuned on a smaller target dataset, the general features transfer well while the task-specific layers adapt. The degree to which this transfer succeeds depends on the similarity between source and target domains, the architecture of the model, and the fine-tuning strategy employed.
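To make the freeze-and-fine-tune pattern concrete, here is a minimal PyTorch sketch. The tiny backbone, the layer split, and the 10-class target head are illustrative assumptions, not a prescription:

```python
import torch
import torch.nn as nn

# Pretend this backbone was pre-trained on a large source dataset.
backbone = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),  # early layers: general features
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone so its transferred features are preserved.
for param in backbone.parameters():
    param.requires_grad = False

# A fresh task-specific head, trained from scratch on the target task.
head = nn.Linear(64, 10)  # 10 hypothetical target classes

model = nn.Sequential(backbone, head)

# Only the head's parameters are given to the optimizer; the general
# features transfer as-is while the task-specific layer adapts.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
```

Whether to freeze the backbone entirely, fine-tune it at a reduced learning rate, or unfreeze it gradually is itself part of the fine-tuning strategy, and the best choice depends on how similar the source and target domains are.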
Transfer capability is foundational to modern machine learning practice. It underlies the success of large pre-trained models such as BERT in natural language processing and ResNet in computer vision, where models trained on massive corpora or image datasets are routinely adapted to specialized downstream tasks with minimal additional training. Without transfer capability, the cost of training high-performing models for every new application would be prohibitive for most practitioners and organizations.
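As one hedged illustration of this workflow, the snippet below adapts torchvision's pre-trained ResNet-18 to a hypothetical 5-class downstream task (the weights enum assumes torchvision 0.13 or later):

```python
import torch.nn as nn
from torchvision import models

# Load a ResNet-18 pre-trained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Replace only the ImageNet classification head; everything else transfers.
model.fc = nn.Linear(model.fc.in_features, 5)  # 5 hypothetical target classes

# Optionally freeze the transferred body and train just the new head.
for name, param in model.named_parameters():
    if not name.startswith("fc"):
        param.requires_grad = False
```

The same load-then-replace-the-head pattern applies, with different APIs, to pre-trained language models such as BERT.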
The concept also has important implications for AI robustness and generalization research. A model that transfers well is implicitly learning representations that are not narrowly overfit to a single distribution, suggesting broader applicability and resilience to domain shift. Measuring and improving transfer capability remains an active research area, with work spanning few-shot learning, domain adaptation, and meta-learning — all of which seek to push the boundaries of how flexibly learned knowledge can be redeployed across contexts.
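One common, if rough, way to quantify transfer is a linear probe: freeze the source model, extract its features on target data, and fit a linear classifier, taking the classifier's target accuracy as a proxy score. The sketch below uses scikit-learn with synthetic stand-in features, since the real encoder and dataset are outside this example:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-ins for features produced by a frozen pre-trained encoder
# on the target-domain data (shapes and labels are made up).
features = rng.normal(size=(1000, 64))
labels = rng.integers(0, 5, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0
)

# Fit a linear classifier on the frozen features; its held-out accuracy
# serves as a rough proxy for how transferable the representations are.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
transfer_score = probe.score(X_test, y_test)
print(f"linear-probe transfer score: {transfer_score:.3f}")
```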