Parameter Size

The total count of learnable weights and biases in a machine learning model.

Year: 2018 · Generality: 694

Parameter size refers to the total number of learnable values — weights, biases, and other trainable quantities — contained within a machine learning model. Each parameter is a scalar value adjusted during training through optimization algorithms like stochastic gradient descent, collectively shaping how the model transforms inputs into outputs. In neural networks, parameters are distributed across layers as weight matrices and bias vectors, and their total count is determined by architectural choices such as layer depth, layer width, and connectivity patterns.
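As a concrete illustration, the sketch below counts the parameters of a small fully connected network from its layer widths alone: each dense layer with n_in inputs and n_out outputs contributes n_in × n_out weights plus n_out biases. The layer widths are hypothetical, chosen only to make the arithmetic visible.

```python
# Count the learnable parameters of a small MLP from its layer widths.
# Each dense layer contributes (n_in * n_out) weights plus n_out biases.

layer_widths = [784, 512, 256, 10]  # hypothetical input, hidden, and output sizes

total_params = 0
for n_in, n_out in zip(layer_widths[:-1], layer_widths[1:]):
    weights = n_in * n_out
    biases = n_out
    total_params += weights + biases

print(f"Total learnable parameters: {total_params:,}")  # 535,818 for these widths
```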

The relationship between parameter size and model capability is central to modern deep learning. Larger models can represent more complex functions and capture finer-grained patterns in data, which is why scaling parameter counts has driven many state-of-the-art results across vision, language, and multimodal tasks. GPT-3, for instance, contains 175 billion parameters — a scale that enables remarkably flexible language generation. However, more parameters demand proportionally more memory, compute, and training data, and they increase the risk of overfitting when data is scarce.
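To make the memory claim concrete, a quick back-of-the-envelope sketch: multiplying the parameter count by the bytes needed per value gives a lower bound on the memory required just to hold the weights. Real deployments also need space for activations, gradients, and optimizer state, so treat these figures as illustrative minimums.

```python
# Rough memory estimate for storing model weights alone, at different precisions.
# Activations, gradients, and optimizer state are not included.

params = 175e9  # e.g. a 175-billion-parameter model such as GPT-3
bytes_per_param = {"fp32": 4, "fp16": 2, "int8": 1}

for dtype, nbytes in bytes_per_param.items():
    gib = params * nbytes / 1024**3
    print(f"{dtype}: ~{gib:,.0f} GiB just for the weights")
```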

Parameter size became a defining concern in the era of large language models and foundation models, roughly from 2018 onward, when researchers began systematically studying how model performance scales with parameter count, dataset size, and compute budget. Scaling laws research demonstrated that these relationships follow predictable power-law curves, making parameter size a key variable in deliberate model design rather than an incidental outcome. This spurred both the race toward trillion-parameter models and a parallel effort toward parameter-efficient methods.
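A hedged sketch of what such a scaling law looks like in code, assuming the simple power-law form L(N) = (N_c / N) ** alpha used in the scaling-laws literature; the constants below are placeholders of roughly the magnitude reported in that work and should be treated as illustrative, not as fitted values.

```python
# Illustrative power-law scaling curve: predicted loss falls as a power of
# parameter count N, following L(N) = (N_c / N) ** alpha.
# The constants are placeholders for illustration, not fitted values.

N_c = 8.8e13    # assumed normalization constant
alpha = 0.076   # assumed exponent

def loss(n_params: float) -> float:
    """Predicted loss for a model with n_params parameters under this toy fit."""
    return (N_c / n_params) ** alpha

for n in [1e8, 1e9, 1e10, 1e11, 1e12]:
    print(f"{n:.0e} params -> predicted loss {loss(n):.2f}")
```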

Because raw parameter count is expensive to deploy, significant research has focused on reducing effective parameter size without sacrificing performance. Techniques such as pruning, quantization, knowledge distillation, and parameter-efficient fine-tuning methods like LoRA allow practitioners to compress or adapt large models for resource-constrained environments. Understanding parameter size is therefore essential not only for training powerful models but for making them practical across a wide range of real-world applications.
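To illustrate why a method like LoRA shrinks the effective trainable size, the sketch below compares fully fine-tuning a single d × d weight matrix against learning a rank-r adapter, which trains two low-rank factors of shape (d, r) and (r, d). The hidden size and rank are hypothetical values chosen for the example.

```python
# Trainable-parameter comparison for adapting one d x d weight matrix:
# full fine-tuning updates all d*d values; a rank-r LoRA adapter trains
# only 2 * d * r values in its two low-rank factors.

d = 4096   # hypothetical hidden size of one transformer weight matrix
r = 8      # hypothetical LoRA rank

full_finetune = d * d
lora = 2 * d * r

print(f"Full fine-tuning: {full_finetune:,} trainable parameters")
print(f"LoRA (rank {r}): {lora:,} trainable parameters "
      f"({100 * lora / full_finetune:.2f}% of full)")
```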

Related

Parameter

A model-internal variable whose value is learned directly from training data.

Generality: 928

Parameterized Model

A model whose behavior is governed by learnable numerical values called parameters.

Generality: 875

Parameter Space

The multidimensional space of all possible values a model's parameters can take.

Generality: 794

Overparameterized

A model with more parameters than available training data points.

Generality: 590

Overparameterization Regime

When a model has more parameters than training samples, yet still generalizes well.

Generality: 520

Tunable Parameters

Model variables adjusted during training to optimize performance on a given task.

Generality: 720