
Envisioning is an emerging technology research institute and advisory.


Random Forest

An ensemble of decision trees that improves accuracy and resists overfitting.

Year: 2001 · Generality: 796

Random Forest is an ensemble learning method that builds a large collection of decision trees during training and aggregates their outputs to produce a final prediction. For classification tasks, the algorithm returns the majority vote across all trees; for regression, it returns the mean of individual tree predictions. This strategy of training many models on resampled data and combining their outputs, known as bagging (short for bootstrap aggregating), dramatically reduces the variance that plagues individual decision trees without substantially increasing bias, yielding a model that generalizes far better to unseen data.
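The two aggregation rules can be sketched in a few lines of NumPy. The helper names below are illustrative, not part of any particular library:

```python
import numpy as np

def aggregate_classification(tree_preds):
    """Majority vote: tree_preds has shape (n_trees, n_samples)."""
    preds = np.asarray(tree_preds)
    # For each sample (column), pick the most frequent class label across trees.
    return np.array([np.bincount(col).argmax() for col in preds.T])

def aggregate_regression(tree_preds):
    """Mean of the individual tree predictions, per sample."""
    return np.asarray(tree_preds).mean(axis=0)

# Hypothetical outputs from three trees on four samples:
votes = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [1, 1, 1, 0]]
print(aggregate_classification(votes))                    # -> [0 1 1 0]
print(aggregate_regression([[1.0, 2.0], [3.0, 4.0]]))     # -> [2. 3.]
```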

The algorithm introduces randomness at two distinct levels to ensure that the constituent trees remain diverse and decorrelated. First, each tree is trained on a bootstrap sample — a random subset of the training data drawn with replacement. Second, at every node split, only a random subset of features is considered as candidates for the best split, rather than evaluating all available features. This feature subsampling is the key innovation that distinguishes Random Forest from simple bagging of decision trees, and it prevents any single dominant feature from appearing in every tree, forcing the ensemble to explore a wider range of predictive patterns.
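A minimal sketch of these two levels of randomness, using scikit-learn's DecisionTreeClassifier as the base learner. The `train_forest` and `predict` helpers are hypothetical, and real implementations differ in detail (out-of-bag bookkeeping, probability averaging, and so on):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_forest(X, y, n_trees=25, max_features="sqrt", seed=0):
    """Illustrative Random Forest training loop (not sklearn's internals)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    trees = []
    for _ in range(n_trees):
        # Level 1: bootstrap sample -- draw n rows with replacement.
        idx = rng.integers(0, n, size=n)
        # Level 2: feature subsampling at each node split, delegated to the
        # tree via max_features (a random feature subset per split).
        tree = DecisionTreeClassifier(
            max_features=max_features,
            random_state=int(rng.integers(1 << 31)),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def predict(trees, X):
    """Majority vote across the ensemble."""
    votes = np.stack([t.predict(X) for t in trees])
    return np.array([np.bincount(col.astype(int)).argmax() for col in votes.T])

# Toy two-class data: two well-separated Gaussian blobs.
rng_data = np.random.default_rng(1)
X = np.vstack([rng_data.normal(0.0, 1.0, (50, 4)),
               rng_data.normal(3.0, 1.0, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
forest = train_forest(X, y)
```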

Random Forest became a cornerstone of practical machine learning after Leo Breiman formalized and published the algorithm in 2001, demonstrating its superior accuracy and robustness across a wide range of benchmark tasks. It requires minimal hyperparameter tuning compared to many competing methods, handles high-dimensional data gracefully, and is naturally resistant to overfitting even as the number of trees grows. These properties made it one of the most widely adopted algorithms in applied data science before the deep learning era, and it remains highly competitive on structured tabular data today.
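As an illustration of that low tuning burden, scikit-learn's RandomForestClassifier typically performs well with little more than the tree count specified (a sketch on the standard Iris dataset; your numbers will vary with the split):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Near-default settings: only the number of trees is chosen explicitly.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)
score = clf.score(X_test, y_test)
print(f"test accuracy: {score:.2f}")
```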

Beyond raw predictive performance, Random Forest provides a built-in mechanism for estimating feature importance by measuring how much each variable reduces impurity across all splits in all trees. This interpretability makes it valuable not just as a predictive tool but as an exploratory instrument for understanding which input variables carry the most predictive signal — a property that has made it popular in domains such as genomics, finance, and clinical medicine where understanding model decisions is as important as accuracy.
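In scikit-learn, for example, these impurity-based scores are exposed after fitting via the `feature_importances_` attribute, normalized to sum to one:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

data = load_iris()
clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(data.data, data.target)

# Mean impurity decrease attributed to each feature, highest first.
ranked = sorted(zip(data.feature_names, clf.feature_importances_),
                key=lambda pair: -pair[1])
for name, importance in ranked:
    print(f"{name:20s} {importance:.3f}")
```

Note that impurity-based importances can be biased toward high-cardinality features; permutation importance is a common cross-check.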

Related

  • Bagging: Ensemble method that trains multiple models on random data subsets and aggregates predictions. (Generality: 694)
  • Ensemble Algorithm: Combines multiple models to boost predictive accuracy, robustness, and generalization. (Generality: 796)
  • Ensemble Methods: Combining multiple trained models to produce predictions stronger than any single model. (Generality: 771)
  • Ensemble Learning: Combining multiple models to produce predictions more accurate than any single model. (Generality: 836)
  • Decision Tree: A tree-structured model that makes predictions through sequential feature-based splits. (Generality: 838)
  • Out-of-Bag Evaluation: A built-in validation method for ensemble models using bootstrap sampling's unused data. (Generality: 492)