
LSTM (Long Short-Term Memory)

A recurrent neural network architecture that learns long-range dependencies in sequential data.

Year: 1997 · Generality: 838

Long Short-Term Memory (LSTM) networks are a specialized type of recurrent neural network (RNN) designed to overcome the fundamental limitation that plagued earlier sequence models: the inability to retain relevant information across long time spans. Standard RNNs suffer from the vanishing gradient problem, where gradients shrink exponentially during backpropagation through time, making it nearly impossible to learn connections between events separated by many steps. LSTMs, introduced by Sepp Hochreiter and Jürgen Schmidhuber in 1997, address this by replacing the simple recurrent unit with a more sophisticated memory cell capable of preserving state across hundreds or even thousands of timesteps.
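
To make the failure mode concrete, here is a minimal sketch of the effect (illustrative NumPy code, not drawn from any particular framework): backpropagation through time multiplies the gradient by the recurrent Jacobian at every step, so when that matrix's spectral norm sits below one, the signal decays exponentially with distance.

```python
import numpy as np

# Toy demonstration of the vanishing gradient problem in a vanilla RNN.
# Each step back through time multiplies the gradient by the recurrent
# Jacobian; with spectral norm below 1, it shrinks exponentially.

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 16))        # recurrent weight matrix
W *= 0.9 / np.linalg.norm(W, 2)      # rescale so its spectral norm is 0.9

for steps in (1, 10, 50, 100):
    g = np.ones(16)                  # gradient arriving at the final step
    for _ in range(steps):
        g = W.T @ g                  # one step back through time
        # (the tanh derivative, omitted here, only shrinks g further)
    print(f"{steps:4d} steps back: gradient norm = {np.linalg.norm(g):.3e}")
```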

The architecture's power comes from three learned gating mechanisms that control information flow. The forget gate decides what stored information to discard from the cell state. The input gate determines which new information to write into memory. The output gate controls what portion of the cell state is exposed as the unit's activation. Together, these gates allow the network to selectively remember, update, and expose information, giving it fine-grained control over what it retains across a sequence. This design creates a gradient highway through the cell state that resists vanishing, enabling stable learning over long contexts.
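
As a sketch, a single LSTM timestep can be written in a few lines of NumPy. This assumes one fused weight matrix `W` mapping the concatenated input and previous hidden state to all four gate pre-activations; the names are illustrative, not taken from any specific library.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM timestep; W maps [x; h_prev] to the four gate pre-activations."""
    z = W @ np.concatenate([x, h_prev]) + b
    f, i, o, g = np.split(z, 4)       # forget, input, output, candidate
    f, i, o = sigmoid(f), sigmoid(i), sigmoid(o)
    g = np.tanh(g)                    # candidate values to write into memory
    c = f * c_prev + i * g            # forget old content, admit new content
    h = o * np.tanh(c)                # expose a gated view of the cell state
    return h, c

# Run the cell over a short random sequence.
d_x, d_h = 8, 4
rng = np.random.default_rng(1)
W = rng.normal(scale=0.1, size=(4 * d_h, d_x + d_h))
b = np.zeros(4 * d_h)
h, c = np.zeros(d_h), np.zeros(d_h)
for x in rng.normal(size=(20, d_x)):
    h, c = lstm_step(x, h, c, W, b)
print("final hidden state:", h)
```

The additive update `c = f * c_prev + i * g` is the gradient highway described above: when the forget gate saturates near one, the cell state, and its gradient, passes from step to step almost unchanged. For this reason, implementations often initialize the forget-gate bias toward positive values so the cell starts out remembering by default.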

LSTMs became a dominant architecture throughout the 2000s and 2010s, achieving state-of-the-art results in speech recognition, machine translation, language modeling, handwriting recognition, and time series forecasting. Variants such as the peephole LSTM and the Gated Recurrent Unit (GRU) refined the original design, while bidirectional LSTMs extended the approach to process sequences in both directions simultaneously. Their ability to model temporal dependencies made them the go-to tool for any task where order and context matter.

Although Transformer-based architectures have largely supplanted LSTMs in natural language processing since the late 2010s, LSTMs remain widely used in domains where sequence length is moderate, computational resources are constrained, or streaming inference is required. They represent a foundational milestone in deep learning, demonstrating that neural networks could be engineered to maintain and exploit long-range memory in a principled, trainable way.

Related

RNN (Recurrent Neural Network)
Neural networks with feedback connections that process sequential data using internal memory.
Generality: 838

xLSTM (Extended Long Short-Term Memory)
A modernized LSTM architecture with exponential gating and parallelizable memory structures.
Generality: 420

Gating Mechanism
A learned control system that selectively regulates information flow through a neural network.
Generality: 781

Sequential Models
AI models that process ordered data by capturing dependencies across time or position.
Generality: 795

LNN (Liquid Neural Network)
A recurrent neural network that continuously adapts its internal state to process time-varying data.
Generality: 339

Neural Long-Term Memory Module
An explicit memory subsystem enabling neural networks to store and retrieve information persistently.
Generality: 441