Skip to main content

Envisioning is an emerging technology research institute and advisory.

LinkedInInstagramGitHub

2011 — 2026

research
  • Reports
  • Newsletter
  • Methodology
  • Origins
  • Vocab
services
  • Research Sessions
  • Signals Workspace
  • Bespoke Projects
  • Use Cases
  • Signal Scanfree
  • Readinessfree
impact
  • ANBIMAFuture of Brazilian Capital Markets
  • IEEECharting the Energy Transition
  • Horizon 2045Future of Human and Planetary Security
  • WKOTechnology Scanning for Austria
audiences
  • Innovation
  • Strategy
  • Consultants
  • Foresight
  • Associations
  • Governments
resources
  • Pricing
  • Partners
  • How We Work
  • Data Visualization
  • Multi-Model Method
  • FAQ
  • Security & Privacy
about
  • Manifesto
  • Community
  • Events
  • Support
  • Contact
  • Login
ResearchServicesPricingPartnersAbout
ResearchServicesPricingPartnersAbout
  1. Home
  2. Vocab
  3. DP (Dynamic Programming)

DP (Dynamic Programming)

An optimization technique that solves complex problems by caching solutions to overlapping subproblems.

Year: 1990Generality: 838
Back to Vocab

Dynamic programming (DP) is an algorithmic technique for solving optimization and decision problems by decomposing them into smaller, overlapping subproblems, solving each subproblem exactly once, and storing the results in a lookup table or memoization cache. Rather than recomputing the same intermediate values repeatedly — as naive recursion would — DP reuses stored solutions, transforming many exponential-time algorithms into polynomial-time ones. The approach applies whenever a problem exhibits two key properties: optimal substructure (an optimal solution can be built from optimal solutions to subproblems) and overlapping subproblems (the same subproblems recur across different branches of computation).

In practice, DP can be implemented either top-down (recursive with memoization, where results are cached as they are first computed) or bottom-up (iterative tabulation, where smaller subproblems are solved first and results are stored in a table before tackling larger ones). Classic examples include computing Fibonacci numbers, shortest-path algorithms like Bellman-Ford, sequence alignment in bioinformatics, and the knapsack problem in combinatorial optimization. The technique is foundational to a wide range of algorithms across computer science and operations research.

In machine learning and reinforcement learning specifically, dynamic programming plays a central role. RL algorithms such as policy iteration and value iteration rely directly on DP principles to compute optimal policies in Markov decision processes (MDPs). Bellman's optimality equations — which express the value of a state as a function of the values of successor states — are the mathematical backbone of modern RL. Even deep RL methods like Q-learning and temporal difference learning are conceptually grounded in DP, approximating the Bellman equations when the state space is too large for exact tabular solutions.

Beyond reinforcement learning, DP appears in training sequence models, decoding algorithms for structured prediction (such as the Viterbi algorithm for hidden Markov models), and attention mechanisms in certain architectures. Its ability to make intractable recursive computations feasible makes it one of the most broadly applicable algorithmic paradigms in both classical and modern machine learning.

Related

Related

DRL (Deep Reinforcement Learning)
DRL (Deep Reinforcement Learning)

Neural networks combined with reinforcement learning to master complex sequential decision-making tasks.

Generality: 796
RL (Reinforcement Learning)
RL (Reinforcement Learning)

A learning paradigm where an agent maximizes cumulative rewards through environmental interaction.

Generality: 908
Q-Learning
Q-Learning

A model-free reinforcement learning algorithm that learns optimal action values through experience.

Generality: 792
DQN (Deep Q-Networks)
DQN (Deep Q-Networks)

Reinforcement learning method combining Q-learning with deep neural networks for complex environments.

Generality: 694
DL (Deep Learning)
DL (Deep Learning)

A machine learning approach using multi-layered neural networks to model complex data patterns.

Generality: 928
Bellman Equation
Bellman Equation

Recursive formula for computing optimal value functions in sequential decision-making.

Generality: 838