Envisioning is an emerging technology research institute and advisory.

Value Function

A function estimating expected cumulative reward from a given state or action.

Year: 1989
Generality: 842

In reinforcement learning (RL), a value function estimates the expected cumulative future reward, usually discounted, that an agent can obtain from a given situation. There are two primary variants: the state-value function V(s), which estimates the return expected when starting from state s and following a particular policy, and the action-value function Q(s, a), often called the Q-function, which estimates the expected return when taking action a in state s and following the policy thereafter. Together, these functions give an agent a way to evaluate how "good" a situation or decision is in the long run, not just immediately.
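To make the two definitions concrete, here is a minimal Python sketch. It assumes a finite reward sequence, a discount factor `gamma`, and a single state whose Q-values and policy probabilities are already known. The function names and the numbers are hypothetical, chosen only for illustration.

```python
def discounted_return(rewards, gamma):
    """Return sum over t of gamma**t * rewards[t] for a finite sequence."""
    g = 0.0
    # Work backward so each step is a single multiply-add.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def state_value_from_q(q_values, action_probs):
    """V(s) = sum over a of pi(a|s) * Q(s, a), for one state."""
    return sum(action_probs[a] * q_values[a] for a in q_values)


# Hypothetical trajectory: rewards 1, 0, 2 with gamma = 0.5.
g = discounted_return([1.0, 0.0, 2.0], gamma=0.5)  # 1 + 0.5*0 + 0.25*2 = 1.5

# Hypothetical Q-values and policy for one state.
q_s = {"left": 1.0, "right": 3.0}
pi_s = {"left": 0.25, "right": 0.75}
v_s = state_value_from_q(q_s, pi_s)  # 0.25*1 + 0.75*3 = 2.5
```

The second function shows how the two variants relate: under a given policy, V(s) is the policy-weighted average of Q(s, a) over actions.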

Value functions are always defined with respect to a policy, the strategy that dictates how the agent behaves. The Bellman equations, a set of recursive relationships, form the mathematical backbone of value function estimation. They express the value of a state as the expected immediate reward plus the discounted expected value of the successor state, so that value information propagates backward from later states to earlier ones. Dynamic programming applies these equations directly when a model of the environment is available, and temporal difference (TD) learning applies them to sampled transitions by bootstrapping from current estimates. Monte Carlo methods instead average complete sampled returns without bootstrapping, but they estimate the same quantity.
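The recursion can be sketched as iterative policy evaluation, one of the dynamic programming methods. This is a minimal illustration, not a reference implementation: it assumes a small tabular problem whose dynamics under a fixed policy are known, and the `transitions` layout, state names, and rewards are hypothetical.

```python
def evaluate_policy(transitions, gamma, tol=1e-10, max_sweeps=10_000):
    """Iterative policy evaluation with in-place Bellman backups.

    transitions[s] is a list of (probability, next_state, reward) tuples
    describing what happens from s under the fixed policy. A state with
    an empty list is terminal and keeps value 0.
    """
    values = {s: 0.0 for s in transitions}
    for _ in range(max_sweeps):
        delta = 0.0
        for s, outcomes in transitions.items():
            # Bellman expectation backup:
            # V(s) = sum over outcomes of p * (r + gamma * V(s'))
            new_v = sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:  # stop once a full sweep changes nothing noticeable
            break
    return values


# Hypothetical chain: A -> B (reward 1), B -> end (reward 2), end is terminal.
transitions = {
    "A": [(1.0, "B", 1.0)],
    "B": [(1.0, "end", 2.0)],
    "end": [],
}
values = evaluate_policy(transitions, gamma=0.9)
```

With these numbers V(B) is 2.0 and V(A) is about 2.8 (1 + 0.9 × 2): the reward received after B reaches A only through B's value, which is the backward propagation described above.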

Value functions are central to much of modern deep RL. Deep Q-Networks (DQN), introduced by DeepMind, approximated the Q-function with a deep neural network, enabling agents to reach human-level performance on many Atari games directly from raw pixels. Actor-critic architectures, which underlie widely used algorithms such as PPO and SAC, train a value function (the "critic") alongside the policy; in policy gradient methods the critic serves as a baseline that reduces the variance of gradient estimates, which improves training stability and sample efficiency. Value functions also underpin many approaches to reward shaping, exploration strategies, and safe RL.
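The two uses above come down to simple arithmetic on value estimates, sketched below in plain Python. This is a simplified illustration with hypothetical function names and numbers: a real DQN computes the next-state Q-values with a neural network (typically a separate target network), and PPO implementations commonly use a multi-step advantage estimator rather than the one-step form shown here.

```python
def q_learning_target(reward, next_q_values, gamma, done):
    """One-step target r + gamma * max over a' of Q(s', a').

    No bootstrapping when the episode has ended.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def one_step_advantage(reward, value_s, value_next, gamma, done):
    """TD-error advantage estimate: r + gamma * V(s') - V(s)."""
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


# Hypothetical numbers for a single transition.
target = q_learning_target(
    reward=1.0, next_q_values=[0.5, 2.0], gamma=0.5, done=False
)  # 1.0 + 0.5 * 2.0 = 2.0

advantage = one_step_advantage(
    reward=1.0, value_s=1.5, value_next=2.0, gamma=0.5, done=False
)  # 1.0 + 0.5 * 2.0 - 1.5 = 0.5
```

A positive advantage says the action turned out better than the critic expected from that state. Weighting the policy gradient by this difference, instead of by the raw return, is what reduces the variance.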

Beyond game-playing, value functions are central to real-world RL applications including robotics, recommendation systems, and fine-tuning large language models via reinforcement learning from human feedback (RLHF). Accurately estimating value — especially in high-dimensional, continuous spaces — remains one of the core challenges of the field, driving ongoing research into better function approximators, off-policy correction methods, and uncertainty-aware value estimation.

Related

Q-Value

Expected cumulative reward for taking an action in a given state under a policy.

Generality: 756
Q-Learning

A model-free reinforcement learning algorithm that learns optimal action values through experience.

Generality: 792
RL (Reinforcement Learning)

A learning paradigm where an agent maximizes cumulative rewards through environmental interaction.

Generality: 908
Utility Function

A mathematical function that quantifies an agent's preferences to guide optimal decision-making.

Generality: 720
Bellman Equation

Recursive formula for computing optimal value functions in sequential decision-making.

Generality: 838
DQN (Deep Q-Networks)

Reinforcement learning method combining Q-learning with deep neural networks for complex environments.

Generality: 694