Envisioning is an emerging technology research institute and advisory.



An Apocaloptimist — Artificial Insights | Envisioning

An Apocaloptimist

Issue 130 · February 23, 2026

A question I’ve been thinking about this week: Are AI tokens massively underpriced right now?
When you look at the full AI stack — multi-billion dollar training runs, energy-hungry data centers, specialized chips, safety teams, infrastructure build-outs — the current cost per token feels surprisingly low.
Why?
Because pricing today is shaped less by marginal cost and more by strategy.
We are in a land-grab phase.
Providers are compressing prices to:
  • Expand usage.
  • Attract developers.
  • Lock in ecosystems.
  • Build habits around their APIs.
In other words, growth first. Margin later.
But there’s a second layer to this story.
The marginal cost of inference has fallen dramatically:
  • Better hardware.
  • Quantization and distillation.
  • Smarter routing between large and small models.
  • More efficient architectures.
So while the full system is expensive, the cost of producing one more token has been falling fast.
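The gap between the two framings can be made concrete with a back-of-envelope calculation. All figures below are illustrative assumptions for the sake of the arithmetic, not real provider numbers:

```python
# Back-of-envelope: fully loaded vs. marginal cost per token.
# Every number here is a made-up assumption, for illustration only.

def cost_per_token(total_cost_usd: float, tokens_served: float) -> float:
    """Average cost in USD per token, given total cost and tokens served."""
    return total_cost_usd / tokens_served

fully_loaded = 10e9    # assumed: $10B/yr incl. training, data centers, staff
marginal_only = 0.5e9  # assumed: $0.5B/yr in pure inference compute
tokens = 1e15          # assumed: one quadrillion tokens served per year

avg = cost_per_token(fully_loaded, tokens)
marginal = cost_per_token(marginal_only, tokens)

print(f"fully loaded: ${avg * 1e6:.2f} per million tokens")   # $10.00
print(f"marginal:     ${marginal * 1e6:.2f} per million tokens")  # $0.50
```

Under these hypothetical numbers, a price set anywhere between the two lines covers the marginal cost of each extra token while recovering only a fraction of the full system cost, which is exactly the tension described above.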
That creates an interesting tension.
Over the next 5–10 years, two things will likely happen at the same time:
  1. Commodity inference will get dramatically cheaper. Many tasks will run on smaller, specialized models. Local inference will improve. The price per token for everyday reasoning should fall sharply.
  2. Frontier cognitive capability will be priced strategically. If advanced systems begin to automate high-value cognitive work, pricing won’t revolve around tokens. It will revolve around value: agent hours, workflow automation, productivity replacement.
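The two-tier structure above is already visible in how systems route work between models. A minimal sketch, with hypothetical model names and prices, might look like this:

```python
# Minimal sketch of cost-aware routing between a cheap commodity model
# and an expensive frontier model. Names and prices are hypothetical.

from dataclasses import dataclass

@dataclass
class Model:
    name: str
    usd_per_million_tokens: float

SMALL = Model("small-commodity-model", 0.10)   # assumed commodity price
LARGE = Model("large-frontier-model", 15.00)   # assumed frontier price

def route(task_complexity: float, threshold: float = 0.7) -> Model:
    """Send easy tasks (complexity below threshold) to the cheap model."""
    return LARGE if task_complexity >= threshold else SMALL

# Everyday reasoning goes to the commodity tier...
assert route(0.3) is SMALL
# ...while high-value cognitive work pays frontier prices.
assert route(0.9) is LARGE
```

The design choice is the point: the cheap tier absorbs volume, while the expensive tier is reserved for the tasks whose value justifies it, which is one way value-based pricing emerges on top of token-based pricing.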
In that world, tokens stop being the real unit of economics.
We’ve seen this pattern before.
Cloud storage became cheaper per gigabyte, but total cloud spending exploded. The cheap layer expanded the market. The premium layer captured margin.
AI may follow a similar structure:
  • Cheap, abundant inference at the base.
  • Premium, high-capability reasoning at the top.
  • Vertical AI systems capturing the largest share of value.
So will tokens be cheaper in 10 years?
Almost certainly, at the commodity level.
Will advanced AI be economically cheap?
Unlikely.
The more interesting question is who captures the productivity surplus created by AI — model providers, application builders, or end users.
If the gains are large enough, token pricing becomes secondary.
We may look back at today’s pricing as the phase where providers were subsidizing adoption in order to define the future rails of cognition.
MZ

Video Links

The AI-Panic Cycle — Anil Dash in The Atlantic
“A huge part of the cultural tension around these things is everybody advocating them is like why wouldn’t you love this and everybody whose industry is being….”
The AI Agent Economy Is — Y Combinator
“For one thing, Claude Code has totally taken over my life.”
These AI Prompts Exposed My — Daniel Pink
“Most people use AI to write emails or summarize articles.”
How to Build an AI — Peter Yang
“I want you to create a product that you can build entirely on your own that will make money.”
This closely matches my experience
“Well, sitting on my desk is a new Mac Mini that I set up just for the purpose of running my team of AI agents using OpenClaw.”
A very cool episode of How I AI
“Right before I started college, I ended up losing most of my central vision due to a rare genetic disorder called Leber’s hereditary optic neuropathy.”
THE AI DOC: OR HOW — Focus Features
“If this technology goes wrong, it can go quite wrong.”

Apple TV recently announced a Neuromancer series, which feels like a good excuse to share Wintermute — our research hub on AI systems, autonomous agents, and synthetic cognition, named after one of the AIs in the book. Gibson imagined most of this in 1984. We’re now tracking it as emerging infrastructure. Some things worth exploring inside: wafer-scale AI systems, edge neuromorphic processors, and photonic accelerators. Share with anyone who’s read the book — or should.
envisioning.com/wintermute

If Artificial Insights resonates with you, please help us out by:
  • Subscribing to the weekly newsletter on Substack.
  • Following the weekly newsletter on LinkedIn.
  • Forwarding this issue to colleagues and friends.
  • Sharing on your socials.
Artificial Insights is written by Michell Zappa, CEO of Envisioning.

More from Artificial Insights

Mar 8, 2026 · Issue 131
Prompt it into existence
Feb 9, 2026 · Issue 129
Agent in the Loop
