Ten Months Later, Here's What Changed
Back in September 2024, we wrote a post arguing that most AI labs were chasing the wrong problems. The flashy demos -- video generation, image creation, voice cloning -- were sucking up billions in capital while the actual business use cases remained underserved. We said the labs focused on reasoning and workflow automation would win the enterprise market.
Ten months later, let us look at what actually happened.
OpenAI shipped Sora. It generates impressive videos. It also has minimal business adoption, ongoing content moderation challenges, and a pricing model that makes no sense for any workflow we have encountered in two years of consulting. Every marketing agency we work with tried it. None of them use it in production. The gap between "impressive demo" and "useful business tool" turned out to be exactly as wide as we predicted.
Meanwhile, Claude went from a good chatbot to the backbone of our entire consulting practice -- something we unpacked in detail in our comparison of Claude vs. ChatGPT for business. Claude Code transformed how we build software. Opus 4 and Sonnet 4 handle complex business logic with a reliability that lets us deploy agents in production without losing sleep. Anthropic did not ship a single flashy consumer demo in that ten-month window. They shipped reasoning improvements, tool use capabilities, and developer infrastructure. Boring. Transformative.
The Distraction Economy in AI
Here is what the labs chasing pixels got wrong. They optimized for Twitter impressions instead of enterprise contracts. A stunning AI-generated video gets 50 million views. A 15% reduction in document processing time gets zero views and $2 million in annual contract value.
The incentive mismatch is real. Consumer AI captures attention. Enterprise AI captures revenue. And the labs that confused the former for the latter are now scrambling to pivot.
Look at the trajectory. Stability AI faced severe financial challenges chasing image generation and is a fraction of its former self. Runway is impressive but has not cracked sustainable unit economics for video generation. The "creative AI" wave generated enormous hype and modest revenue.
Now look at the other side. Anthropic's revenue growth has been staggering, driven almost entirely by API usage from businesses building real applications. Not content generation toys. Operational tools. The kind of software that processes invoices, analyzes contracts, manages customer communications, and automates the tedious work that costs businesses thousands of hours per year.
The VC Attention Shift
The venture capital market has caught up to this reality, and the shift in funding categories tells the story.
In early 2024, "AI for content creation" was the hottest category in venture funding. Every pitch deck featured generative media. By mid-2025, the VC conversation has shifted dramatically toward "AI for workflow automation," "AI agents for enterprise," and "vertical AI applications." The money follows the revenue, and the revenue is in business operations.
We see this in our own deal flow. A year ago, prospective clients would ask us about AI-generated marketing content. Today, they ask about AI agents that can handle their accounts receivable, manage their inventory reordering, or automate their client reporting. The questions got more specific, more operational, and more tied to measurable ROI.
This is not a subtle shift. It is a wholesale reorientation of where the smart money thinks AI value creation is happening.
Reasoning Engines vs. Pixel Machines
The fundamental thesis is simple: businesses do not need AI that creates. They need AI that thinks.
Creation -- generating images, videos, music, voices -- is a feature, not a platform. Features get commoditized. The cost of generating an AI image has dropped dramatically in two years and will continue falling. There is no durable competitive advantage in being marginally better at something that is rapidly approaching free.
Reasoning is different. The ability to read a complex document and extract the right information. The ability to follow a 3,000-word system prompt consistently across thousands of interactions. The ability to evaluate multiple options against business criteria and make a sound recommendation. The ability to write code that actually works in production.
These capabilities have compounding value. A reasoning engine that is 10% better at following instructions does not generate 10% more value -- it generates disproportionately more value, because it unlocks use cases that were impossible at the lower capability level. An image generator that is 10% better at rendering hands is just... slightly better at rendering hands.
Anthropic understood this from day one. Their entire research agenda has been oriented around making Claude smarter, more reliable, and more steerable -- not more visually impressive. Every capability improvement they ship directly translates into new business applications. That is not an accident. That is a strategy.
What We Have Seen in the Field
In the last ten months, here is what has actually saved our clients money and time.
- An accounting firm automated 60% of their client document intake using a Claude-powered agent. Annual savings: six figures in staff time. Zero of that value came from image generation.
- A logistics company deployed an agent that monitors shipment status across four carrier APIs, identifies delays before they happen, and automatically communicates with affected customers. The agent handles hundreds of daily updates. Not a single pixel was generated.
- A law firm uses Claude to review incoming contracts against their standard terms, flagging deviations and generating redline recommendations. What took a junior associate two hours now takes eight minutes. The AI's value is entirely in reading, reasoning, and writing -- not in creating visual content.
- An e-commerce operation built a customer service agent that handles the majority of incoming inquiries without human intervention, with a customer satisfaction score actually higher than their human team's average. The agent's power is comprehension and judgment, not content creation.
Every single high-value deployment we have built is a reasoning application. Every one. And we are not unique -- this pattern is consistent across the industry.
Businesses do not need AI that creates. They need AI that thinks. Creation is a feature. Reasoning is a platform.
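The logistics deployment above follows a shape we see again and again: poll operational data sources, apply judgment, act. Here is a minimal sketch of that pattern. Every name in it (fetch_statuses, likely_delayed, draft_notice, the sample data) is a hypothetical stand-in, and the reasoning step is stubbed with a simple rule so the sketch runs on its own -- in a real deployment, that is where a model like Claude weighs carrier history, routes, and context.

```python
# Minimal sketch of the monitor-judge-notify agent pattern.
# All names and data here are illustrative, not a real carrier integration.
from dataclasses import dataclass


@dataclass
class Shipment:
    tracking_id: str
    customer_email: str
    promised_days: int  # delivery window promised to the customer
    eta_days: int       # carrier's current estimate


def fetch_statuses(carrier: str) -> list[Shipment]:
    """Stand-in for a real carrier API client (one per carrier)."""
    sample = {
        "carrier_a": [Shipment("A-100", "kim@example.com", 3, 5)],
        "carrier_b": [Shipment("B-200", "lee@example.com", 2, 2)],
    }
    return sample.get(carrier, [])


def likely_delayed(s: Shipment) -> bool:
    """Stub for the judgment call. In production this is where a reasoning
    model earns its keep; here it is a plain threshold so the sketch runs."""
    return s.eta_days > s.promised_days


def draft_notice(s: Shipment) -> str:
    """Stub for model-drafted customer communication."""
    return (f"To {s.customer_email}: shipment {s.tracking_id} now expects "
            f"delivery in {s.eta_days} days, "
            f"{s.eta_days - s.promised_days} day(s) past the promised window.")


def run_agent(carriers: list[str]) -> list[str]:
    """One polling cycle: check every carrier, flag delays, draft notices."""
    notices = []
    for carrier in carriers:
        for shipment in fetch_statuses(carrier):
            if likely_delayed(shipment):
                notices.append(draft_notice(shipment))
    return notices


if __name__ == "__main__":
    for notice in run_agent(["carrier_a", "carrier_b"]):
        print(notice)
```

The design point is that the creative surface area is tiny -- one templated message -- while all of the value sits in the judgment step, which is exactly why reasoning quality, not generation quality, determines whether an agent like this can run unsupervised.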
The Labs That Are Winning
Anthropic is the obvious example -- and a big part of why we bet on them early -- but they are not alone. The labs that are winning the enterprise market share a common trait: they prioritized reliability over spectacle.
Google's Gemini has made quiet, significant progress in enterprise applications. Their integration with Workspace tools -- analyzing spreadsheets, drafting documents from meeting transcripts, managing email workflows -- is the kind of unglamorous, high-value work that actually drives adoption.
The coding-focused AI companies -- Cursor, Replit, the tools powering vibe coding -- are building on reasoning models, not generative media models. The entire wave of "anyone can build software" is powered by AI that thinks, not AI that draws.
Even OpenAI seems to be recalibrating. Their recent focus on reasoning models (o-series), function calling improvements, and enterprise API features suggests they have recognized that the business market demands substance over style. Better late than never.
Doubling Down
The trend has only accelerated. The data from the last ten months is unambiguous.
The labs chasing pixels are losing the enterprise market to the labs building reasoning engines. Not slowly. Rapidly. Every quarter, the gap between "AI that impresses" and "AI that works" grows wider, and the businesses signing contracts are choosing the latter.
If you are evaluating AI for your business -- and our guide on how SMBs will adopt AI is a good starting point -- ignore the demos. Ignore the viral tweets. Ignore the "look what AI can generate" posts. Ask one question: can this AI reliably do boring, important work -- reading documents, following instructions, making decisions, processing data -- day after day, without supervision?
That is the bar. And right now, the labs that never cared about going viral are the ones clearing it.