The Billion-Dollar Distraction
OpenAI spent what we can only assume is an astronomical sum building Sora. Google poured resources into Veo. Midjourney became the darling of tech Twitter. Runway raised hundreds of millions. The entire AI discourse in late 2023 and early 2024 was dominated by one question: which model makes the best images and videos?
Meanwhile, we were asking a different question: who is actually saving 20 hours a week with an image generator?
The answer, when we talked to actual small and mid-sized businesses, was almost nobody.
The Hype Cycle We Saw Coming
Here is what happened. Image generation hit a cultural moment. DALL-E went viral. Midjourney produced outputs that made people gasp. Social media was flooded with AI-generated art, and suddenly every conference keynote featured a slide about "the future of creative AI." The labs saw the attention and doubled down. The VCs saw the attention and wrote checks. The media saw the attention and wrote breathless articles.
But attention is not value. Virality is not utility. And the ability to generate a photorealistic image of a cat wearing a business suit -- while genuinely impressive from a technical standpoint -- does not move the needle for a 40-person accounting firm, a regional logistics company, or a chain of dental offices.
We watched this unfold with a mix of admiration for the technology and frustration at the resource allocation. The AI labs have finite capital and finite engineering talent. Every dollar and every researcher pointed at video generation is a dollar and a researcher not pointed at the problems that actually affect how businesses operate day to day.
Where the Real ROI Lives
Here is what we know from working with dozens of SMBs, and what we laid out in our piece on AI strategy for SMBs: the highest-ROI AI applications are boring. They are not the kind of thing that goes viral on social media or gets demoed at a keynote. They are text analysis, document processing, workflow automation, data extraction, email drafting, code generation, and structured reasoning over business data.
Let us put some numbers on this.
A property management company we work with processes around 200 lease agreements a year. Before AI, each lease review took a paralegal 4 to 6 hours. Our text-based AI agent does it in minutes. That is 800 to 1,200 hours saved per year. One agent. One application. Text processing.
A marketing agency we work with spends roughly 15 hours per week on client reporting -- pulling data from multiple platforms, synthesizing it into narratives, formatting presentations. Our reporting agent handles the data synthesis and narrative generation automatically. That is up to 60 hours a month reclaimed for actual creative and strategic work. Again, text processing.
A financial services firm uses our document analysis agent to review loan applications. It extracts key data points, cross-references them against internal guidelines, flags discrepancies, and generates a summary for the underwriter. What used to take 45 minutes per application now takes 5. Across 300 applications a month, that is roughly 200 hours saved every month for a 12-person team. Text processing.
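For readers who want to see what the cross-referencing and flagging steps might look like, here is a minimal sketch. It is not our production agent. It assumes the extraction step has already turned an application into structured fields, and every field name and guideline threshold in it (`ExtractedApplication`, `GUIDELINES`, the numeric limits) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExtractedApplication:
    """Fields assumed to have been extracted from the documents upstream."""
    applicant: str
    stated_income: float    # annual income written on the application form
    verified_income: float  # annual income found in supporting documents
    monthly_debt: float
    credit_score: int


# Hypothetical internal guidelines; a real firm's rules will differ.
GUIDELINES = {
    "min_credit_score": 640,
    "max_debt_to_income": 0.40,
    "income_tolerance": 0.05,
}


def flag_discrepancies(app, rules=GUIDELINES):
    """Cross-reference extracted fields against the guidelines."""
    flags = []
    if app.credit_score < rules["min_credit_score"]:
        flags.append(f"credit score {app.credit_score} is below the minimum")
    monthly_income = app.verified_income / 12
    dti = app.monthly_debt / monthly_income if monthly_income > 0 else float("inf")
    if dti > rules["max_debt_to_income"]:
        flags.append(f"debt-to-income ratio {dti:.2f} is above the limit")
    gap = abs(app.stated_income - app.verified_income)
    if app.stated_income > 0 and gap / app.stated_income > rules["income_tolerance"]:
        flags.append("stated income does not match supporting documents")
    return flags


def summarize(app, flags):
    """Build the short plain-text summary an underwriter would read first."""
    status = "needs review" if flags else "no discrepancies found"
    return "\n".join([f"{app.applicant}: {status}"] + [f"- {f}" for f in flags])


# Example with made-up values.
example = ExtractedApplication("Sample Applicant", 80_000, 60_000, 2_500, 600)
print(summarize(example, flag_discrepancies(example)))
```

The point of the sketch is the division of labor: the language model does the reading and extraction, while the guideline checks stay as plain, auditable rules that the underwriter can inspect.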
Now ask yourself: how many hours do any of these businesses save by generating AI images? Maybe their marketing person uses Midjourney for a social media post here and there. Call it one or two hours a week. The ROI comparison is not even close.
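To make the comparison concrete, here is the back-of-the-envelope arithmetic using only the figures quoted above. The simplifications are ours: 12 months and 52 weeks per year, and the agent's own run time treated as negligible.

```python
# Rough annual hours saved, using only figures quoted in the text.
# Simplifying assumptions (ours): 12 months and 52 weeks per year, and
# the agent's own run time is treated as negligible.

MONTHS_PER_YEAR = 12
WEEKS_PER_YEAR = 52

# Lease review: about 200 leases a year, 4 to 6 paralegal hours each.
lease_low = 200 * 4
lease_high = 200 * 6

# Client reporting: about 60 hours a month reclaimed.
reporting = 60 * MONTHS_PER_YEAR

# Loan review: 300 applications a month, 45 minutes each down to 5.
loan = 300 * (45 - 5) * MONTHS_PER_YEAR // 60

# Image generation: one or two hours a week.
image_low = 1 * WEEKS_PER_YEAR
image_high = 2 * WEEKS_PER_YEAR

print(f"Lease review:     {lease_low:,} to {lease_high:,} hours/year")
print(f"Client reporting: {reporting:,} hours/year")
print(f"Loan review:      {loan:,} hours/year")
print(f"Image generation: {image_low} to {image_high} hours/year")
```

Even the smallest text-processing figure here is several times the upper estimate for image generation.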
The Image AI Use Case Is Real But Narrow
We are not saying image and video AI are worthless. That would be a bad take and we try to avoid those. There are legitimate use cases:
- Marketing teams can generate social media imagery, ad variations, and concept mockups faster and cheaper than stock photography or freelance designers.
- Product teams can prototype visual concepts before committing to full production.
- E-commerce businesses can generate product photography variations at scale.
- Creative agencies can use image AI as a brainstorming tool for early concept development.
These are real applications that deliver real value. But they are a fraction of the total value that AI can deliver to a business. If you are a design agency, image AI might be central to your operations. If you are literally any other kind of business, it is a nice-to-have peripheral tool while text, reasoning, and automation are the main event.
The Labs Are Chasing Hollywood, Not Main Street
There is a revealing pattern in how the major AI labs allocate their resources. OpenAI partnered with Hollywood studios. Runway positioned itself as a filmmaking tool. Google Veo demos featured cinematic scenes. The aspiration is clear: they want to be the infrastructure that powers the entertainment industry.
That is a valid market. It is also a market that employs a tiny fraction of the workforce and represents a tiny fraction of GDP compared to the millions of SMBs that run on documents, emails, spreadsheets, and operational workflows.
The labs that win the long game -- as we argued in our follow-up on where the real business value in AI lives -- will not be the ones that make the best movie trailer. They will be the ones whose models save a million businesses 10 hours a week each. That is 10 million hours of productivity per week. The economic value of that dwarfs anything Hollywood will ever pay for AI video generation.
We think this is a capital allocation mistake that the market will eventually correct. But in the meantime, it means that the boring-but-valuable text and reasoning capabilities are somewhat under-invested relative to their economic potential. Which, frankly, is fine for companies like ours -- less competition in the space where the actual money is.
Why Businesses Fall for the Image AI Pitch
We get it. Image generation is visceral. You type a prompt, and in 30 seconds you have a picture. The output is visual, tangible, shareable. It is the kind of thing you show your spouse over dinner: "Look what AI can do."
Text-based AI is harder to demonstrate. "I fed it a 150-page contract and it extracted all the key terms and identified three clauses that deviate from our standard template" is not the kind of thing that gets likes on LinkedIn. But it is the kind of thing that saves a business $50,000 a year in paralegal hours.
We have seen this dynamic play out in client conversations. A business owner comes to us excited about image generation because that is what they have seen in the news. They want to know if AI can make their marketing materials. We say yes, it can, and that might save you a few hours a month. Then we ask: how much time does your team spend on data entry, document review, email management, report generation, and manual process coordination? The answer is always a staggering number. That is where the real opportunity is, and it does not require a single generated image.
What We Tell Our Clients
Our advice is consistent: start with the workflows, not the visuals. Map out where your team spends time on repetitive, text-heavy, or data-heavy tasks. That is where AI delivers transformative ROI -- and where the numbers consistently back up the investment. Image and video generation can be layer two or layer three of your AI adoption strategy -- the icing, not the cake.
Specifically, we recommend this prioritization for most SMBs:
- First priority: Document analysis and processing. Contracts, proposals, reports, compliance documents. The stuff that eats hours and requires careful attention to detail.
- Second priority: Communication automation. Email drafting, client responses, internal updates. High-volume, repetitive writing tasks that follow established patterns.
- Third priority: Data extraction and reporting. Pulling information from unstructured sources, synthesizing it into structured formats, generating narrative summaries.
- Fourth priority: Code and workflow automation. Building custom tools, automating multi-step processes, integrating systems.
- Fifth priority: Creative and visual applications. Marketing imagery, presentation graphics, social media content.
Notice that image generation is last. Not because it lacks value, but because the items above it deliver more value per dollar and per hour of implementation effort for the vast majority of businesses.
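One rough way to check this ordering against your own business is to estimate, for each workflow, how many hours it could save per week and how long it would take to set up, then rank by payback time. The sketch below assumes exactly that scoring rule. The `Workflow` fields are our own invention, and the numbers in the example are placeholders, not client data, so substitute your own estimates.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    hours_spent_per_week: float  # time the team spends on it today
    share_automatable: float     # estimate between 0.0 and 1.0
    setup_hours: float           # one-time implementation effort


def weekly_hours_saved(w):
    return w.hours_spent_per_week * w.share_automatable


def payback_weeks(w):
    """Weeks until the hours saved cover the hours spent on setup."""
    saved = weekly_hours_saved(w)
    return float("inf") if saved == 0 else w.setup_hours / saved


def rank(workflows):
    """Shortest payback first."""
    return sorted(workflows, key=payback_weeks)


# Placeholder estimates for illustration only.
candidates = [
    Workflow("Social media imagery", 2, 0.5, 10),
    Workflow("Contract review", 20, 0.7, 40),
    Workflow("Email drafting", 10, 0.5, 20),
]

for w in rank(candidates):
    print(f"{w.name}: {weekly_hours_saved(w):.1f} h/week saved, "
          f"payback in {payback_weeks(w):.1f} weeks")
```

Payback time is a deliberately crude metric. It ignores error costs and response times, which often strengthen the case for document and communication workflows even further.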
The Market Will Correct This
We are not worried about the hype cycle. Hype cycles correct themselves. The businesses that invested heavily in image generation tools without first addressing their core operational inefficiencies will eventually realize the mismatch. The ones that chased Sora demos instead of building document processing pipelines will see their competitors pull ahead on the metrics that actually matter: margins, throughput, response times, and operational leverage.
The labs will adjust too. As enterprise customers start spending real money on AI, the demand signal will shift from "make me a pretty picture" to "help me process 10,000 invoices a month." Follow the revenue, and you will see where the investment eventually goes.
In the meantime, if you are a business owner trying to figure out where to start with AI, ignore the flashy demos. Look at where your team spends their time. Look at where errors are costly. Look at where information moves slowly between people and systems. That is where AI changes everything. And none of it requires generating a single image.