Credit Where It Is Due. OpenAI Shipped.
We build on Claude. We recommend Claude to our clients. We have written at length about why we bet on Anthropic over OpenAI. None of that has changed.
But intellectual honesty requires us to say this: 2025 was the most impressive product year in OpenAI's history. Not the most profitable. Not the most strategically sound. But in terms of sheer volume of genuinely useful products shipped, OpenAI ran the table.
From reasoning models that changed how we think about AI capabilities, to a search product that challenges Google's core business, to an agent that can browse the web for you, OpenAI shipped more consequential products in a single year than most AI companies ship in their entire existence. Ignoring that because we prefer a different platform would do our readers a disservice.
You can respect what a competitor built without abandoning your own platform. OpenAI shipped an extraordinary year of products. We still build on Claude. Both statements are true simultaneously.
The Product Timeline That Mattered
OpenAI Product Timeline: Late 2024 - 2025
===========================================
SEP 2024  o1-preview & o1-mini
          First reasoning models. "Thinking" before responding.
          Math: 83% on AIME (vs 13% for GPT-4o).
OCT 2024  ChatGPT Search (announced)
          AI-powered web search inside ChatGPT.
          Rolled out to all users by Feb 2025.
DEC 2024  Full o1 release
          34% fewer errors than preview.
          Available to Plus, Team, Enterprise.
JAN 2025  GPT Store expansion + Custom GPTs maturity
          Marketplace hits critical mass.
          10M+ custom GPTs created by users.
JAN 2025  Operator (announced Jan 23)
          AI agent that browses the web for you.
          Powered by Computer-Using Agent (CUA).
APR 2025  o3 + o4-mini
          Next-gen reasoning. 88.9% on AIME 2025.
          69.1% on SWE-bench Verified (coding).
MAY 2025  GPT-4o fully replaces GPT-4
          Faster, cheaper, multimodal native.
          Free tier access for all ChatGPT users.
JUN 2025  o3-pro
          Highest-capability reasoning model.
          Pro and Team tier exclusive.
AUG 2025  GPT-5
          Unified model: reasoning + multimodal.
          Merges GPT and o-series architectures.
The Reasoning Models Changed the Game
The o-series models are the headline story. When o1-preview launched in September 2024, it introduced a genuinely new paradigm: models that think before they respond. The benchmarks were not incremental improvements -- they were step-function jumps. PhD-level science reasoning. Competition-level mathematics. Coding performance on Codeforces that moved from the 11th percentile with GPT-4o to the 89th.
Then came o3 in April 2025, which pushed the reasoning frontier even further: 88.9 percent on the 2025 American Invitational Mathematics Examination (AIME), 69.1 percent on SWE-bench Verified for real-world coding tasks, and a threefold improvement over o1 on the ARC-AGI benchmark for abstract reasoning.
We compared the major AI platforms earlier this year, and the reasoning models were a key differentiator for OpenAI in that analysis. Anthropic is building reasoning into Claude, but as of mid-2025, OpenAI's o-series remains the benchmark for raw reasoning capability.
ChatGPT Search Is a Legitimate Threat to Google
ChatGPT Search might be the most strategically important product OpenAI shipped all year. Announced in October 2024 and rolled out to all users by February 2025, it lets ChatGPT search the web in real time and synthesize results into direct answers with citations.
The user experience is better than Google's for a significant category of queries. When you want a synthesized answer rather than a list of blue links, ChatGPT Search delivers. Queries like "Compare the pros and cons of React vs Next.js for a small SaaS" or "What are the current HIPAA requirements for AI data handling?" get better answers from ChatGPT Search than from traditional search.
For businesses, this has implications beyond convenience. As we discussed in our piece on the end of SaaS as we know it, the way people discover and evaluate products is shifting. If your potential customers are using ChatGPT Search to research solutions, your SEO strategy needs to account for AI-generated answers, not just traditional search rankings.
Operator: The Agent Bet
OpenAI launched Operator on January 23, 2025, as a research preview for Pro subscribers. It is an AI agent that can browse the web, fill out forms, navigate websites, and complete multi-step tasks autonomously.
The technology behind it -- the Computer-Using Agent model -- is genuinely impressive from a research perspective. It achieved an 87 percent success rate on the WebVoyager benchmark for web-based tasks. But in practice, the experience is inconsistent. It handles well-structured websites reasonably well and struggles with anything complex, dynamic, or authentication-gated.
Operator matters not because it is production-ready today, but because it signals OpenAI's commitment to the agent paradigm. Every major AI company is building agents. The question is whose agents will be reliable enough for business-critical workflows first. Based on what we have seen, Anthropic's approach -- building agent capabilities into Claude Code and the MCP ecosystem -- is more practically useful right now. But OpenAI is investing heavily, and Operator will improve.
Custom GPTs and the GPT Store
The GPT Store, which launched in January 2024, hit its stride in 2025. The ability to create custom GPTs without code -- essentially mini AI applications with specific instructions, knowledge bases, and tool access -- democratized AI tool creation in a way that matters for businesses.
We have seen non-technical business owners create functional customer service bots, internal process guides, and specialized research tools using custom GPTs. The quality ceiling is lower than what you get from a purpose-built Claude agent, but the barrier to entry is dramatically lower too. For quick internal tools and prototypes, custom GPTs are a legitimate option.
The Profitability Question
Here is where our admiration for OpenAI's product velocity meets our skepticism about their business model. As we noted in our analysis of the Sora shutdown, OpenAI is projecting massive losses. Shipping excellent products and losing money on every one of them is not a sustainable strategy.
The reasoning models are compute-intensive. ChatGPT Search requires real-time web crawling. Operator requires persistent browser sessions. Every product OpenAI shipped in 2025 costs more to run than the simpler models that came before it. The question is not whether these products are good -- they are. The question is whether OpenAI can turn product excellence into a viable business before the cash runs out.
The gap between "best products shipped" and "best business built" is the gap between OpenAI and Anthropic. One company is winning on shipping. The other is winning on sustainability. In the long run, the business model wins.
What This Means for Our Clients
We still build on Claude. The reasoning is the same as it has always been: better instruction following, more consistent outputs, superior coding capabilities through Claude Code, and Anthropic's focus on enterprise reliability and data privacy.
But we now recommend that our clients keep ChatGPT in their toolkit for specific use cases. ChatGPT Search for research. Custom GPTs for quick internal tools. o3 for complex mathematical and scientific analysis where raw reasoning power matters more than speed.
The AI landscape in 2025 is not about picking one winner. It is about understanding what each platform does best and deploying accordingly. OpenAI had a phenomenal product year. Anthropic had a phenomenal business year. Google is quietly building the infrastructure layer. Businesses that thrive with AI will be the ones that learn to use all three intelligently.
Credit to OpenAI. 2025 was your best year. Now show us you can make money doing it.