Manus Just Launched. Here Is Our Take.
Industry Insights | March 12, 2025 · 11 min read


Manus launched March 5 and the internet lost its mind. We got access and ran it through real business tasks for a week. The autonomy is genuinely impressive. The reliability is not there yet. Here is our honest scorecard and why we are watching but not recommending for production.


OneWave AI Team

AI Consulting

The Most Hyped AI Launch of 2025. Does It Deliver?

Manus launched on March 5, 2025, and the internet lost its mind. Twitter was flooded with demo videos. Jack Dorsey praised it. Hugging Face's product lead called it impressive. The waitlist exploded. By the time we got access a few days later, we had already seen enough breathless takes to fill a novel.

Here is ours: Manus is genuinely interesting, occasionally impressive, and not ready for production work. That is not a dismissal. It is an honest assessment from a team that builds AI agents for a living and understands exactly what "autonomous" means in practice versus in a demo video.

We spent a week putting Manus through real business tasks -- the same tasks we use to evaluate every AI tool that crosses our desk. Research projects, data analysis, content creation, multi-step workflows. Here is what we found.

Autonomy in a demo is easy. Autonomy in production, with real data, real edge cases, and real consequences for failure -- that is the bar Manus has not cleared yet.
[Image: artificial intelligence technology visualization]

What Manus Is

Manus bills itself as a "general AI agent" -- a system that can plan and execute multi-step digital tasks autonomously. You give it a goal in natural language, and it figures out the steps, executes them, and delivers a result. It was developed by Butterfly Effect (also known as Monica.im), a Wuhan-based startup legally registered in Singapore.

Under the hood, Manus uses multiple AI models -- including Anthropic's Claude 3.5 Sonnet and fine-tuned versions of Alibaba's open-source Qwen -- orchestrated through independently operating sub-agents. It can browse the web, write and execute code, create documents, manage files, and interact with web applications. The GAIA benchmark scores are strong: 86.5% on Level 1 tasks, 70.1% on Level 2, and 57.7% on Level 3, outperforming OpenAI's Deep Research on the standardized evaluation.

The concept is the one we have been describing in our work on what AI agents are and why they matter: a system that does not just answer questions but actually performs tasks. The difference between a chatbot and an agent, which we covered in our chatbot vs. agent breakdown, is exactly the gap Manus is trying to collapse.

What Worked (About 60% of the Time)

Research and synthesis. We asked Manus to research the competitive landscape for a client in the logistics industry and produce a structured report. It browsed multiple sources, identified key players, pulled recent news, and delivered a formatted document. The quality was comparable to what a junior analyst would produce in two to three hours. It did this in about eight minutes.

Data collection and formatting. Manus handled structured data tasks reasonably well. It collected pricing information from multiple vendor websites, organized it into a comparison table, and added notes on feature differences. The output needed editing, but the raw work was solid.

The planning phase is genuinely impressive. When you give Manus a complex task, it breaks it into a visible step-by-step plan before executing. You can watch it think through the approach, which builds confidence and lets you intervene early if the plan is misguided. This transparency is better than what most AI agents offer.

What Did Not Work

Reliability is the core problem. As independent reviewers have noted, Manus structures tasks well but stumbles in execution. It hallucinated data points in our competitive analysis. It created a comparison table with a vendor that does not exist. When we ran the same task three times, we got meaningfully different results each time. For production business work, this inconsistency is a dealbreaker.

Data handling raises questions. The company is registered in Singapore but developed in Wuhan, China. The terms of service and data processing agreements are not as transparent as what we see from Anthropic or even OpenAI. For clients who handle sensitive data -- which is most of our clients -- this creates a privacy and compliance concern we cannot ignore.

Access is severely limited. As of March 2025, fewer than 1% of users on the waitlist have received access. You cannot reliably plan business workflows around a tool you might not be able to use tomorrow. This alone disqualifies it from any production recommendation we would make to a client.

No ecosystem. Manus is a standalone tool. There are no APIs for integration, no plugin ecosystem, no way to embed it into existing business workflows. Compare this to Claude, which has an API, MCP servers for tool integration, Skills for custom workflows, and Claude Code for development. The ecosystem gap is massive.

Our Scorecard

MANUS SCORECARD -- March 2025
========================================

Autonomy            [||||||||--]  8/10
  Genuinely plans and executes multi-step tasks.
  The planning phase is best-in-class.

Reliability         [||||||----]  6/10
  Works ~60% of the time. Hallucinations
  and inconsistent outputs are common.

Data Privacy        [||||------]  4/10
  Opaque data handling. China-based dev team,
  Singapore registration. No SOC 2 or clear
  data processing guarantees.

Ecosystem           [|||-------]  3/10
  No API. No integrations. No plugin system.
  Completely standalone.

Production-Ready    [||||------]  4/10
  Impressive demos, unreliable execution.
  Not ready for work where errors cost money.

OVERALL             [|||||-----]  5/10
[Image: robot hand reaching forward, representing AI agents]

How It Compares to What We Use

We build AI agents using Claude Code and the Anthropic API. Our agents are purpose-built for specific workflows, integrated into client systems via MCP servers, and tested extensively before deployment. They are not general-purpose autonomous systems. They are targeted, reliable tools that do one thing well.

Manus takes the opposite approach: one general system that tries to do everything. There is philosophical appeal to this. But in practice, the generalist approach trades reliability for flexibility, and our clients cannot afford unreliability.

Early independent analyses from the consulting community generally agree: Manus is a research curiosity, not a production tool. The potential is clear. The execution is not there yet.

We do not recommend tools based on potential. We recommend tools based on what they can do reliably, today, with real client data. By that standard, Manus is a watch-list item, not a deployment recommendation.

Our Position

We are watching Manus closely. The autonomous agent space is exactly where AI is heading, and any tool that pushes the boundary of what agents can do independently deserves attention. But we are not recommending it for production use, and we will not until three things change:

First, reliability needs to cross the 90% threshold consistently. Sixty percent is interesting for a demo. It is unacceptable for client work.
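The gap between 60% and 90% is wider than it looks once tasks chain together. As a rough illustration (treating the article's per-task figures as per-step success rates, with hypothetical step counts, and assuming steps fail independently):

```python
# Illustrative only: how per-step reliability compounds across a workflow.
# The 60% and 90% rates come from the discussion above; the step counts
# and the independence assumption are simplifications for illustration.

def workflow_success(per_step_rate: float, steps: int) -> float:
    """Probability that every step of an n-step workflow succeeds,
    assuming steps fail independently."""
    return per_step_rate ** steps

# A 5-step workflow at 60% per-step reliability almost always breaks:
print(f"{workflow_success(0.60, 5):.1%}")  # ~7.8%

# Even 90% per step means roughly half of 7-step workflows fail:
print(f"{workflow_success(0.90, 7):.1%}")  # ~47.8%
```

This is why "works most of the time" is a demo property, not a production one: multi-step autonomy multiplies every per-step failure rate.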

Second, data handling needs to be transparent, auditable, and compliant with standards our clients require. SOC 2, clear data processing agreements, no ambiguity about where data goes.

Third, there needs to be an ecosystem -- API access, integrations, a way to embed Manus capabilities into existing workflows rather than using it as an isolated tool.

Until then, we will keep building targeted, reliable agents with Claude and keep an eye on Manus for the day it closes the gap. The autonomous future is coming. It is just not here yet.

Tags: Manus AI · Manus AI review · autonomous AI agent · Manus launch 2025 · AI agent comparison · Manus vs Claude · OneWave AI

Need help implementing AI?

OneWave AI helps small and mid-sized businesses adopt AI with practical, results-driven consulting. Talk to our team.

Get in Touch