所有人都在谈论的AI新秀——Manus,实测如何?


Everyone in AI is talking about Manus. We put it to the test.

(Image: MIT Technology Review)

Since the general AI agent Manus was launched last week, it has spread online like wildfire. And not just in China, where it was developed by the Wuhan-based startup Butterfly Effect. It’s made its way into the global conversation, with influential voices in tech, including Twitter cofounder Jack Dorsey and Hugging Face product lead Victor Mustar, praising its performance. Some have even dubbed it “the second DeepSeek,” comparing it to the earlier AI model that took the industry by surprise for its unexpected capabilities as well as its origin.

自通用智能体Manus上周发布以来,它便以燎原之势席卷网络。不仅在其研发地中国(由武汉的初创公司“蝴蝶效应”开发),它还引发了全球科技界的关注。推特(Twitter)联合创始人杰克·多西(Jack Dorsey)和抱脸网(Hugging Face)产品负责人维克多·穆斯塔(Victor Mustar)等科技界知名人士都对它的表现赞不绝口。有人甚至将Manus称为“第二个DeepSeek”,将其与这一早前发布的人工智能模型相提并论,后者因其出人意料的能力和公司背景而令业界大吃一惊。
Manus claims to be the world’s first general AI agent, leveraging multiple AI models (such as Anthropic’s Claude 3.5 Sonnet and fine-tuned versions of Alibaba’s open-source Qwen) and various independently operating agents to act autonomously on a wide range of tasks. (This makes it different from AI chatbots, including DeepSeek, which are based on a single large language model family and are primarily designed for conversational interactions.)
Manus自称是全球首个通用智能体,整合了多个AI模型(如Anthropic的Claude 3.5 Sonnet、阿里巴巴开源模型Qwen的微调版本)及许多能独立自主处理多任务的操作体。(与基于单一语言模型的聊天机器人如DeepSeek不同,后者主要用于对话交互)。
Despite all the hype, very few people have had a chance to use it. Currently, under 1% of the users on the wait list have received an invite code. (It’s unclear how many people are on this list, but for a sense of how much interest there is, Manus’s Discord channel has more than 186,000 members.)
尽管备受热议,但真正体验过Manus的用户却寥寥无几。目前,等待名单中仅有不到1%的人获得了邀请码(具体人数未公开,但Manus的Discord频道已超18.6万会员,可见其热度)。
MIT Technology Review was able to obtain access to Manus, and when I gave it a test-drive, I found that using it feels like collaborating with a highly intelligent and efficient intern: While it occasionally lacks understanding of what it’s being asked to do, makes incorrect assumptions, or cuts corners to expedite tasks, it explains its reasoning clearly, is remarkably adaptable, and can improve substantially when provided with detailed instructions or feedback. Ultimately, it’s promising but not perfect.
《麻省理工科技评论》(MIT Technology Review)获得了测试资格。运行测试时,研究员感觉像是在和一位高智商高效率实习生合作:偶尔会误解指令、做出错误假设或为赶工偷懒,但能清晰解释逻辑、快速适应需求,并在收到详细反馈后显著改进。总体而言,它潜力十足但尚未完美。
To put it to the test, I gave Manus two assignments: (1) compile a list of notable reporters covering China tech, (2) search for two-bedroom property listings in New York City.
为验证其实力,研究员为Manus布置了两项任务:(1) 整理中国科技领域知名记者名单,(2) 搜索纽约市两居室房源。

Here’s how it did:
具体如下:

01 任务1

Task 1: The first list of reporters that Manus gave me contained only five names, with five “honorable mentions” below them. I noticed that it listed some journalists’ notable work but didn’t do this for others. I asked Manus why. The reason it offered was hilariously simple: It got lazy. It was “partly due to time constraints as I tried to expedite the research process,” the agent told me. When I insisted on consistency and thoroughness, Manus responded with a comprehensive list of 30 journalists, noting their current outlet and listing notable work. (I was glad to see I made the cut, along with many of my beloved peers.)

任务1: Manus第一次给出的名单列了5位记者,和5位“优秀记者”。研究员注意到它列出了部分记者的著名作品,但其他记者的代表作信息缺失。询问原因时,它的回答令人哭笑不得:因为懒。智能体答道:“由于时间限制,所以我加快了调研流程。”当研究员坚持要求一致性和完整性后,它迅速提交了一份包含30名记者的详细清单,标注了所属媒体及代表作。(研究员本人及多位同行荣幸上榜)
I was impressed that I was able to make top-level suggestions for changes, much as someone would with a real-life intern or assistant, and that it responded appropriately. And while it initially overlooked changes in some journalists’ employer status, when I asked it to revisit some results, it quickly corrected them. Another nice feature: The output was downloadable as a Word or Excel file, making it easy to edit or share with others.

Manus对于高要求的修改回应,就像实习生或助理一样恰当。当研究员提出要补充部分记者最新职务的要求,Manus也迅速修正了。另一个特点是,Manus输出结果可直接下载为Word或Excel文件,便于编辑或分享。
Manus hit a snag, though, when accessing journalists’ news articles behind paywalls; it frequently encountered captcha blocks. Since I was able to follow along step by step, I could easily take over to complete these, though many media sites still blocked the tool, citing suspicious activity. I see potential for major improvements here—and it would be useful if a future version of Manus could proactively ask for help when it encounters these sorts of restrictions.

但在处理付费墙文章时,Manus频繁遭遇验证码拦截,因为许多媒体站点会将它列为可疑工具。好在研究员是逐步对它提出的要求,就算这一步遇到障碍,研究员很容易就能继续接手完成后面的工作。未来若Manus再遇到类似问题,能主动寻求帮助,性能的大幅提升是非常有可能的。
02 任务2

Task 2: For the apartment search, I gave Manus a complex set of criteria, including a budget and several parameters: a spacious kitchen, outdoor space, access to downtown Manhattan, and a major train station within a seven-minute walk. Manus initially interpreted vague requirements like “some kind of outdoor space” too literally, completely excluding properties without a private terrace or balcony access. However, after more guidance and clarification, it was able to compile a broader and more helpful list, giving recommendations in tiers and neat bullet points.

任务2:在搜索房源时,研究员提出了一套复杂要求,包括预算和几个参数:宽敞厨房、户外空间、临近曼哈顿市中心且7分钟步行可达地铁站。Manus起初将“一些户外空间”机械理解为“私人露台或阳台”,完全排除了无此类设施的房源。但是,经细化指令后,它能够整理出一份更宽泛、更有用的清单,分层并以整齐的点句符号列出建议。
The final output felt straight from Wirecutter, containing subtitles like “best overall,” “best value,” and “luxury option.” This task (including the back-and-forth) took less than half an hour—a lot less time than compiling the list of journalists (which took a little over an hour), likely because property listings are more openly available and well-structured online.

最终结果是排版简洁的Wirecutter风格,包括“最佳综合”“最佳性价比”“豪华选项”等类别。此项任务耗时不到半小时(包括细化指令的时间),远少于记者名单的一小时——或因房源数据更公开更系统。
Still, it’s not all smooth sailing. Manus can suffer from frequent crashes and system instability, and it may struggle when asked to process large chunks of text. The message “Due to the current high service load, tasks cannot be created. Please try again in a few minutes” flashed on my screen a few times when I tried to start new requests, and occasionally Manus’s Computer froze on a certain page for a long period of time.

不过,一切也并非一帆风顺。Manus目前存在明显短板:频繁崩溃、系统卡顿、处理长文本时也易出错。当研究员尝试开始新的请求时,屏幕上闪过几次 “当前服务繁忙,请稍后重试”的字样,偶尔电脑界面也会长时间冻结在某个Manus页面上。

结论

It has a higher failure rate than ChatGPT DeepResearch—a problem the team is addressing, according to Manus’s chief scientist, Peak Ji. That said, the Chinese media outlet 36Kr reports that Manus’s per-task cost is about $2, which is just one-tenth of DeepResearch’s cost. If the Manus team strengthens its server infrastructure, I can see the tool becoming a preferred choice for individual users, particularly white-collar professionals, independent developers, and small teams.

其首席科学家季逸超(Peak Ji)坦言,Manus故障率高于ChatGPT DeepResearch,团队正着手解决。但据中国媒体36氪(36Kr)报道,Manus的单任务成本仅2美元,为DeepResearch的十分之一。若优化服务器,它或将成为白领、独立开发者及小团队的高性价比选择。
Finally, I think it’s really valuable that Manus’s working process feels relatively transparent and collaborative. It actively asks questions along the way and retains key instructions as “knowledge” in its memory for future use, allowing for an easily customizable agentic experience. It’s also really nice that each session is replayable and shareable.

最后,研究员认为Manus的工作流程相对透明、协作性强,这一点非常宝贵。它会主动提问、记录关键指令作为“知识”供后续调用。这种协作感使其更像一位可定制的智能伙伴。每个环节都可以回顾和分享,这一点也非常棒。
I expect I will keep using Manus for all sorts of tasks, in both my personal and professional lives. While I’m not sure the comparisons to DeepSeek are quite right, it serves as further evidence that Chinese AI companies are not just following in the footsteps of their Western counterparts. Rather than just innovating on base models, they are actively shaping the adoption of autonomous AI agents in their own way.

研究员希望能在生活和工作中继续使用 Manus 应对各种任务。尽管“第二个DeepSeek”的比喻未必准确,但Manus再次证明,中国AI公司并非单纯追随西方技术路径。他们正以独特方式推动自主智能体的落地——不局限于底层模型创新,更聚焦于实际应用场景的深耕。

Original article: https://www.technologyreview.com/2025/03/11/1113133/manus-ai-review/

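Note: This article is taken from the MIT Technology Review website and is shared for study and discussion only. If it infringes any rights, please contact the editor to have it removed.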
