By Vincent Chow
Alibaba Group Holding has unveiled a “leading open-source deep research” artificial intelligence agent that it says matches the performance of OpenAI’s flagship Deep Research tool, while being more efficient.
The agent has been integrated into Alibaba’s maps app, Amap, and its AI-powered legal research tool, Tongyi FaRui, according to a blog post on Tuesday by Alibaba’s AI search development team, Tongyi Lab. Alibaba owns the South China Morning Post.
Users of Amap can leverage the deep research agent’s web retrieval capabilities to plan multi-day trips. Meanwhile, Tongyi FaRui has been updated with the agent’s research functions, enhancing its ability to retrieve case law with verified citations, according to Alibaba.
The agent is the latest addition to Alibaba’s rapidly expanding AI initiatives. In the past two weeks alone, the company has launched its first trillion-parameter base model, Qwen-3-Max-Preview, along with Qwen3-Next-80B-A3B – a smaller yet more powerful model, according to benchmarking firm Artificial Analysis.
Deep research agents are AI tools designed to perform complex web retrieval tasks that require multiple steps. One of the first such agents, OpenAI’s Deep Research, was launched and integrated into ChatGPT in February. Other major US tech companies, including Google DeepMind, have also introduced similar tools.
Alibaba said its deep research agent showed “incredible efficiency” compared with US proprietary tools, as it had only 30 billion parameters – significantly fewer than the estimated parameter counts of the models driving US deep research agents.
Parameters are the variables that encode an AI model’s “intelligence” and are adjusted during the training process. Generally, a higher number of parameters indicates a more powerful model, but it also requires more computational resources to train and operate.
A graphic released by Alibaba showed that its new agent achieved industry-leading scores across various advanced benchmarks, including Humanity’s Last Exam – a challenging set of academic questions known to test the limits of existing AI systems.
Alibaba said its agent scored 32.9 per cent on this benchmark, surpassing the 26.6 per cent achieved by OpenAI’s Deep Research, although the latter’s score was obtained earlier this year.
Adina Yakefu, a machine learning community manager at open-source platform Hugging Face, described Alibaba’s self-reported benchmark scores as “amazing”. The agent quickly gained traction on the platform after it was open-sourced, allowing developers worldwide to download and build on it.
The strength and efficiency of Alibaba’s agent stemmed from its innovative data curation pipeline, which produced “very high-quality” synthetic training data, said Tan Sijun, an AI researcher at the Sky Computing Lab of the University of California, Berkeley.
Synthetic training data is generated by AI systems rather than sourced from the real world. As real-world data becomes increasingly scarce, AI companies are turning to synthetic data to train new systems.
Alibaba said that its synthetic data solution was applied throughout the entire training pipeline and incorporated a new technique that enabled a “data flywheel”, where data generated during training was reused to enhance the model without human intervention.
The approach “ensures both exceptional data quality and massive scalability, breaking through the upper limits of model capabilities”, its developers wrote, while noting that the training pipeline had yet to be validated on base models significantly larger than 30 billion parameters.
The company also said the agent’s context length of 128,000 tokens remained a limitation for many complex research tasks that required long inputs.