By Vincent Chow
DeepSeek’s landmark peer-reviewed article featured in the British journal Nature could encourage other Chinese artificial intelligence companies to submit their work to major scientific publications, according to AI experts.
Hangzhou-based DeepSeek set a “very good example” for future AI model releases from both Chinese and US firms, according to Huan Sun, an associate professor of computer science and engineering at Ohio State University.
“I hope frontier model developers could follow suit and go beyond just releasing ‘technical reports’ or ‘system cards’ with few details,” she said.
DeepSeek’s article in Nature, published last week, revealed details about the risks faced by its AI models for the first time, noting that open-source models were particularly susceptible to being “jailbroken” by malicious actors.
Before its publication in Nature, DeepSeek’s manuscript went through several rounds of feedback from eight reviewers, made up of respected academics and researchers. It was first submitted to the journal on February 14, weeks after the Chinese start-up released its R1 reasoning model to wide industry acclaim.
DeepSeek’s decision to have its work peer-reviewed by a renowned journal reflected the company’s confidence in its AI development, serving as a guidepost for other AI firms on the mainland and overseas.
Leading AI firms, including DeepSeek and ChatGPT creator OpenAI, have typically released technical reports outlining their models’ capabilities and training processes via open-access repositories such as Cornell University’s arXiv.
These reports, however, are not peer reviewed, drawing criticism from AI researchers, particularly those in academia, over the reports’ lack of scientific rigour.
There could be more submissions by Chinese tech firms to world-leading journals in the future, on the back of the standard set by DeepSeek, according to Zhu Xiaohu, a Shanghai-based AI safety researcher and the founder of the Center for Safe AGI.
“I think it will become a template, a prototype, the same way its R1 release had the same effect on the industry,” Zhu said.
In an editorial titled “Bring us your LLMs: why peer review is good for AI models”, Nature’s editorial board praised DeepSeek for having its core claims about the R1 model peer reviewed, pointing out that some of the risks AI models face could be mitigated through that process.
“None of the most widely used large language models (LLMs) that are rapidly upending how humanity is acquiring knowledge has faced independent peer review in a research journal,” the editorial board said. “Peer-reviewed publication aids clarity about how LLMs work and helps to assess whether they do what they purport to do.”
The board singled out the common problem of AI developers “marking their own homework” by choosing benchmarks that show their models in the best light, as well as the lack of information about the risks of new models.
The interactions between expert reviewers and the DeepSeek team, published in full by Nature, showed that the Chinese firm provided details on those two counts.
That included information about how DeepSeek prevented its benchmark scores from being “contaminated” – in which the model simply regurgitates training data – and how its models were evaluated for the risks they may pose.
“All of this is a welcome step towards transparency and reproducibility in an industry in which unverified claims and hype are all too often the norm,” the editorial board said.
DeepSeek’s research team is primarily composed of graduates and PhD students from leading domestic universities. Founder and CEO Liang Wenfeng, listed as one of the authors of the Nature article, graduated from the elite Zhejiang University with a master’s degree in information and communication engineering.
Another author was Yu Wu, head of LLM alignment at DeepSeek, who completed his PhD in computer science at Beihang University in Beijing. He was responsible for corresponding with the expert reviewers.