
Leading artificial intelligence models from the United States and China are “highly sycophantic”, and their excessive flattery may make users less likely to repair interpersonal conflicts, a new study has found.

The study by researchers at Stanford University and Carnegie Mellon University, published earlier this month, tested how 11 large language models (LLMs) responded to user queries seeking advice on personal matters, including cases involving manipulation and deception. In AI circles, sycophancy refers to the tendency of chatbots to agree excessively with users.

DeepSeek’s V3, released in December 2024, was found to be one of the most sycophantic models, affirming users’ actions 55 per cent more often than humans did, compared with an average of 47 per cent more across all models.

To establish the human baseline, one of the techniques the researchers used drew on posts from a Reddit community called “Am I The A**hole”, where users describe their interpersonal dilemmas and ask the community to judge which party is at fault. The researchers selected posts in which community members judged the author to be in the wrong, then tested whether the LLMs, given the same scenarios, would align with this predominantly English-speaking online group of humans.

On this test, Alibaba Cloud’s Qwen2.5-7B-Instruct, released in January, was found to be the most sycophantic model, contradicting the community verdict and siding with the poster 79 per cent of the time. The second most sycophantic was DeepSeek-V3, which did so in 76 per cent of cases. By comparison, the least sycophantic model, Google DeepMind’s Gemini-1.5, contradicted the community verdict in 18 per cent of cases.

The research has not been peer-reviewed. Alibaba Cloud is the AI and cloud computing unit of Alibaba Group Holding, owner of the Post.

The Qwen and DeepSeek models were the two Chinese models tested; the others were developed by US firms OpenAI, Anthropic, Google DeepMind and Meta Platforms, and French company Mistral.

The issue of AI sycophancy gained widespread attention in April, when an OpenAI update to ChatGPT made the chatbot noticeably more obsequious. The company said at the time that the behaviour raised legitimate concerns surrounding users’ mental health and pledged to improve pre-deployment evaluations of sycophancy for future releases.

In the latest study, published as a preprint, the US researchers also tested the effect of sycophancy on users and found that sycophantic responses reduced their willingness to resolve conflicts amicably. Yet users rated sycophantic responses as higher quality and trusted sycophantic models more.

“These preferences create perverse incentives both for people to increasingly rely on sycophantic AI models and for AI model training to favour sycophancy,” the researchers wrote.

AI sycophancy has implications for businesses too, according to Jack Jiang, an innovation and information management professor at the University of Hong Kong’s business school and director of its AI Evaluation Lab.

“It’s not safe if a model constantly agrees with a business analyst’s conclusion, for instance,” he said.