Amazon Q Fell ‘Significantly’ Behind Rivals on Accuracy in First Year

Amazon’s AI productivity tool, Q Business, struggled with accuracy in its first year, showing “mixed success” and falling “significantly” behind competitors in key features, according to an internal document seen by Business Insider.
The document, from March, said Q Business struggled to process tabular and spreadsheet data, drawing complaints from customers, including Accenture, Intuit, and Smartsheet. It also noted difficulties with longer responses and conversational flow.
“We face challenges with non-text data (embedded tables and spreadsheets), accuracy evaluation methods, and currently lag behind competitors in conversational experience that customers seem to be delighted with,” the document said.
Q Business, Amazon’s flagship AI assistant for corporate users, debuted at AWS’s 2023 re:Invent conference. The product faced early challenges that some employees attributed to a “rushed” launch, and others have since warned it risks losing customers over subpar features, Business Insider previously reported.
The March document reveals that Amazon continued to struggle with accuracy throughout Q Business’s first year, underscoring the challenges of launching a business-focused AI productivity tool. Now, Amazon plans to launch a new agentic AI product called Quick that merges Q Business with other AWS products, Business Insider previously reported.
The document also reflects Amazon’s strong writing culture, which encourages employees to surface concerns and address customer complaints proactively. Indeed, Amazon’s spokesperson told Business Insider that the March document is “outdated” and the accuracy issues have since been fixed in updates to the service.
“Our culture demands that we remain vocally self-critical as we innovate rapidly for customers,” the spokesperson said.
Connectors and ‘incomplete’ responses
AI systems often struggle with accuracy. Reports of incorrect or made-up answers, known as “hallucinations,” are widespread.
For Q Business, a major source of the problems was connectors, the systems that bridge AI tools with outside data sources and applications.
According to the document, Amazon’s Q Business struggled to reliably retrieve the information users requested, leading to incorrect answers. Intuit, for instance, could not access its custom metadata to improve document relevance. Accenture reported problems analyzing architecture diagrams. Asana, meanwhile, retrieved irrelevant documents in searches.
The larger issue, the document noted, was Q Business’s conversational ability. Its dialogue capabilities “significantly” trailed rivals such as Perplexity, which provides richer context and deeper insights. Q Business frequently returned “incomplete” responses because it could not pull longer sections from documents or maintain a consistent memory of the conversation, the document explained.
Another challenge was staffing within the accuracy team, the document added. The team saw at least six product manager changes last year, and the engineering and data teams lacked “adequate resourcing” for accuracy work, it stated.
Formal accuracy program
To address these challenges, Amazon created a formal accuracy program in February, according to the document.
Since then, the company has rolled out a series of updates. In April, Q Business introduced a “hallucination mitigation” feature, followed in July by a response customization tool designed to deliver more consistent communication. In August, the company added an agentic retrieval-augmented generation system designed to produce more accurate and comprehensive answers.
“The result is an improved chat experience and a more capable query answering engine that maximizes the value of your data assets,” Amazon said in a blog post about the launch of the new agentic RAG feature.
An Amazon spokesperson said several Q Business customers, including Nasdaq, Jabil, and Availity, have publicly shared positive feedback.
Nasdaq reported using the tool to quickly build AI applications with simple clicks and data connections, which helped improve its regulatory compliance reviews.
Jabil created an internal “Ask Me How” tool through Q Business that cut downtime by enabling operators to resolve issues on their own without waiting for technicians.
The March document also noted that Q Business achieved 90% accuracy for text-rich data and that it had reduced the time needed to address customer complaints.
Inside AWS, doubts persist about Q’s future, according to some employees. Staff previously told Business Insider the company lacks a strong record with business applications, arguing its expertise lies more in cloud infrastructure than customer-facing software.
Amazon’s other AI offerings, including its Q Developer coding assistant, have lagged rivals in revenue, Business Insider previously reported. The company is now rethinking its sales strategy here with a more grassroots approach.
An AWS spokesperson pushed back on that view, saying it was “not correct” to claim AWS hasn’t found success beyond infrastructure, citing Bedrock, Connect, and SageMaker as examples.
“We are the top leader, or leader of leaders, on all calibrations of measurement in hundreds of third-party evaluations each year, and no one else is even close,” the spokesperson said.