AI Chatbots Get Nearly Half of News Content Wrong, Study Finds
According to a new study, four of the most popular AI assistants misrepresent news content in nearly 50% of their responses, irrespective of language or region.
The research, coordinated by the European Broadcasting Union (EBU) and involving 22 public service media organizations, including the BBC (UK), NPR (US), and DW (Germany), examined 3,000 answers from OpenAI’s ChatGPT, Microsoft’s Copilot, Google’s Gemini, and Perplexity AI.
The findings were troubling: 45% of the AI-generated responses had at least one major issue, 31% had serious sourcing problems, and 20% contained factual inaccuracies. DW’s own tests found that 53% of responses to its questions had significant flaws, including basic factual errors.
The study measured each AI’s ability to provide context, cite sources, editorialize responsibly, and distinguish fact from opinion.
“These failings are not isolated incidents. They are systemic, cross-border, and multilingual. When people don’t know what to trust, they end up trusting nothing, and that threatens democratic participation,” said Jean Philip De Tender, deputy director general of the EBU.
According to the Reuters Institute’s Digital News Report 2025, around 7% of online news consumers now use AI chatbots to get their news, a number that rises to 15% among those under 25.
The latest results build on a BBC-led investigation from February 2025, which found more than half of AI-generated news summaries contained significant issues. While the new research shows slight improvements, the overall accuracy rate remains low.
Among the four tested systems, Google’s Gemini performed the worst, with 72% of its responses showing sourcing problems. Microsoft’s Copilot and Gemini were also flagged as the poorest performers in the earlier BBC study.
BBC’s program director of generative AI, Peter Archer, said, “We’re excited about AI and how it can help us bring more value to audiences. But people must be able to trust what they read, watch and see. Despite some improvements, there are still significant issues.”
The EBU and participating media outlets have urged governments and AI firms to act. They are calling for regulators to enforce existing laws on information integrity and media pluralism, and for independent monitoring of AI-generated content.
To push for stronger accountability, the EBU has launched a joint campaign with several international broadcasters, “Facts In: Facts Out”, demanding that AI systems treat journalistic content responsibly.
“When these systems distort or decontextualize trusted news, they undermine public trust. If facts go in, facts must come out,” the campaign organizers said.