How Microsoft’s AI chatbot ‘hallucinates’ election information
The upcoming year will be a busy one for democracy. Major elections will take place in the US, the EU, and Taiwan, among others, just as the rollout of generative AI is gathering pace in earnest. Some fear this timing could prove detrimental to the democratic process.
One of the main concerns about generative AI is that it could be used to spread disinformation maliciously, or that models can make up false statements and present them as fact, so-called hallucinations. A study from two European NGOs has found that Microsoft’s chatbot, Bing AI (running on OpenAI’s GPT-4), gave incorrect answers to one third of questions about elections in Germany and Switzerland.
The study was conducted by AlgorithmWatch and AI Forensics. The organisations said they prompted the chatbot with questions relating to candidates, polling and voting information, as well as more open requests for recommendations on who to vote for when concerned with specific subjects, such as the environment.
“Our research shows that malicious actors are not the only source of misinformation; general-purpose chatbots can be just as threatening to the information ecosystem,” commented Salvatore Romano, Senior Researcher at AI Forensics. “Microsoft should acknowledge this, and recognize that flagging the generative AI content made by others is not enough. Their tools, even when implicating trustworthy sources, produce incorrect information at scale.”
Microsoft’s chatbot attributed false information to sources
According to the study, errors included “wrong election dates, outdated candidates, or even invented controversies concerning candidates.” Moreover, the incorrect information was often attributed to a source that had the correct information on the topic. The chatbot also made up “stories about candidates being involved in scandalous behaviour,” again attributing the information to reputable sources.
At times, Microsoft’s AI chatbot evaded answering questions it did not know the answers to. Often, however, rather than decline to respond, it simply made up an answer, including fabricated allegations of corruption.
The samples for the study were collected from 21 August 2023 to 2 October 2023. When the authors informed Microsoft of the results, the tech giant said it would look into fixing the matter. However, one month on, new samples yielded similar results.
Microsoft’s press office was not available for comment ahead of the publication of this article. However, a spokesperson for the company told the Wall Street Journal that Microsoft was “continuing to address issues and prepare our tools to perform to our expectations for the 2024 elections.” The spokesperson also urged users to apply their “best judgement” when reviewing the results from Microsoft’s AI chatbot.
“It’s time we discredit referring to these mistakes as ‘hallucinations’,” said Riccardo Angius, Applied Math Lead and Researcher at AI Forensics. “Our research exposes the much more intricate and structural occurrence of misleading factual errors in general-purpose LLMs and chatbots.”
You can find the study in its entirety here.