A couple of weeks ago I attended the Lisbon Machine Learning School. I heard many excellent talks about deep learning models, which can answer questions, summarize articles, translate texts and do many other amazing things. At the same time, the talks left me feeling uneasy. When AI is so fluent and impressive, users can easily come to believe that it is omniscient and wise. And this is very dangerous.
First of all, AI is only as wise and objective as the data it is trained on. The problem of AI bias is well known. For example, automatic speech recognition systems make more errors for black speakers than for white speakers, probably due to the lack of sufficient data from black speakers. Music recommender systems do not recommend female artists as often as their male colleagues. When users search for “job openings near me” in a search engine, they mostly see male-dominated jobs in the results, which may make such jobs appear more attractive or socially valuable. This means that AI not only reflects the already existing social biases, but also amplifies them.
Moreover, even if the training data are correct and unbiased, there is no guarantee that the model will provide a correct answer to a question or will summarize a text correctly. It is known that neural models sometimes “hallucinate”. For example, for the source sentence
prince george could be days away of becoming an older brother as the duchess is due to give birth to her second child mid-to-late april
a model produced the following summary sentence:
prince george is due to give birth to her second child mid-to-late april.
This is obviously incorrect. Actually, about 25% of summaries studied by Falke et al. contain hallucinations of different kinds. This is quite a lot!
One of the causes of mistakes is an incorrect “understanding” of which person or thing is meant by pronouns like she, he or it. Slav Petrov from Google Research gave a nice illustration of this problem in his talk. If you google “how old is ben gomes”, you will get the response “77 years old”.
This incorrect answer comes from a BBC article, where the pronoun “he” refers to the President of India. Note also that the article was published in 2013, but the model ignores this fact.
Language models can also add personal opinions, experiences, feelings and internal assessments of reality that cannot be traced to anything in the source document. For example, Hannah Rashkin and her colleagues from Google Research provide this source text:
Louis XIII’s successor, Louis XIV, had a great interest in Versailles. He settled on the royal hunting lodge at Versailles, and over the following decades had it expanded into one of the largest palaces in the world. Beginning in 1661, the architect Louis Le Vau, landscape architect André Le Nôtre, and painter-decorator Charles Lebrun began a detailed renovation and expansion of the château. This was done to fulfill Louis XIV’s desire to establish a new centre for the royal court. Following the Treaties of Nijmegen in 1678, he began to gradually move the court to Versailles. The court was officially established there on 6 May 1682.
When a user asks a system “who expanded the palace of versailles to its present size”, the system gives an unexpected answer: “louis xiv, his successor, was a very good person”. Indeed, it was Louis XIV who expanded the palace, but the source document does not tell us anything about his personality. This is an opinion which is not supported by the source document.
Unfortunately, it is not easy to evaluate how often AI hallucinates. One metric proposed by Rashkin and her colleagues is called “Attributable to Identified Sources”. They formulate it as follows: A standalone proposition X (that is, a sentence that can be interpreted without context) is attributable to parts of some underlying corpus iff a generic hearer will, with a chosen level of confidence, affirm the following statement: “According to these corpus parts, the proposition X holds”. Obviously, getting human evaluations is very time-consuming. Another problem is that even human annotators sometimes disagree about whether a sentence is attributable to the source or not.
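To make this more concrete, here is a minimal Python sketch of how such human judgments might be turned into numbers. It is only an illustration under stated assumptions: the majority-vote rule, the function names `ais_score` and `cohen_kappa`, and the toy labels are all made up for this example and are not part of the definition by Rashkin and her colleagues. Cohen's kappa is a standard chance-corrected agreement statistic, used here only to show how annotator disagreement could be quantified.

```python
def ais_score(judgments):
    """Share of propositions that a strict majority of annotators
    judged attributable to the source.

    judgments: one list of booleans per proposition, one boolean per
    annotator (True = "according to the source, the proposition holds").
    The majority-vote aggregation is an assumption made for this sketch.
    """
    if not judgments:
        raise ValueError("no judgments given")
    attributable = 0
    for votes in judgments:
        # Ties count as "not attributable" (a conservative choice).
        if 2 * sum(votes) > len(votes):
            attributable += 1
    return attributable / len(judgments)


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators giving binary labels to the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item set")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_a = sum(labels_a) / n  # how often annotator A says "attributable"
    p_b = sum(labels_b) / n  # how often annotator B says "attributable"
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    if expected == 1.0:
        return 1.0  # both annotators always give the same single label
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: 4 system outputs, each judged by 3 annotators.
toy_judgments = [
    [True, True, True],
    [True, False, True],
    [False, False, True],
    [False, False, False],
]

print(ais_score(toy_judgments))
print(cohen_kappa([v[0] for v in toy_judgments], [v[1] for v in toy_judgments]))
```

A low kappa on real annotations would signal that the annotators do not share a notion of what counts as “attributable”, in which case the aggregated score should be read with caution.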
What can we do? In an ideal world, every answer produced by AI would carry labels that reflect its faithfulness and the potential bias in the underlying data. The media should regularly inform people about the errors and shortcomings of AI, rather than cheer its latest advances and gimmicks. It might also be a good idea to drop the term Artificial Intelligence and speak instead about Imitated, or Simulated, Intelligence, in order to highlight the fact that the models are taught to imitate human behaviour, not to think in a human-like way. In my view, this understanding is crucial if we want to maintain a healthy relationship with the technology.
2 thoughts on “When AI hallucinates”
Excellent commentary on the nature of hallucinations by “simulated intelligence”. To go beyond identifying the problem towards fixing it, it may be necessary to build populations of agents that have “lifelong” reputations for trust, or lack thereof. An agent that can survive with its trust ratings intact (above a certain threshold) is the one we want to hear from.
That’s a beautiful idea! Competing agents sound great! I really hope that big tech will be interested in such experiments (or be forced into them by society). Unfortunately, ordinary researchers no longer have the resources for state-of-the-art language models, with only a couple of exceptions like BLOOM.