By Albert Mao
Jan 22, 2023
What Are LLM Hallucinations and How to Avoid Them?
This report addresses why large language models (LLMs) hallucinate and generate inaccurate responses, and how to make them do so less often.
When large language models generate output, they sometimes produce seemingly plausible but factually inaccurate or made-up responses known as hallucinations. These hallucinations can happen when an LLM works on any type of task: math and arithmetic, producing citations, or more creative projects such as writing product ads. While LLM hallucinations in imaginative tasks, like script writing, can even be viewed as an advantage, such false output presents a serious problem for tasks that rely on the accuracy of model responses, for example research, clinical diagnosis, consulting, or coding.
There are multiple strategies for combating LLM hallucinations, spanning prompting, fine-tuning, retrieval augmented generation, and custom-designed approaches targeting specific applications, like SynTra. Each of these methods can reduce the probability of AI hallucinations, while their effectiveness depends on the type of hallucination being addressed, the resources available, and other factors.
Below, we go into more detail about what causes large language models to hallucinate, review the various types of hallucinations, and outline strategies that can be used to combat them.
Why Do LLM Hallucinations Exist?
At the moment, there are several theories as to why LLMs hallucinate across various tasks. Some researchers, like Huang et al. (2023), argue that one of the main reasons behind hallucinations is LLMs' current inability to identify inaccuracies in their own output and self-correct without external feedback.
Stemming from this conclusion, these researchers underscore the need for a multi-faceted approach when deploying neural models in real-world applications and for mitigation techniques that address existing limitations. Other studies largely echo the same sentiment, suggesting individual approaches to address different types of hallucinations.
Types and Examples of LLM Hallucinations
There are multiple taxonomies for classifying LLM hallucinations by the type of task in which they occur, the type of errors made by the LLM, and more. As demonstrated by Zhang et al. (2023), most hallucinations that occur when using LLMs for various tasks can be classified as input-conflicting, context-conflicting, or fact-conflicting, as shown in the simplified example below.
Figure 1: Illustration of three types of most frequent LLM hallucinations. Source: Zhang et al. (2023)
According to this classification, input-conflicting hallucinations occur when the output produced by the LLM deviates from the input given in the prompt, typically because the model misinterprets user intent.
In context-conflicting hallucinations, the LLM delivers output that contradicts its own previous responses. This type of hallucination occurs when the LLM is tasked with generating multi-turn or lengthy responses and "forgets" the context or can no longer maintain consistency throughout the dialog, for example due to the limits of its context length.
In fact-conflicting hallucinations, the LLM produces responses that contradict established knowledge. At present, this type of hallucination poses the most serious challenge for researchers and for the practical use of AI in real-world applications.
Strategies to Limit LLM Hallucinations
Current approaches to limiting LLM hallucinations revolve around inducing human-like reasoning in neural networks through various methods, including prompting, data augmentation, and self-correcting algorithms.
Using humans or LLMs to detect and correct hallucinations
One of the most direct and straightforward methods of correcting hallucinations is having human users do it manually. Another strategy is to use a second LLM as an evaluator of the first LLM's responses. While both methods can work in certain settings, they are generally considered slow, expensive, and error-prone.
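The second-LLM strategy amounts to prompting an evaluator model to fact-check a candidate answer against a trusted source. A minimal sketch of assembling such an evaluation prompt is below; the function name, verdict labels, and the `call_llm` placeholder are our own illustration, not a standard API:

```python
def build_judge_prompt(question: str, answer: str, reference: str) -> str:
    """Assemble a prompt asking an evaluator LLM to fact-check an answer
    against a trusted reference text."""
    return (
        "You are a strict fact-checker. Given a question, a candidate answer, "
        "and a trusted reference, reply with VERDICT: SUPPORTED or "
        "VERDICT: HALLUCINATED, followed by a one-sentence justification.\n\n"
        f"Question: {question}\n"
        f"Candidate answer: {answer}\n"
        f"Trusted reference: {reference}\n"
    )

prompt = build_judge_prompt(
    question="When was the company founded?",
    answer="The company was founded in 1999.",
    reference="Founded in 2004, the company started as a small startup.",
)
# The assembled prompt is then sent to the evaluator model, e.g.:
# verdict = call_llm(prompt)  # call_llm is a placeholder for your model API
```

Note that the evaluator is itself an LLM and can also hallucinate, which is one reason this approach remains error-prone.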
Prompt engineering for minimizing hallucinations
Prompt engineering can minimize LLM hallucinations by inducing human-like reasoning in neural networks. As demonstrated by Touvron et al. (2023) in the paper Llama 2: Open Foundation and Fine-Tuned Chat Models, prompting an LLM with "If you don't know the answer to the question, please don't share false information" can prevent the model from providing inaccurate answers and reduce the number of fact-conflicting hallucinations.
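In practice, this instruction is simply appended to whatever system message you already use. A minimal sketch (the helper name is our own):

```python
# Abstention instruction quoted from Touvron et al. (2023), Llama 2.
ABSTAIN_INSTRUCTION = (
    "If you don't know the answer to the question, "
    "please don't share false information."
)

def with_abstention(system_message: str) -> str:
    """Append the abstention instruction to an existing system message."""
    return f"{system_message.rstrip()} {ABSTAIN_INSTRUCTION}"
```

For example, `with_abstention("You are a helpful assistant.")` yields a system message that both sets the assistant persona and discourages fabricated answers.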
Meanwhile, more complex prompting techniques such as Chain-of-Thought or Tree-of-Thought can further increase the accuracy of LLM responses by guiding the model to think step by step and by providing examples of sound reasoning for the model to follow.
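A few-shot Chain-of-Thought prompt can be assembled by pairing each example question with a worked, step-by-step answer and then appending the new question. A minimal sketch, assuming a plain-text prompt format (the function name and phrasing are our own illustration):

```python
def build_cot_prompt(question: str, examples: list[tuple[str, str]]) -> str:
    """Build a few-shot Chain-of-Thought prompt: each example pairs a
    question with a worked, step-by-step answer for the model to imitate,
    and the final question is left open for the model to complete."""
    parts = []
    for q, reasoning in examples:
        parts.append(f"Q: {q}\nA: Let's think step by step. {reasoning}")
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)

prompt = build_cot_prompt(
    "What is 12 * 7?",
    examples=[("What is 3 * 4?", "3 * 4 = 12. The answer is 12.")],
)
```

The trailing "Let's think step by step." nudges the model to produce its own reasoning chain before committing to an answer.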
Fine-tuning on task-specific data
Fine-tuning can be one of the most effective approaches to reducing hallucinations when there is enough training data to support a supervised learning process for the large language model. As demonstrated by Wei et al., fine-tuning increases the accuracy of LLM responses compared to prompt engineering across a mixture of different tasks.
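Supervised fine-tuning starts from a dataset of prompt/completion pairs, commonly serialized as JSON Lines. The exact schema varies by provider, so treat the field names below as an illustrative assumption and check your platform's documentation:

```python
import json

def to_finetune_jsonl(pairs: list[tuple[str, str]]) -> str:
    """Serialize (prompt, completion) pairs as JSON Lines, one training
    example per line -- a common input format for supervised fine-tuning.
    The exact field names depend on the fine-tuning platform."""
    return "\n".join(
        json.dumps({"prompt": p, "completion": c}) for p, c in pairs
    )

data = to_finetune_jsonl([
    ("What is the capital of France?", "Paris."),
    ("Who wrote Hamlet?", "William Shakespeare."),
])
```

Because every target completion is curated, the model is repeatedly shown what a grounded answer looks like, which is what drives the accuracy gains over prompting alone.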
Retrieval augmented generation
Combining an LLM's internal knowledge with a retrieval system that finds relevant external data allows the model to improve the relevance of its responses.
In particular, Retrieval Augmented Generation (RAG) helps minimize LLM hallucinations caused by erroneous entities in the model's knowledge base, for example information about names, dates, locations, and other facts. In addition, RAG helps eliminate so-called outdatedness hallucinations, which result from the LLM relying on non-current data.
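The core RAG loop is: retrieve the most relevant document for a query, then instruct the model to answer only from that retrieved context. The toy sketch below uses word overlap as the retriever purely for illustration; production systems use vector embeddings and a vector database instead:

```python
def retrieve(query: str, documents: list[str]) -> str:
    """Toy retriever: return the document sharing the most words with the
    query. Real RAG systems use embedding similarity, not word overlap."""
    q_words = set(query.lower().split())
    return max(documents, key=lambda d: len(q_words & set(d.lower().split())))

def build_rag_prompt(query: str, documents: list[str]) -> str:
    """Ground the model's answer in the retrieved context to reduce
    fact-conflicting and outdatedness hallucinations."""
    context = retrieve(query, documents)
    return (
        "Answer using only the context below. If the context does not "
        "contain the answer, say you don't know.\n\n"
        f"Context: {context}\n\nQuestion: {query}"
    )

docs = [
    "Acme Corp was founded in 2004 in Austin.",
    "The quarterly report covers revenue and expenses.",
]
prompt = build_rag_prompt("When was Acme Corp founded?", docs)
```

Because the prompt both supplies current, verified context and explicitly permits "I don't know", the model is steered away from filling gaps with fabricated or outdated facts.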
Custom methods for specific tasks
At the moment, various teams are working on custom methods to reduce LLM hallucinations in different domains, including translation, Q&A, dialog, and mathematical or symbolic reasoning, utilizing the approaches best suited to the task at hand.
For example, in Jones et al., Teaching Language Models to Hallucinate Less with Synthetic Tasks, the researchers propose a method to minimize LLM hallucinations in abstractive summarization tasks, such as answering questions based on provided documentation, preparing meeting summaries, and generating clinical reports.
The method, referred to as SynTra, uses synthetic tasks to deliberately elicit and measure hallucinations, then optimizes the LLM's system message and transfers that message to practical assignments.
Reduce AI Hallucinations Faster with VectorShift
Minimizing LLM hallucinations requires a multi-faceted approach and can involve various techniques, from prompt engineering and fine-tuning to custom-designed methods addressing specific tasks. Meanwhile, building a framework for reducing AI hallucinations in practical applications can be complicated and can require extensive experimentation and testing.
Introducing AI into your applications can be a much more streamlined process with functionality provided by VectorShift, offering SDK interfaces and no-code capabilities. For more information on how to leverage AI technology and reduce LLM hallucinations, please don't hesitate to get in touch with our team or request a free demo.