Researchers from MIT and Microsoft Introduce DoLa: A Novel AI Decoding Strategy Aimed at Reducing Hallucinations in LLMs

Numerous natural language processing (NLP) applications have benefited greatly from large language models (LLMs). While scaling has improved LLMs' performance and unlocked additional capabilities, they still "hallucinate," producing text that is inconsistent with the real-world facts seen during pre-training. This remains a significant barrier to adoption in high-stakes applications, such as clinical and legal settings, where the generation of trustworthy text is essential.

One possible culprit, though far from certain, is the maximum likelihood language modeling objective, which minimizes the forward KL divergence between the data distribution and the model distribution. A model trained toward this objective can assign non-zero probability to sentences that are not fully consistent with the knowledge encoded in its training data.
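Concretely (in standard notation, not drawn from the article), maximum likelihood training can be written as:

```latex
% Maximum likelihood estimation is equivalent to minimizing the forward KL
% divergence from the data distribution p_data to the model distribution p_theta.
\theta^{*}
  = \arg\max_{\theta} \; \mathbb{E}_{x \sim p_{\text{data}}}\!\left[ \log p_{\theta}(x) \right]
  = \arg\min_{\theta} \; D_{\mathrm{KL}}\!\left( p_{\text{data}} \,\|\, p_{\theta} \right)
```

Minimizing KL in this "forward" direction penalizes the model for missing probability mass on observed data, but not for spreading probability mass onto strings the data never supports, which is why it is sometimes suspected of encouraging hallucination.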

From a model-interpretability perspective, studies have shown that the earlier layers of transformer LMs encode "lower-level" information (such as part-of-speech tags), while the later layers encode more "semantic" information.
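This layered behavior can be inspected directly by "early exiting": applying the model's output head to an intermediate layer's hidden state and seeing what it would predict. The sketch below is only an illustration, assuming a Hugging Face GPT-2 checkpoint (the paper itself works with LLaMA-family models), and is not code from the paper.

```python
# Illustration only: compare "early exit" next-token predictions at different layers.
# Assumes the Hugging Face `transformers` library with a GPT-2 checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# Tuple of hidden states: the embedding output plus one tensor per transformer block.
hidden_states = outputs.hidden_states

def early_exit_logits(layer_idx: int) -> torch.Tensor:
    """Project the hidden state at `layer_idx` through the final norm and LM head."""
    h = hidden_states[layer_idx][:, -1, :]    # hidden state at the last position
    h = model.transformer.ln_f(h)             # GPT-2's final layer norm
    return model.lm_head(h)                   # logits over the vocabulary

for layer in (4, 8, len(hidden_states) - 1):  # a shallow, a middle, and the final layer
    top_ids = early_exit_logits(layer).topk(3).indices[0]
    print(f"layer {layer:2d}: {tokenizer.decode(top_ids.tolist())!r}")
```

In a run like this, shallower layers tend to propose generic or syntactic continuations, while the final layer is more likely to surface the factual token.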

A group of researchers at MIT and Microsoft propose exploiting this layered encoding of knowledge to surface the LM's factual knowledge via a contrastive decoding strategy, in which the probability of the next token is computed from the difference between the logits obtained from a higher (later) layer and those from a lower (earlier) layer. By prioritizing information from deeper layers and downplaying that from intermediate or shallower ones, the LM can be made more grounded in fact, cutting down on hallucinations.
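Schematically, this amounts to scoring each candidate token by the difference of log-probabilities between a "mature" (final) layer and a "premature" (earlier) layer. The snippet below is a minimal sketch of that scoring step only; it reuses the hypothetical `early_exit_logits` helper from the previous sketch and omits the paper's refinements, such as how the contrasting premature layer is selected at each step.

```python
# Sketch of the layer-contrast scoring idea, building on the snippet above.
# This is NOT the paper's full DoLa algorithm, only the core contrast.
import torch.nn.functional as F

def layer_contrast_scores(mature_layer: int, premature_layer: int) -> torch.Tensor:
    """Score next tokens by how much their log-probability grows between layers."""
    log_p_mature = F.log_softmax(early_exit_logits(mature_layer), dim=-1)
    log_p_premature = F.log_softmax(early_exit_logits(premature_layer), dim=-1)
    # Tokens that gain probability as information flows to deeper layers are boosted.
    return log_p_mature - log_p_premature

scores = layer_contrast_scores(mature_layer=len(hidden_states) - 1, premature_layer=6)
next_token_id = scores.argmax(dim=-1)
print(tokenizer.decode(next_token_id.tolist()))
```

Note that a raw difference over the full vocabulary can reward tokens that are merely improbable in the earlier layer, which is why the published method constrains the contrast rather than applying it naively.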

Their recent work introduces Decoding by Contrasting Layers (DoLa), a novel decoding approach designed to better expose the factual knowledge encoded in an LLM without retrieving external knowledge or performing additional fine-tuning.

Experiments show that DoLa improves the truthfulness of LLaMA-family models on both TruthfulQA and FACTOR. Additional chain-of-thought experiments on StrategyQA and GSM8K demonstrate its potential to improve factual reasoning. Finally, results on open-ended text generation (evaluated with GPT-4) show that DoLa produces informative and significantly more factual responses, earning higher ratings than the original decoding approach. Overall, DoLa is a decoding approach that can increase the honesty of LLMs, and the findings show that it adds only a small amount of latency to the decoding process.

The researchers did not evaluate the model in other settings, such as instruction following or learning from human feedback. In addition, rather than leveraging human labels or external factual sources for fine-tuning, the approach relies on the existing architecture and parameters, which limits the scope of possible improvements. Unlike retrieval-augmented LMs, the technique depends entirely on the model's preexisting knowledge rather than incorporating new information through external retrieval modules. The team hopes future work will combine these components with their decoding technique to overcome these limitations.


Check out the Paper and GitHub. All credit for this research goes to the researchers on this project.


Dhanshree Shenwai is a computer science engineer with solid experience in FinTech companies across the financial, cards & payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today's evolving world.

