Table of Contents
Hallucination and Factivity
Overviews
Hallucination and Factivity in LLMs
Datasets
Related Pages
Hallucination and Factivity
Overviews
In Generation
Ji et al 2022 - Survey of Hallucination in Natural Language Generation
In Large Language Models
Zhang et al 2023 - Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models
Rawte et al 2023 - A Survey of Hallucination in Large Foundation Models
Ye et al 2023 - Cognitive Mirage: A Review of Hallucinations in Large Language Models
Andriopoulos & Pouwelse 2023 - Augmenting LLMs with Knowledge: A Survey on Hallucination Prevention
Hallucination and Factivity in LLMs
Lin et al 2022 - TruthfulQA: Measuring How Models Mimic Human Falsehoods
Lee et al 2022 - Factuality Enhanced Language Models for Open-Ended Text Generation
- Prepends a topic prefix to sentences in the factual documents to make each sentence serve as a standalone fact during pretraining.
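The topic-prefix trick can be sketched as a one-line preprocessing step. This is an illustrative sketch, not the paper's exact implementation; the `"topic: sentence"` formatting is an assumption, and in the paper the topic is the source document's title.

```python
def prepend_topic_prefix(topic: str, sentences: list[str]) -> list[str]:
    """Prepend the document topic to every sentence so that each
    sentence reads as a standalone fact during pretraining."""
    return [f"{topic}: {sentence}" for sentence in sentences]

# Sentences from a document about Marie Curie; without the prefix,
# the pronoun "She" is unresolvable in isolation.
augmented = prepend_topic_prefix(
    "Marie Curie",
    ["She was born in Warsaw.", "She won two Nobel Prizes."],
)
```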
Si et al 2022 - Prompting GPT-3 To Be Reliable

Zhang et al 2023 - How Language Model Hallucinations Can Snowball
Min et al 2023 - FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
Du et al 2023 - Quantifying and Attributing the Hallucination of Large Language Models via Association Analysis
Dhuliawala et al 2023 - Chain-of-Verification Reduces Hallucination in Large Language Models
Cao et al 2023 - AutoHall: Automated Hallucination Dataset Generation for Large Language Models
Chen et al 2023 - FELM: Benchmarking Factuality Evaluation of Large Language Models
github
Tian et al 2023 - Fine-tuning Language Models for Factuality
- Uses DPO to fine-tune an LLM to produce more factual outputs.
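The DPO objective behind this kind of factuality tuning can be sketched numerically. A minimal sketch: the log-probabilities below are placeholder numbers, not outputs of a real model, and a real setup would compute them with the policy and frozen reference model over (factual, non-factual) completion pairs.

```python
import math

def dpo_loss(logp_w: float, logp_l: float,
             ref_logp_w: float, ref_logp_l: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((policy margin) - (reference margin))),
    where the margin is log p(preferred) - log p(dispreferred)."""
    margin = (logp_w - ref_logp_w) - (logp_l - ref_logp_l)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Loss shrinks as the policy prefers the factual completion
# more strongly than the reference model does.
loss = dpo_loss(logp_w=-5.0, logp_l=-9.0, ref_logp_w=-6.0, ref_logp_l=-7.0)
```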
Hong et al 2024 - The Hallucinations Leaderboard – An Open Effort to Measure Hallucinations in Large Language Models
Gekhman et al 2024 - Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?
Prompt to break down sentences into independent facts (from Min et al 2023):
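Once a generation is decomposed into atomic facts, FActScore reduces to the precision of those facts against a knowledge source. A minimal sketch: the supported/unsupported judgments are supplied here as booleans, whereas the paper obtains them from a retrieval-backed verifier.

```python
def factscore(supported: list[bool]) -> float:
    """FActScore of one generation: the fraction of its atomic
    facts judged supported by the knowledge source."""
    if not supported:
        return 0.0
    return sum(supported) / len(supported)

# Three atomic facts extracted from a generated biography,
# two of which the knowledge source supports.
score = factscore([True, True, False])
```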
Datasets
TruthfulQA:
paper
github
FActScore:
paper
github
FELM:
paper
github
LongFact:
paper
Related Pages
Automatic Fact Checking
Factivity
Language Model
Trustworthy AI