Authors: Sihao Chen, Fan Zhang, Kazoo Sone, Dan Roth
Paper reference: https://aclanthology.org/2021.naacl-main.475.pdf
Contribution
This paper studies contrast candidate generation and selection as a post-processing method for correcting extrinsic hallucinations involving entities and quantities in abstractive summarization. In the generation step, candidate summaries are created by replacing potentially hallucinated entities in a summary with entities of compatible semantic types that appear in the source document. In the selection step, all summary variants are scored, and the highest-scoring one becomes the final summary. The corrected summaries show statistically significant improvements over the original ones.
The work also points out limitations of the proposed method. Replacing entities can introduce intrinsic hallucinations into the modified summary. Furthermore, focusing only on entities and quantities is not sufficient to detect all hallucinations; full detection might require commonsense reasoning and knowledge retrieval.
Details
Contrast Candidate Generation
The technique is based on the observation that a large fraction of extrinsic hallucinations involve named entities and quantities.
Identify potentially hallucinated entities
An NER system extracts the entities in the summary and in the source document. A summary entity is treated as potentially hallucinated when no entity with a similar surface form appears in the source document.
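The identification and replacement steps can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes entities have already been extracted by some NER system and are passed in as (surface form, type) pairs. The names is_supported and generate_candidates are hypothetical, the case-insensitive containment test is only a stand-in for whatever "similar surface form" means in the paper, and keeping the original summary among the candidates is also an assumption.

```python
from typing import List, Tuple

# (surface form, semantic type), e.g. ("Berlin", "GPE"); assumed to come from an NER system.
Entity = Tuple[str, str]


def is_supported(entity_text: str, source_entities: List[Entity]) -> bool:
    """Crude surface-form check (an assumption, not the paper's matcher):
    case-insensitive containment in either direction."""
    needle = entity_text.lower()
    for text, _ in source_entities:
        hay = text.lower()
        if needle in hay or hay in needle:
            return True
    return False


def generate_candidates(
    summary: str,
    summary_entities: List[Entity],
    source_entities: List[Entity],
) -> List[str]:
    """Return the original summary plus one variant per (unsupported summary
    entity, source entity of the same type) pair."""
    candidates = [summary]  # keeping the original is an assumption of this sketch
    for text, etype in summary_entities:
        if is_supported(text, source_entities):
            continue  # entity appears in the source, so it is not flagged
        for src_text, src_type in source_entities:
            if src_type == etype:
                candidates.append(summary.replace(text, src_text))
    return candidates


example = generate_candidates(
    "The company hired 500 workers in Berlin.",
    [("500", "CARDINAL"), ("Berlin", "GPE")],
    [("Berlin", "GPE"), ("300", "CARDINAL"), ("Munich", "GPE")],
)
# "500" has no match in the source, so it is swapped for the only CARDINAL, "300".
# "Berlin" is supported and left alone.
```

Here the result contains the original summary and one variant, "The company hired 300 workers in Berlin." The sketch replaces one entity at a time, so each variant differs from the original in a single entity.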
Contrast Candidate Selection
The paper trains a classifier (BART with a linear layer) to score and rank the summary variants. The classifier is trained on ground-truth summaries and synthetic negative summaries. The candidate with the highest score becomes the final summary.
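The selection step reduces to an argmax over candidate scores, sketched below in Python. The names select_best and token_overlap_score are hypothetical. The token-overlap scorer is only a toy stand-in so the example runs without a trained model; in the paper, the score comes from the trained BART-based classifier.

```python
from typing import Callable, List


def select_best(
    source: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest score; ties go to the earliest one."""
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(candidates, key=lambda cand: score_fn(source, cand))


def token_overlap_score(source: str, candidate: str) -> float:
    """Toy scorer (an assumption, NOT the paper's classifier): fraction of
    whitespace-separated candidate tokens that also occur in the source."""
    source_tokens = set(source.lower().split())
    cand_tokens = candidate.lower().split()
    if not cand_tokens:
        return 0.0
    return sum(tok in source_tokens for tok in cand_tokens) / len(cand_tokens)


source_doc = "the company hired 300 workers in berlin last year"
variants = [
    "The company hired 500 workers in Berlin",
    "The company hired 300 workers in Berlin",
]
best = select_best(source_doc, variants, token_overlap_score)
# The second variant wins: all 7 of its tokens occur in the source, versus 6 of 7.
```

Passing the scorer as a function keeps the ranking logic independent of the model, so a trained classifier could be substituted for the toy scorer without changing select_best.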