HyperSU: Corpus-Driven Semantic-Unit Hypergraph for Retrieval-Augmented Generation
Jiate Liu#, Liuyi Chen#, Zhengyi Yang, Chuan He, Mingchen Ju, Bocheng Han, Ruyi Liu, Xu Zhou
RAIDS Lab Authors
Abstract
Recent hypergraph-based retrieval-augmented generation (HyperRAG) methods use hyperedges to connect multiple entities simultaneously, enabling more efficient multi-entity evidence organization than pairwise graph structures. However, existing HyperRAG methods often rely on LLM-generated summaries to construct hyperedges, which can introduce hallucinations and incur high indexing costs. In addition, during retrieval, existing methods typically rely on either one-hop neighbor expansion or PageRank diffusion. The former may miss useful multi-hop evidence, while the latter can suffer from uncontrolled propagation over excessive hub nodes, leading to semantic drift and noisy reasoning chains. To address these challenges, we propose HyperSU, a novel hypergraph-based RAG framework featuring semantic-unit hyperedges and clue-guided bidirectional retrieval. During construction, HyperSU formulates hyperedge construction as an entity-aware minimum-description-length (MDL) optimization problem, inducing source-grounded semantic-unit hyperedges that balance sentence-level semantic coherence and entity compactness. It then constructs a hypergraph by modeling each semantic unit as a hyperedge over its co-mentioned entities. During retrieval, HyperSU performs clue-guided bidirectional expansion over the semantic-unit hypergraph, enabling both multi-hop evidence discovery and answer-aware noise reduction. Experiments show that HyperSU consistently improves answer accuracy over standard, graph-based, and hypergraph-based RAG baselines, achieving up to a 14.7% relative accuracy improvement on GraphRAG-Bench, with larger gains on reasoning-intensive tasks.
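To make the data structure concrete, the following is a minimal Python sketch of a hypergraph in which each semantic unit is a hyperedge over its co-mentioned entities, together with a plain bounded multi-hop expansion. It is not the HyperSU implementation: the MDL-based unit induction and the clue-guided bidirectional retrieval are not reproduced here, and the function names (`build_hypergraph`, `expand`), the unit ids, and the toy entities are all assumptions made for illustration. It only shows why one-hop expansion can miss evidence that a second hop reaches.

```python
from collections import defaultdict


def build_hypergraph(units):
    """Build an entity -> unit-id index from hyperedges.

    `units` maps a (hypothetical) semantic-unit id to the set of
    entities mentioned in that unit; each unit acts as one hyperedge.
    """
    entity_to_units = defaultdict(set)
    for unit_id, entities in units.items():
        for entity in entities:
            entity_to_units[entity].add(unit_id)
    return entity_to_units


def expand(units, entity_to_units, seed_entities, hops):
    """Return the unit ids reachable from the seeds within `hops` hyperedge steps.

    This is an ordinary breadth-first expansion, used here only as a
    stand-in; it is not the clue-guided bidirectional procedure.
    """
    seen_units = set()
    seen_entities = set(seed_entities)
    frontier = set(seed_entities)
    for _ in range(hops):
        next_frontier = set()
        for entity in frontier:
            for unit_id in entity_to_units.get(entity, ()):
                if unit_id in seen_units:
                    continue
                seen_units.add(unit_id)
                # Entities first seen in this unit seed the next hop.
                next_frontier |= units[unit_id] - seen_entities
        seen_entities |= next_frontier
        frontier = next_frontier
    return seen_units


# Toy corpus: three semantic units and the entities each one mentions.
units = {
    "u1": {"Marie Curie", "Pierre Curie", "polonium"},
    "u2": {"polonium", "Poland"},
    "u3": {"Poland", "Warsaw"},
}
index = build_hypergraph(units)
one_hop = expand(units, index, {"Marie Curie"}, hops=1)
two_hop = expand(units, index, {"Marie Curie"}, hops=2)
```

In this toy example, the one-hop result contains only `u1`, while the two-hop result also reaches `u2` through the shared entity `polonium`. An unbounded expansion would eventually pull in every connected unit, which is the kind of noise that a guided retrieval strategy is meant to limit.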
BibTeX
@misc{liu2026hypersu,
title = {HyperSU: Corpus-Driven Semantic-Unit Hypergraph for Retrieval-Augmented Generation},
author = {Jiate Liu and Liuyi Chen and Zhengyi Yang and Chuan He and Mingchen Ju and Bocheng Han and Ruyi Liu and Xu Zhou},
year = {2026},
eprint = {2606.28351},
archivePrefix = {arXiv},
primaryClass = {cs.IR},
url = {https://arxiv.org/abs/2606.28351}
}
