Efficient and Scalable Distributed Graph Structural Clustering at Billion Scale
Kongzhang Hao, Long Yuan, Zhengyi Yang, Wenjie Zhang, Xuemin Lin
International Conference on Database Systems for Advanced Applications (DASFAA)
Abstract
Structural graph clustering (SCAN) is a fundamental problem in graph analysis and has received considerable attention recently. Existing distributed solutions either lack efficiency or suffer from high memory consumption when applied to billion-scale graphs. Motivated by this, we aim to devise a distributed algorithm for SCAN that is both efficient and scalable. We first propose a fine-grained clustering framework tailored for SCAN. Based on this framework, we devise a distributed SCAN algorithm that not only keeps communication overhead low during execution but also effectively reduces memory consumption at all times. We also devise an effective workload-balancing mechanism, automatically triggered by idle machines, to handle skewed workloads. Experimental results demonstrate the efficiency and scalability of the proposed algorithm.
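For readers unfamiliar with SCAN, the following is a minimal single-machine sketch of the classic structural clustering idea, not the distributed algorithm proposed in the paper. It assumes the standard definitions: the structural similarity of two adjacent vertices is the size of the intersection of their closed neighborhoods divided by the geometric mean of the neighborhood sizes, and a vertex is a core if at least `mu` vertices in its closed neighborhood (itself included) have similarity at least `eps`. The function names, the adjacency-dict representation, and the parameter names `eps` and `mu` are illustrative choices.

```python
import math


def structural_similarity(adj, u, v):
    """Similarity of u and v over closed neighborhoods N[x] = adj[x] | {x}."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))


def core_vertices(adj, eps, mu):
    """Vertices with at least mu eps-similar vertices in their closed neighborhood."""
    cores = set()
    for u in adj:
        # Start at 1 because a vertex is always fully similar to itself.
        similar = 1 + sum(
            1 for v in adj[u] if structural_similarity(adj, u, v) >= eps
        )
        if similar >= mu:
            cores.add(u)
    return cores


def scan_clusters(adj, eps, mu):
    """Map each clustered vertex to a cluster label (the seed core's id).

    Clusters grow only through core vertices; a non-core vertex that is
    eps-similar to a core joins that core's cluster as a border vertex.
    Vertices absent from the result are unclustered (hubs or outliers).
    """
    cores = core_vertices(adj, eps, mu)
    label = {}
    for seed in cores:
        if seed in label:
            continue
        label[seed] = seed
        stack = [seed]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v in label or structural_similarity(adj, u, v) < eps:
                    continue
                label[v] = seed
                if v in cores:
                    stack.append(v)  # only cores keep expanding the cluster
    return label


# Toy graph (hypothetical data): two triangles joined by the bridge edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
labels = scan_clusters(adj, eps=0.7, mu=3)
```

On this toy graph the bridge edge has a lower similarity than the triangle edges, so with `eps=0.7` the two triangles end up in separate clusters. This sketch recomputes similarities on demand and holds the whole graph in one process, which is exactly the kind of cost a distributed, memory-conscious design has to avoid at billion scale.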
BibTeX
@inproceedings{hao2023efficient,
title = {Efficient and Scalable Distributed Graph Structural Clustering at Billion Scale},
author = {Hao, Kongzhang and Yuan, Long and Yang, Zhengyi and Zhang, Wenjie and Lin, Xuemin},
editor = {Wang, Xin and Sapino, Maria Luisa and Han, Wook-Shin and El Abbadi, Amr and Dobbie, Gill and Feng, Zhiyong and Shao, Yingxia and Yin, Hongzhi},
booktitle = {Database Systems for Advanced Applications},
year = {2023},
publisher = {Springer Nature Switzerland},
address = {Cham},
pages = {234--251},
isbn = {978-3-031-30675-4}
}
