← Publications
workshop2019

Improving Distributed Subgraph Matching Algorithm on Timely Dataflow

Zhengmin Lai, Zhengyi Yang, Longbin Lai

International Workshop on Large-Scale Graph Data Analytics (LSGDA)

RAIDS Lab Authors

Details

Year
2019
Host Conference
IEEE International Conference on Data Engineering (ICDE 2019)
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Host Rankings
ICORE 2026 A* · CORE 2023 A* · CCF A

Research Area

Scalable Data Systems

Tags

Resources

Abstract

The subgraph matching problem is defined to find all subgraphs of a data graph that are isomorphic to a given query graph. Subgraph matching plays a vital role in the fields of e-commerce, social media and biological science. CliqueJoin is a distributed subgraph matching algorithm that is designed to be efficient and scalable. However, CliqueJoin is originally developed on MapReduce, thus the performance of the algorithm can be affected by the notorious I/O issue of MapReduce while processing multi-round join tasks. Meanwhile, CliqueJoin does not propose a cost evaluation strategy for labelled graphs, which limits its application in practice where most real-world graphs are labelled. Targeting the limitations of CliqueJoin, we propose CliqueJoin++ to improve CliqueJoin in two aspects. Firstly, we implement CliqueJoin++ on the Timely dataflow system instead of MapReduce to avoid considerable I/O cost. Secondly, we extend the cost evaluation function in CliqueJoin to compute optimal join plans for labelled graphs in the distributed context. Extensive experiments have been conducted to show that the proposed method is up to 10 times faster than the MapReduce version for unlabelled matching, and it achieves good performance and scalability for labelled matching.

Author Affiliations

Zhengmin Lai
East China Normal University
Zhengyi Yang
University of New South Wales
Longbin Lai
University of New South Wales

BibTeX

@inproceedings{lai2019improving,
  title = {Improving Distribued Subgraph Matching Algorithm on Timely Dataflow},
  author = {Lai, Zhengmin and Yang, Zhengyi and Lai, Longbin},
  url = {http://dx.doi.org/10.1109/ICDEW.2019.000-2},
  doi = {10.1109/icdew.2019.000-2},
  booktitle = {2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW)},
  publisher = {IEEE},
  year = {2019},
  month = Apr,
  pages = {266-273}
}