← Publications
conference2024ICORE 2026 CCORE 2023 CCCF C

A Cluster-Based Approach to kNN Join over Batch-Dynamic High-Dimensional Data

Nimish Ukey#, Guangjian Zhang#, Zhengyi Yang*, Xiaoyang Wang, Binghao Li, Serkan Saydam, Wenjie Zhang

International Conference on Advanced Data Mining and Applications (ADMA)

RAIDS Lab Authors

Details

Year
2024
Publisher
Springer
Rankings
ICORE 2026 C · CORE 2023 C · CCF C

Research Area

Scalable Data Systems

Tags

Resources

Awards

Best Student Paper Award

Abstract

The k nearest neighbors (kNN) join is a crucial operation in data mining, retrieving the kNN for each point in the query set within an answer set. This operation finds extensive applications in various domains, including recommendation systems, spatial databases, and knowledge discovery. With the surge in data volume and dimensionality, numerous approaches have emerged to enhance the efficiency of kNN join operations on static and dynamic high-dimensional data. However, we observed that research on batch-dynamic kNN join, where updates occur in batches rather than individually, remains scarce. To bridge this gap, we propose a novel cluster-based approach tailored for batch-dynamic kNN join over high-dimensional data. Our contributions include a cluster-based batch update technique, which efficiently processes similar updates in clusters, and a cluster-based pruning method using the high-dimensional R-tree (HDR-Tree) for optimised search during updates. Extensive experimental evaluations across 6 real-world datasets demonstrate the efficiency of our approach, significantly outperforming state-of-the-art methods by 19 to 55 times.

Author Affiliations

Nimish Ukey
University of New South Wales
Guangjian Zhang
University of New South Wales
Zhengyi Yang
University of New South Wales
Xiaoyang Wang
University of New South Wales
Binghao Li
University of New South Wales
Serkan Saydam
University of New South Wales
Wenjie Zhang
University of New South Wales

BibTeX

@inproceedings{ukey2024cluster,
  title = {A Cluster-Based Approach to kNN Join Over Batch-Dynamic High-Dimensional Data},
  author = {Ukey, Nimish and Zhang, Guangjian and Yang, Zhengyi and Wang, Xiaoyang and Li, Binghao and Saydam, Serkan and Zhang, Wenjie},
  editor = {Sheng, Quan Z. and Dobbie, Gill and Jiang, Jing and Zhang, Xuyun and Zhang, Wei Emma and Manolopoulos, Yannis and Wu, Jia and Mansoor, Wathiq and Ma, Congbo},
  booktitle = {Advanced Data Mining and Applications},
  year = {2025},
  publisher = {Springer Nature Singapore},
  address = {Singapore},
  pages = {81--96},
  isbn = {978-981-96-0814-0}
}