A Cluster-Based Approach to kNN Join over Batch-Dynamic High-Dimensional Data
Nimish Ukey#, Guangjian Zhang#, Zhengyi Yang*, Xiaoyang Wang, Binghao Li, Serkan Saydam, Wenjie Zhang
International Conference on Advanced Data Mining and Applications (ADMA)
RAIDS Lab Authors
Details
Research Area
Tags
Resources
Awards
Abstract
The k nearest neighbors (kNN) join is a crucial operation in data mining, retrieving the kNN for each point in the query set within an answer set. This operation finds extensive applications in various domains, including recommendation systems, spatial databases, and knowledge discovery. With the surge in data volume and dimensionality, numerous approaches have emerged to enhance the efficiency of kNN join operations on static and dynamic high-dimensional data. However, we observed that research on batch-dynamic kNN join, where updates occur in batches rather than individually, remains scarce. To bridge this gap, we propose a novel cluster-based approach tailored for batch-dynamic kNN join over high-dimensional data. Our contributions include a cluster-based batch update technique, which efficiently processes similar updates in clusters, and a cluster-based pruning method using the high-dimensional R-tree (HDR-Tree) for optimised search during updates. Extensive experimental evaluations across 6 real-world datasets demonstrate the efficiency of our approach, significantly outperforming state-of-the-art methods by 19 to 55 times.
Author Affiliations
BibTeX
@inproceedings{ukey2024cluster,
title = {A Cluster-Based Approach to kNN Join Over Batch-Dynamic High-Dimensional Data},
author = {Ukey, Nimish and Zhang, Guangjian and Yang, Zhengyi and Wang, Xiaoyang and Li, Binghao and Saydam, Serkan and Zhang, Wenjie},
editor = {Sheng, Quan Z. and Dobbie, Gill and Jiang, Jing and Zhang, Xuyun and Zhang, Wei Emma and Manolopoulos, Yannis and Wu, Jia and Mansoor, Wathiq and Ma, Congbo},
booktitle = {Advanced Data Mining and Applications},
year = {2025},
publisher = {Springer Nature Singapore},
address = {Singapore},
pages = {81--96},
isbn = {978-981-96-0814-0}
}
