DiST: Efficient Distributed Spatio-Temporal Clustering With Automatic Parameter Optimization

Published in TKDE, 2025


Abstract: With rapid advances in positioning technologies, the volume of spatio-temporal data has grown significantly. Analyzing the spatial and temporal characteristics of these data is essential for uncovering underlying associations and deriving insights into natural and societal mechanisms. Clustering, a widely used data analysis technique, groups data with similar characteristics for further investigation. However, existing clustering methods often fail to adequately address temporal properties, which are vital in many scenarios. Additionally, traditional spatio-temporal clustering approaches are limited to standalone environments and struggle to handle large-scale spatio-temporal datasets. To this end, we introduce DiST, the first distributed spatio-temporal clustering method, which considers temporal and spatial proximity simultaneously. DiST comprises data partition, local clustering, and global merging stages, along with an auto-tuning framework for parameter optimization. DiST addresses key challenges, including integrating temporal and spatial attributes, managing data duplication across distributed nodes, and selecting appropriate parameters for diverse data characteristics. Comparative experiments on two real-world datasets validate the performance and scalability of DiST, demonstrating its effectiveness in spatio-temporal data analysis.
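To make the three stages named in the abstract more concrete, here is a minimal single-machine sketch of a partition, local clustering, and global merging pipeline. It is not the paper's algorithm: the function names, the `eps_s` / `eps_t` thresholds, the grid-column partitioning with a halo of width `eps_s`, and the use of plain connected components in place of a density-based criterion are all assumptions made for illustration, and the auto-tuning framework is not covered.

```python
from collections import defaultdict
from itertools import combinations
from math import hypot


class DisjointSet:
    """Minimal union-find used by both the local and the global stage."""

    def __init__(self):
        self.parent = {}

    def find(self, i):
        self.parent.setdefault(i, i)
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


def st_neighbors(p, q, eps_s, eps_t):
    """Assumed proximity rule: close in space AND close in time."""
    return hypot(p[0] - q[0], p[1] - q[1]) <= eps_s and abs(p[2] - q[2]) <= eps_t


def partition(points, cell, eps_s):
    """Stage 1 (hypothetical scheme): split by x into columns of width `cell`.
    Each point is duplicated into every column within eps_s of it."""
    parts = defaultdict(list)
    for i, (x, _y, _t) in enumerate(points):
        lo, hi = int((x - eps_s) // cell), int((x + eps_s) // cell)
        for col in range(lo, hi + 1):
            parts[col].append(i)
    return parts


def local_cluster(points, idxs, eps_s, eps_t):
    """Stage 2: cluster one partition independently.
    Simplification: connected components, no minimum-density test."""
    ds = DisjointSet()
    for a, b in combinations(idxs, 2):
        if st_neighbors(points[a], points[b], eps_s, eps_t):
            ds.union(a, b)
    groups = defaultdict(set)
    for i in idxs:
        groups[ds.find(i)].add(i)
    return list(groups.values())


def global_merge(local_results):
    """Stage 3: local clusters that share a duplicated point become one cluster."""
    ds = DisjointSet()
    for clusters in local_results:
        for members in clusters:
            first, *rest = sorted(members)
            ds.find(first)  # register singleton clusters too
            for i in rest:
                ds.union(first, i)
    groups = defaultdict(set)
    for i in list(ds.parent):
        groups[ds.find(i)].add(i)
    return sorted(sorted(g) for g in groups.values())


def st_cluster_sketch(points, eps_s, eps_t, cell):
    parts = partition(points, cell, eps_s)
    local = [local_cluster(points, idxs, eps_s, eps_t) for idxs in parts.values()]
    return global_merge(local)


# Points are (x, y, t). Points 1 and 2 straddle a column boundary; point 3
# shares a location with point 2 but is far away in time.
points = [(0.0, 0.0, 0), (0.9, 0.0, 1), (1.1, 0.0, 2), (1.1, 0.0, 50), (5.0, 0.0, 0)]
clusters = st_cluster_sketch(points, eps_s=0.5, eps_t=5, cell=1.0)
print(clusters)
```

In this toy run, points 1 and 2 end up in the same cluster even though they fall in different columns, because the halo places both of them in each other's partition and the merge step joins local clusters through the shared indices. Point 3 stays separate because it fails the temporal threshold. In an actual distributed setting each `local_cluster` call would run on a separate worker.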


Li, J., Gou, S., Li, R., He, H., Li, W., & Zheng, Y. (2025). DiST: Efficient Distributed Spatio-Temporal Clustering With Automatic Parameter Optimization. IEEE Transactions on Knowledge and Data Engineering.