Efficient and Privacy-Preserving Distribution Statistics Analytics on Mobile Spatial Data
Xuhao Ren, Chuan Zhang, Mingyang Zhao, Meng Li, Liehuang Zhu, Bin Xiao
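AI summary: To address privacy protection in distribution statistics over mobile spatial data, the paper proposes eSpat-B, a scheme built on non-colluding servers and distributed point functions (DPF), and a more efficient scheme, eSpat+. Both significantly reduce computation and communication overhead while maintaining 100% accuracy.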
With the rapid development of mobile computing technology, massive amounts of spatial data are continuously generated by mobile terminals and sensing devices such as smartphones, connected vehicles, and drones. Efficient distribution statistics analysis of this data is crucial for real-time mobile computing applications. However, the constrained and dynamic nature of mobile environments exacerbates the privacy challenge: centralizing sensitive data for analysis risks severe privacy leaks, while existing privacy-preserving techniques often introduce excessive overhead or inaccuracies. In this paper, we design, implement, and evaluate the first system that supports efficient and privacy-preserving distribution statistics analysis for mobile spatial data. First, we propose eSpat-B, which leverages two non-colluding servers and a newly designed, improved distributed point function (DPF) combined with octree partitioning. Furthermore, considering the frequent updates of spatial data, we propose a second, more efficient scheme, eSpat+. Its core idea is to use a k-dimensional (k-d) tree for spatial partitioning, combine it with an incremental DPF to perform the statistics analysis, and design an efficient update algorithm. Security analysis demonstrates that our schemes protect data privacy throughout the statistical process. Extensive experiments on real-world trajectory datasets demonstrate that the proposed schemes significantly outperform existing approaches, reducing computation overhead by up to 1.2x and communication overhead by up to 20x while maintaining 100% statistical accuracy.
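To make the two-server setting more concrete, the following is a minimal sketch of the general idea behind counting points per octree cell without either server seeing a location. It is not the eSpat-B construction: it assumes points in the unit cube, a fixed-depth octree, and naive additive secret sharing of a one-hot vector, which costs communication linear in the number of cells. A DPF exists precisely to compress such shares into short keys. All function names here (`octree_cell`, `share_point_function`, `aggregate`, `reconstruct`, `private_histogram`) are hypothetical and chosen for illustration only.

```python
import secrets

MOD = 2**32  # assumption: shares live in the ring of integers modulo 2^32


def octree_cell(point, depth):
    """Map a point in the unit cube [0, 1)^3 to its leaf-cell index
    in an octree of the given depth (3 bits per level)."""
    x, y, z = point
    index = 0
    for _ in range(depth):
        x, y, z = x * 2, y * 2, z * 2
        bx, by, bz = int(x), int(y), int(z)  # which half along each axis
        index = (index << 3) | (bx << 2) | (by << 1) | bz
        x, y, z = x - bx, y - by, z - bz     # rescale into the child cell
    return index


def share_point_function(alpha, domain_size):
    """Client side: split the one-hot vector e_alpha into two additive
    shares. Each share alone is uniformly random and hides alpha."""
    share0 = [secrets.randbelow(MOD) for _ in range(domain_size)]
    share1 = [(-s) % MOD for s in share0]
    share1[alpha] = (share1[alpha] + 1) % MOD
    return share0, share1


def aggregate(shares):
    """Server side: add up the received share vectors component-wise."""
    total = [0] * len(shares[0])
    for vec in shares:
        for i, v in enumerate(vec):
            total[i] = (total[i] + v) % MOD
    return total


def reconstruct(agg0, agg1):
    """Combine the two servers' aggregates into the per-cell counts."""
    return [(a + b) % MOD for a, b in zip(agg0, agg1)]


def private_histogram(points, depth):
    """Simulate the whole flow in one process, for illustration only."""
    domain = 8 ** depth
    to_server0, to_server1 = [], []
    for p in points:
        s0, s1 = share_point_function(octree_cell(p, depth), domain)
        to_server0.append(s0)
        to_server1.append(s1)
    return reconstruct(aggregate(to_server0), aggregate(to_server1))


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0.1), (0.2, 0.3, 0.1), (0.9, 0.9, 0.9)]
    print(private_histogram(pts, depth=1))
```

Because the counts are recovered exactly from the two aggregates, a scheme of this shape introduces no statistical error, which is consistent with the 100% accuracy claim above. The privacy argument relies on the two servers not colluding, since adding a client's two shares reveals its cell.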