Privacy-Preserving and Outsourced Multi-User K-Means Clustering
Many techniques for privacy-preserving data mining (PPDM) have been investigated over the past decade. Such techniques, however, usually incur heavy computational and communication costs on the participating parties, so entities with limited resources may have to refrain from participating in the PPDM process. One promising solution to this issue is to outsource the tasks to a cloud environment. In this paper, we propose a novel and efficient solution to privacy-preserving outsourced distributed clustering (PPODC) for multiple users based on the k-means clustering algorithm. The main novelty of our solution lies in avoiding the secure division operations required in computing cluster centers through efficient transformation techniques. In addition, we discuss two strategies, namely offline computation and pipelined execution, that aim to boost the performance of our protocol. We implement our protocol on a cluster of 16 nodes and demonstrate through extensive experiments on a real dataset how our two strategies, combined with parallelism, can significantly improve the performance of our protocol.
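The abstract does not spell out the transformation used to avoid division, so the following is only a plaintext sketch of one common way such an avoidance can work, not the paper's encrypted protocol. It assumes each cluster is kept as a coordinate-wise sum and a member count rather than as an averaged center, and it compares distances after scaling them by a common factor. The names `scaled_sq_distance` and `assign_cluster` are hypothetical and chosen for illustration.

```python
from math import prod


def scaled_sq_distance(point, sums, counts, j):
    """Squared distance from point to center j, scaled by (product of all counts)^2.

    Center j is sums[j] / counts[j], but it is never computed explicitly.
    Because every cluster's distance is scaled by the same positive factor,
    the ordering of the distances is unchanged. Only integer additions and
    multiplications are used (assuming integer inputs and nonzero counts).
    """
    # Product of the counts of all clusters other than j.
    others = prod(n for i, n in enumerate(counts) if i != j)
    # (counts[j] * x - s)^2 equals counts[j]^2 * (x - s / counts[j])^2.
    unscaled = sum((counts[j] * x - s) ** 2 for x, s in zip(point, sums[j]))
    return others ** 2 * unscaled


def assign_cluster(point, sums, counts):
    """Return the index of the nearest cluster without performing any division."""
    return min(
        range(len(counts)),
        key=lambda j: scaled_sq_distance(point, sums, counts, j),
    )


# Illustrative data (made up): two clusters whose true centers are (1, 2) and (10, 10).
sums = [(2, 4), (30, 30)]
counts = [2, 3]
nearest_to_a = assign_cluster((2, 2), sums, counts)
nearest_to_b = assign_cluster((9, 9), sums, counts)
```

In a secure setting, avoiding division matters because additions and multiplications are typically far cheaper to evaluate over encrypted data than division; how the paper realizes this under encryption is described in the full text, not here.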
MSU Digital Commons Citation
Rao, Fang Yu; Samanthula, Bharath Kumar; Bertino, Elisa; Yi, Xun; and Liu, Dongxi, "Privacy-Preserving and Outsourced Multi-User K-Means Clustering" (2016). Department of Computer Science Faculty Scholarship and Creative Works. 490.