Abstract—We consider the problem of clustering streamed data when the number of transactions grows very quickly, or when the data is distributed among several parties and their privacy is a concern. We present two new protocols for incremental privacy-preserving k-means clustering, a very popular data mining method, for data that is distributed horizontally or vertically among multiple parties. At the end of each protocol, every party receives the final result of the clustering algorithm without revealing its own private data. To improve efficiency, previous knowledge is used to incrementally update the centers and membership of each cluster.
Index Terms—Clustering, security and privacy-preserving, incremental algorithms, data mining and machine learning, distributed data structures.
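The incremental update mentioned in the abstract can be illustrated with a minimal, non-private sketch. It is not the authors' protocol: it assumes a single party holding plaintext data, and it uses the standard running-mean rule so that existing centers and cluster sizes are reused instead of re-clustering from scratch. The function names `nearest` and `incremental_update` are hypothetical.

```python
from typing import List, Sequence, Tuple


def nearest(centers: Sequence[Sequence[float]], x: Sequence[float]) -> int:
    """Return the index of the center closest to x (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((c - xi) ** 2 for c, xi in zip(centers[j], x)),
    )


def incremental_update(
    centers: Sequence[Sequence[float]],
    counts: Sequence[int],
    new_points: Sequence[Sequence[float]],
) -> Tuple[List[List[float]], List[int], List[int]]:
    """Fold new points into existing clusters without revisiting old data.

    Assumption: each center is the mean of counts[j] earlier points, so the
    running-mean rule c <- c + (x - c) / n keeps it an exact mean.
    """
    centers = [list(c) for c in centers]  # copy so the inputs are not mutated
    counts = list(counts)
    labels = []
    for x in new_points:
        j = nearest(centers, x)
        counts[j] += 1
        centers[j] = [c + (xi - c) / counts[j] for c, xi in zip(centers[j], x)]
        labels.append(j)
    return centers, counts, labels


# Usage: two existing clusters of two points each, two newly arrived points.
new_centers, new_counts, labels = incremental_update(
    [[0.0, 0.0], [10.0, 10.0]], [2, 2], [(3.0, 0.0), (10.0, 13.0)]
)
```

In a privacy-preserving setting, the distance comparison and the mean update would have to be computed jointly by the parties without exposing their inputs; that layer is deliberately omitted here.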
Saeed Samet is with the Faculty of Medicine, Memorial University of Newfoundland, St. John's, NL, Canada (e-mail: ssamet@mun.ca).
Ali Miri is with the Department of Computer Science, Ryerson University, Toronto, ON, Canada (e-mail: Ali.Miri@ryerson.ca).
Cite: Saeed Samet and Ali Miri, "New Incremental Privacy-Preserving Clustering Protocols," Lecture Notes on Software Engineering, vol. 1, no. 3, pp. 244-248, 2013.