Abstract
The ever-growing demand for electrical energy among sensing devices in the Internet of Things (IoT) generates large amounts of electricity consumption data. Electricity service providers often use wireless sensor networks to collect these devices' electricity consumption data for statistical analysis, so as to provide them with improved electrical services. Data clustering, an important data mining technique, excels at handling such massive data, but the clustering process carries a risk of privacy disclosure. To address this problem, Blum et al. proposed a differential privacy k-means algorithm that effectively prevents privacy disclosure. However, the data distortion introduced by Blum's algorithm reduces the availability of the clustering results. In this paper, we propose a privacy and availability data clustering (PADC) scheme based on the k-means algorithm and differential privacy, which improves both the selection of the initial center points and the method for calculating the distance from other points to a center point. Moreover, PADC attempts to reduce the effect of outliers by detecting them during the clustering process. Security analysis indicates that our scheme satisfies differential privacy and prevents the disclosure of private information. Meanwhile, performance evaluation shows that, at the same privacy level, our scheme improves the availability of clustering results compared with existing differential privacy k-means algorithms, suggesting that the proposed PADC scheme outperforms them for intelligent electrical services in IoT.
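For readers unfamiliar with the baseline, the following is a minimal sketch of one noisy update step in the general style of a differential privacy k-means algorithm. It is not the PADC scheme: the abstract does not specify PADC's initial-center selection, distance calculation, or outlier detection, so none of those are shown. The function names, the assumption that every coordinate is scaled to [0, 1], the noise scale of (d + 1) / epsilon per iteration, and the rule of keeping the old center when a noisy count is too small are all illustrative assumptions.

```python
import random


def laplace(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])),
    )


def noisy_kmeans_step(points, centers, epsilon, rng):
    """One k-means update with Laplace noise on cluster sums and counts.

    Assumption: every coordinate lies in [0, 1], so adding or removing one
    point changes one count by at most 1 and one sum vector by at most d
    in L1 norm. The per-iteration budget `epsilon` is an assumed parameter.
    """
    d = len(centers[0])
    scale = (d + 1) / epsilon

    sums = [[0.0] * d for _ in centers]
    counts = [0] * len(centers)
    for p in points:
        j = nearest(p, centers)
        counts[j] += 1
        for i in range(d):
            sums[j][i] += p[i]

    new_centers = []
    for j, old in enumerate(centers):
        noisy_count = counts[j] + laplace(scale, rng)
        if noisy_count < 1.0:
            # Assumed fallback: a tiny noisy count gives an unstable ratio,
            # so keep the previous center for this cluster.
            new_centers.append(list(old))
            continue
        center = [(sums[j][i] + laplace(scale, rng)) / noisy_count for i in range(d)]
        # Clamp back into the assumed data domain [0, 1].
        new_centers.append([min(1.0, max(0.0, c)) for c in center])
    return new_centers
```

Calling `noisy_kmeans_step` repeatedly, with the total privacy budget split across iterations, gives a complete noisy clustering loop. A smaller `epsilon` means more noise and more distorted centers, which is the loss of availability that the abstract attributes to the baseline algorithm.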
Field | Value |
---|---|
Original language | English |
Article number | 8370699 |
Pages (from-to) | 1530-1540 |
Number of pages | 11 |
Journal | IEEE Internet of Things Journal |
Volume | 6 |
Issue number | 2 |
State | Published - Apr 2019 |
Scopus Subject Areas
- Signal Processing
- Information Systems
- Hardware and Architecture
- Computer Science Applications
- Computer Networks and Communications
Keywords
- Data clustering
- Internet of Things (IoT)
- differential privacy
- k-means algorithm
- privacy protection