Abstract
We consider the problem of clustering a set of objects represented by relational data in the form of a dissimilarity matrix with missing values. We develop three methods for estimating the missing values, all based on simple triangle-inequality approximation schemes. With few exceptions, any relational clustering algorithm can then be applied to the completed data matrix to obtain clusters. We illustrate the approach by clustering incomplete data built from several data sets. The primary clustering method in our numerical experiments is the non-Euclidean relational fuzzy c-means algorithm. Our examples show that satisfactory clusters can still be obtained even when roughly half of the dissimilarity values are missing before completion.
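To make the idea of triangle-inequality completion concrete, here is a minimal Python sketch. The abstract does not spell out the three estimation schemes, so this is an illustrative assumption rather than the authors' method: for a missing entry d(i, j), every object k with both d(i, k) and d(j, k) observed gives the bounds |d(i, k) - d(j, k)| <= d(i, j) <= d(i, k) + d(j, k), and the sketch fills the gap with the midpoint of the tightest such bounds. The function name `complete_dissimilarities` and the use of NaN to mark missing values are hypothetical choices.

```python
import numpy as np

def complete_dissimilarities(D):
    """Fill NaN entries of a symmetric dissimilarity matrix.

    Illustrative assumption: each missing d(i, j) is replaced by the
    midpoint of the tightest triangle-inequality bounds available.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    filled = D.copy()
    for i in range(n):
        for j in range(i + 1, n):
            if not np.isnan(D[i, j]):
                continue  # entry is observed, nothing to estimate
            # Pivots k for which both d(i, k) and d(j, k) are observed.
            known = ~np.isnan(D[i]) & ~np.isnan(D[j])
            known[[i, j]] = False
            if not known.any():
                continue  # no usable pivot; leave the entry missing
            lower = np.max(np.abs(D[i, known] - D[j, known]))
            upper = np.min(D[i, known] + D[j, known])
            filled[i, j] = filled[j, i] = 0.5 * (lower + upper)
    return filled

# Three points on a line at 0, 1 and 3, with d(0, 2) unobserved.
D_example = np.array([
    [0.0, 1.0, np.nan],
    [1.0, 0.0, 2.0],
    [np.nan, 2.0, 0.0],
])
D_completed = complete_dissimilarities(D_example)
```

Only the originally observed values are used as pivots, so the result does not depend on the order in which entries are filled. Taking the lower bound or the upper bound alone instead of the midpoint would give two other simple estimators of the same kind.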
Original language | English |
---|---|
Pages (from-to) | 273-280 |
Number of pages | 8 |
Journal | Proceedings of SPIE - The International Society for Optical Engineering |
Volume | 4390 |
State | Published - 2001 |
Keywords
- C-means clustering
- Clustering
- Dissimilarity data
- Incomplete data
- Missing data
- Pattern recognition
- Relational data