Relational data clustering with incomplete data

Richard J. Hathaway, Dessa D. Overstreet, Thomas E. Murphy, James C. Bezdek

Research output: Contribution to journal › Article › peer-review

Abstract

We consider the problem of clustering a set of objects that are represented by relational data in the form of a dissimilarity matrix with missing values. Three methods are developed to estimate the missing values, all based on simple triangle inequality-based approximation schemes. With few exceptions, any relational clustering algorithm can then be applied to the completed data matrix to obtain good clusters. We illustrate our approach by clustering incomplete data built from several data sets. The primary clustering method chosen for our numerical experiments is the non-Euclidean relational fuzzy c-means algorithm. Our examples show that satisfactory clusters can still be obtained even when roughly half of the distance values are missing before completion.
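The abstract does not spell out the three estimation schemes, so the following is only a minimal sketch of what a triangle inequality-based completion could look like, not the authors' exact methods. It assumes missing entries are stored as `None` in a symmetric list-of-lists matrix; the function name `complete_dissimilarities` and the choice of lower-bound, upper-bound, and midpoint estimates are illustrative assumptions.

```python
def complete_dissimilarities(D):
    """Fill missing entries (None) of a symmetric dissimilarity matrix.

    For a missing pair (i, j), every third object k with both D[i][k] and
    D[k][j] known yields triangle-inequality bounds:
        |D[i][k] - D[k][j]| <= D[i][j] <= D[i][k] + D[k][j]
    Returns three completed copies (hypothetical variants, not necessarily
    the paper's): tightest lower bound, tightest upper bound, and midpoint.
    """
    n = len(D)
    lower = [row[:] for row in D]
    upper = [row[:] for row in D]
    mid = [row[:] for row in D]
    for i in range(n):
        for j in range(i + 1, n):
            if D[i][j] is not None:
                continue  # entry is observed, nothing to estimate
            # Objects k whose dissimilarity to both i and j is known.
            pivots = [
                k for k in range(n)
                if k != i and k != j
                and D[i][k] is not None and D[k][j] is not None
            ]
            if not pivots:
                continue  # no usable third object; leave the entry missing
            lo = max(abs(D[i][k] - D[k][j]) for k in pivots)
            hi = min(D[i][k] + D[k][j] for k in pivots)
            for M, value in ((lower, lo), (upper, hi), (mid, (lo + hi) / 2)):
                M[i][j] = M[j][i] = value  # keep the matrix symmetric
    return lower, upper, mid


# Three objects; the dissimilarity between objects 0 and 2 is missing.
D = [
    [0, 1, None],
    [1, 0, 2],
    [None, 2, 0],
]
lower, upper, mid = complete_dissimilarities(D)
```

Any of the completed matrices could then be passed to a relational clustering algorithm. Entries with no usable third object stay missing in this sketch, so a caller would need a fallback for them.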

Original language: English
Pages (from-to): 273-280
Number of pages: 8
Journal: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 4390
State: Published - 2001

Keywords

  • C-means clustering
  • Clustering
  • Dissimilarity data
  • Incomplete data
  • Missing data
  • Pattern recognition
  • Relational data
