Abstract
Multiple imputation is a commonly used method for addressing missing values. Hot deck imputation differs from other approaches in that it replaces unobserved values with observed values from similar units, or cells, which helps keep the estimated variance of the regression coefficients close to the true variance. These cells are formed by grouping observations that are close to one another under a chosen distance measure. However, most distance measures apply only to continuous variables, which poses a problem when the dataset contains categorical covariates. We propose a model-based clustering procedure that uses a parsimonious covariance structure for the latent variable, which follows a mixture of Gaussian distributions, to generate imputation cells for mixed-type datasets (i.e., datasets with both continuous and categorical variables). Results on simulated data showed lower variance in the estimated regression coefficients than the complete-case analysis.
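To illustrate the hot deck step described in the abstract, here is a minimal Python sketch. It assumes the imputation cells are already available as a list of labels, for example from a clustering step; it does not implement the model-based clustering procedure itself. The function name `hot_deck_impute`, its parameters, and the toy data are hypothetical and serve only as an illustration.

```python
import random
from collections import defaultdict


def hot_deck_impute(values, cells, seed=0):
    """Fill missing entries (None) with a randomly drawn observed value
    from the same imputation cell. Illustrative sketch only."""
    rng = random.Random(seed)

    # Collect the observed (donor) values available in each cell.
    donors = defaultdict(list)
    for value, cell in zip(values, cells):
        if value is not None:
            donors[cell].append(value)

    imputed = []
    for value, cell in zip(values, cells):
        if value is None:
            if not donors[cell]:
                raise ValueError(f"cell {cell!r} has no observed donors")
            # Replace the missing value with a donor from the same cell.
            value = rng.choice(donors[cell])
        imputed.append(value)
    return imputed


# Toy data: two cells (labels assumed given), one missing value in each.
values = [1.2, None, 0.9, 5.1, None, 4.8]
cells = [0, 0, 0, 1, 1, 1]
completed = hot_deck_impute(values, cells)
```

Calling the function several times with different `seed` values would give several completed datasets, which is the general idea behind combining hot deck draws with multiple imputation.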
Original language | American English |
---|---|
State | Published - Mar 13 2017 |
Event | Eastern North American Region International Biometric Society Spring Meeting (ENAR) - Duration: Mar 25 2018 → … |
Keywords
- Hot Deck
- Model Based Clustering
- Typed Datasets
DC Disciplines
- Biostatistics
- Public Health