Go to content

Imputation of incomplete ordinal and nominal data by predictive mean matching

Share

Multivariate imputation using chained equations is a popular algorithm for imputing missing data that entails specifying multivariable models through conditional distributions. Two standard imputation methods for imputing missing continuous variables are parametric imputation using a linear model and predictive mean matching. The default methods for imputing missing categorical variables are parametric imputation using multinomial logistic regression and ordinal logistic regression for imputing nominal and ordinal categorical variables, respectively. There is a paucity of research into the relative computational burden and the quality of statistical inferences when using predictive mean matching versus parametric imputation for imputing missing non-binary categorical variables. We used simulations to compare the performance of predictive mean matching with that of multinomial logistic regression and ordinal logistic regression for imputing categorical variables when the analysis model of scientific interest was a logistic or linear regression model. We varied the sample size (N = 500, 1000, 2500, and 5000), the rate of missing data (5%–50% in increments of 5%), and the number of levels of the categorical variable (3, 4, 5, and 6). In general, the performance of predictive mean matching compared very favorably to that of multinomial or ordinal logistic regression for imputing categorical variables when the analysis model was a logistic or linear regression model. This was true across a range of scenarios defined by sample size and the rate of missing data. Furthermore, the use of predictive mean matching was substantially faster, by a factor of 2–6. In conclusion, predictive mean matching can be used to impute categorical variables. The use of predictive mean matching to impute missing non-binary categorical variables substantially reduces computer processing time when conducting multiple imputation.

Information

Citation

Austin PC, van Buuren S. Stat Methods Med Res. 2025; Aug 17 [Epub ahead of print].

View Source

Contributing ICES Scientists

Research Programs

Associated Topics

Associated Sites