Estimating linear regression models in the presence of a censored independent variable

Austin PC, Hoch JS. Stat Med. 2004; 23(3):411-29.

The current study examined the impact of a censored independent variable, after adjusting for a second independent variable, when estimating regression coefficients using "naïve" ordinary least squares (OLS), "partial" OLS and full-likelihood models. We used Monte Carlo simulations to determine the bias associated with all three regression methods. We demonstrated that substantial bias was introduced in the estimation of the regression coefficient associated with the variable subject to a ceiling effect when naïve OLS regression was used. Furthermore, minor bias was transmitted to the estimation of the regression coefficient associated with the second independent variable. High correlation between the two independent variables improved estimation of the censored variable's coefficient at the expense of estimation of the other coefficient. "Partial" OLS and maximum-likelihood estimation were shown to result in, at most, negligible bias in estimation. Furthermore, we demonstrated that the full-likelihood method was robust under mis-specification of the joint distribution of the independent random variables. Lastly, we provided an empirical example using National Population Health Survey (NPHS) data to demonstrate the practical implications of our main findings and the simple methods available to circumvent the bias identified in the Monte Carlo simulations. Our results suggest that researchers need to be aware of the bias associated with the use of naïve ordinary least-squares estimation when estimating regression models in which at least one independent variable is subject to a ceiling effect.
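
The kind of Monte Carlo comparison described above can be illustrated with a minimal sketch. It is not the authors' simulation design: the function name `simulate_naive_ols_bias`, the bivariate normal predictors, the ceiling value, the true coefficients, the sample size and the number of replications are all assumptions chosen for illustration. The sketch covers only naïve OLS (the ceiling-censored value is used as if it were the true value) and a benchmark fit on the uncensored variable; "partial" OLS and the full-likelihood model are not implemented here.

```python
import numpy as np


def simulate_naive_ols_bias(n=1000, reps=200, rho=0.5, ceiling=1.0,
                            beta=(1.0, 2.0, 3.0), sigma=1.0, seed=0):
    """Estimate the bias of naive OLS when x1 is subject to a ceiling.

    All defaults are illustrative assumptions, not values from the paper.
    Returns (naive_bias, benchmark_bias), each an array ordered as
    (intercept, coefficient on x1, coefficient on x2).
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])  # correlation between x1 and x2
    b0, b1, b2 = beta
    est_naive = np.empty((reps, 3))
    est_bench = np.empty((reps, 3))

    for r in range(reps):
        # Draw two correlated predictors and the outcome from the true model.
        x = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        x1, x2 = x[:, 0], x[:, 1]
        y = b0 + b1 * x1 + b2 * x2 + rng.normal(0.0, sigma, size=n)

        # Ceiling effect: values of x1 above the ceiling are recorded as the ceiling.
        x1_obs = np.minimum(x1, ceiling)

        # Naive OLS treats the censored value as if it were the true value.
        X_naive = np.column_stack([np.ones(n), x1_obs, x2])
        # Benchmark OLS uses the uncensored x1 (unavailable in practice).
        X_bench = np.column_stack([np.ones(n), x1, x2])

        est_naive[r] = np.linalg.lstsq(X_naive, y, rcond=None)[0]
        est_bench[r] = np.linalg.lstsq(X_bench, y, rcond=None)[0]

    truth = np.array(beta)
    return est_naive.mean(axis=0) - truth, est_bench.mean(axis=0) - truth


if __name__ == "__main__":
    naive_bias, bench_bias = simulate_naive_ols_bias()
    print("naive OLS bias (intercept, x1, x2):    ", naive_bias)
    print("benchmark OLS bias (intercept, x1, x2):", bench_bias)
```

Varying `rho` and `ceiling` in this sketch gives a rough way to explore how the correlation between the predictors and the severity of censoring affect the two slope estimates under these assumed settings.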

Keywords: Research and statistical methods