A comparison of variance estimators for logistic regression models estimated using generalized estimating equations (GEE) in the context of observational health services research
Austin PC. Stat Med. 2024; Oct 31 [Epub ahead of print].
Objective — To compare the predictive accuracy of regression trees with that of logistic regression models for predicting in-hospital mortality in patients hospitalized with heart failure.
Study Design and Setting — Models were developed in 8,236 patients hospitalized with heart failure between April 1999 and March 2001. Models included the Enhanced Feedback for Effective Cardiac Treatment and Acute Decompensated Heart Failure National Registry (ADHERE) regression models and tree. Predictive accuracy was assessed using 7,608 patients hospitalized between April 2004 and March 2005.
Results — The area under the receiver operating characteristic curve for five different logistic regression models ranged from 0.747 to 0.775, whereas the corresponding values for three different regression trees ranged from 0.620 to 0.651. For the regression trees grown in 1,000 random samples drawn from the derivation sample, the number of terminal nodes ranged from 1 to 6, whereas the number of variables used in specific trees ranged from 0 to 5. Three different variables (blood urea nitrogen, dementia, and systolic blood pressure) were used for defining the first binary split when growing regression trees.
Conclusion — Logistic regression predicted in-hospital mortality in patients hospitalized with heart failure more accurately than did the regression trees. Regression trees grown in random samples from the same data set can differ substantially from one another.
Austin PC, Tu JV, Lee DS. J Clin Epidemiol. 2010; 63(10):1145-55. Epub 2010 Mar 21.
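The area under the receiver operating characteristic curve reported in the Results can be read as the probability that a randomly chosen patient who died received a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The following minimal sketch illustrates that calculation on made-up numbers. The function name `roc_auc` and all data values are hypothetical and are not taken from the study. It also shows why a tree with few terminal nodes produces many tied predictions, and why a tree with a single terminal node cannot discriminate at all.

```python
from itertools import product


def roc_auc(labels, scores):
    """Area under the ROC curve computed by pairwise comparison.

    Every (event, non-event) pair scores 1 if the event has the higher
    predicted risk, 0.5 if the two risks are tied, and 0 otherwise.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0
        elif p == n:
            wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy outcomes: 1 = died in hospital, 0 = survived.
labels = [1, 1, 1, 0, 0, 0, 0, 0]

# Continuous predicted risks, as a logistic regression model would produce.
smooth_scores = [0.40, 0.25, 0.10, 0.30, 0.08, 0.05, 0.04, 0.02]

# A tree with two terminal nodes can emit only two distinct risks.
tree_scores = [0.30, 0.30, 0.05, 0.30, 0.05, 0.05, 0.05, 0.05]

# A tree with one terminal node assigns every patient the same risk.
stump_scores = [0.10] * len(labels)

for name, scores in [
    ("continuous risks", smooth_scores),
    ("two-node tree", tree_scores),
    ("one-node tree", stump_scores),
]:
    print(f"{name}: AUC = {roc_auc(labels, scores):.3f}")
```

The pairwise loop is quadratic in sample size, which is fine for illustration. A rank-based formula would be preferable for cohorts of the size described above.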