
A comparison of regression trees, logistic regression, generalized additive models, and multivariate adaptive regression splines for predicting AMI mortality

Austin P. A comparison of regression trees, logistic regression, generalized additive models, and multivariate adaptive regression splines for predicting AMI mortality. Stat Med. 2007; 26(15): 2937-2957.

Clinicians and health service researchers are frequently interested in predicting patient-specific probabilities of adverse events (e.g., death, disease recurrence, post-operative complications, and hospital readmission). There is increasing interest in using classification and regression trees (CART) to predict outcomes in clinical studies.

 

This study compared the predictive accuracy of logistic regression with that of regression trees for predicting mortality after hospitalization with an acute myocardial infarction (AMI). Investigators also examined the predictive ability of two other types of data-driven models: generalized additive models (GAMs) and multivariate adaptive regression splines (MARS). They used data on 9,484 patients admitted to hospital with an AMI in Ontario. Repeated split-sample validation was used: the data were randomly divided into a derivation sample and a validation sample. Each predictive model was estimated in the derivation sample, and its predictive accuracy was assessed in the validation sample using the area under the receiver operating characteristic (ROC) curve. This random division was repeated 1,000 times, and the predictive accuracy of each method was assessed on every repetition.
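The repeated split-sample procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's code: the data are synthetic, the model is a one-covariate logistic regression fitted by gradient descent, and the 50/50 split fraction, the function names, and the number of repetitions are all hypothetical choices made for the example.

```python
import math
import random


def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    case with the event scores higher than one without (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def fit_logistic(xs, ys, lr=0.5, epochs=200):
    """Fit a one-covariate logistic regression by batch gradient descent
    and return a function mapping x to a predicted probability."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y
            g1 += (p - y) * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return lambda x: 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def repeated_split_validation(xs, ys, fit, n_repeats=1000,
                              derivation_fraction=0.5, seed=0):
    """Randomly split the data n_repeats times, fit on the derivation
    sample, and return the mean AUC over the validation samples.
    The split fraction is an assumption, not taken from the study."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_deriv = int(len(idx) * derivation_fraction)
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(idx)
        deriv, valid = idx[:n_deriv], idx[n_deriv:]
        model = fit([xs[i] for i in deriv], [ys[i] for i in deriv])
        scores = [model(xs[i]) for i in valid]
        aucs.append(auc(scores, [ys[i] for i in valid]))
    return sum(aucs) / len(aucs)


def make_synthetic_data(n=400, seed=1):
    """Synthetic stand-in for patient data: one covariate, binary outcome."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [1 if rng.random() < 1.0 / (1.0 + math.exp(1.0 - 1.5 * x)) else 0
          for x in xs]
    return xs, ys


xs, ys = make_synthetic_data()
mean_auc = repeated_split_validation(xs, ys, fit_logistic, n_repeats=50)
print(f"mean validation AUC over 50 splits: {mean_auc:.3f}")
```

Any other modelling method can be compared on the same splits by passing a different `fit` function, which is the sense in which the study assessed each method on every repetition.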

 

The mean ROC curve area for the regression tree models in the 1,000 derivation samples was 0.762, while the mean ROC curve area of a simple logistic regression model was 0.845.  The mean ROC curve areas for the other methods ranged from a low of 0.831 to a high of 0.851.

 

This study shows that regression trees do not perform as well as logistic regression for predicting mortality following AMI.  However, the logistic regression model had performance comparable to that of more flexible, data-driven models such as GAMs and MARS.



© ICES 2010