Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models

We conducted an extensive set of empirical analyses to examine the effect of the number of events per variable (EPV) on the relative performance of three methods for assessing the predictive accuracy of a logistic regression model: apparent performance in the analysis sample, split-sample validation, and optimism correction using bootstrap methods. Using a single dataset of patients hospitalized with heart failure, we compared the estimates of discriminatory performance from these methods with those from a very large independent validation sample drawn from the same population. As anticipated, the apparent performance was optimistically biased, with the degree of optimism diminishing as the number of events per variable increased. Differences between the bootstrap-corrected approach and the use of an independent validation sample were minimal once the number of events per variable was at least 20. Split-sample assessment resulted in overly pessimistic and highly uncertain estimates of model performance. Apparent performance estimates had lower mean squared error than split-sample estimates, but the lowest mean squared error was obtained with the bootstrap optimism-corrected estimates. For bias, variance, and mean squared error of the performance estimates, the penalty incurred by using split-sample validation was equivalent to reducing the sample size by the proportion of the sample that was withheld for model validation. In conclusion, split-sample validation is inefficient and apparent performance is too optimistic for internal validation of regression-based prediction models. Modern validation methods, such as bootstrap-based optimism correction, are preferable. While these findings may be unsurprising to many statisticians, the results of the current study reinforce what should be considered good statistical practice in the development and validation of clinical prediction models.
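The abstract names the validation strategies without spelling out how bootstrap optimism correction works. The following is a minimal sketch of one common form of the procedure, applied to the c-statistic, and is not the authors' code. All function names (`events_per_variable`, `fit_logistic`, `c_statistic`, `optimism_corrected_c`, `simulate`) are hypothetical, the data are simulated rather than the heart failure data, and the model is fitted by plain gradient ascent purely to keep the example free of dependencies; a real analysis would use a standard maximum-likelihood routine.

```python
import math
import random


def events_per_variable(y, n_predictors):
    """EPV: number of events (y == 1) divided by the number of candidate predictors."""
    return sum(y) / n_predictors


def predict(w, X):
    """Linear predictor for each row; w[0] is the intercept."""
    return [w[0] + sum(wj * xj for wj, xj in zip(w[1:], row)) for row in X]


def fit_logistic(X, y, iters=200, lr=0.5):
    """Approximate logistic fit by gradient ascent (illustrative assumption only)."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(iters):
        grad = [0.0] * (p + 1)
        for row, z, yi in zip(X, predict(w, X), y):
            z = max(-30.0, min(30.0, z))  # clip to avoid overflow in exp
            err = yi - 1.0 / (1.0 + math.exp(-z))
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w


def c_statistic(scores, y):
    """Proportion of event/non-event pairs ranked correctly; ties count one half."""
    pos = [s for s, yi in zip(scores, y) if yi == 1]
    neg = [s for s, yi in zip(scores, y) if yi == 0]
    if not pos or not neg:
        return float("nan")
    concordant = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return concordant / (len(pos) * len(neg))


def optimism_corrected_c(X, y, n_boot=25, seed=0):
    """Return (apparent c, optimism-corrected c) for a model fitted to (X, y)."""
    rng = random.Random(seed)
    n = len(y)
    apparent = c_statistic(predict(fit_logistic(X, y), X), y)
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:  # skip resamples with only one outcome class
            continue
        wb = fit_logistic(Xb, yb)
        # Optimism: performance in the bootstrap sample minus performance of
        # the same bootstrap model when applied to the original sample.
        optimism.append(c_statistic(predict(wb, Xb), yb) - c_statistic(predict(wb, X), y))
    return apparent, apparent - sum(optimism) / n_boot


def simulate(n, p, seed):
    """Simulated data (hypothetical): only the first predictor affects the outcome."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [int(rng.random() < 1.0 / (1.0 + math.exp(1.0 - 0.8 * row[0]))) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = simulate(n=100, p=4, seed=42)
    apparent, corrected = optimism_corrected_c(X, y)
    print("EPV:", events_per_variable(y, n_predictors=4))
    print("apparent c:", round(apparent, 3), "corrected c:", round(corrected, 3))
```

Note that the whole model-fitting step is repeated in every bootstrap resample and that all observations contribute to both fitting and evaluation. A split-sample design instead fits the model on only part of the data, which is the source of the inefficiency described above.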

Citation

Austin PC, Steyerberg EW. Stat Methods Med Res. 2017; 26(2):796-808. Epub 2014 Nov 19.

Discover More

Journal Article

29/04/2024

The performance of marginal structural models for estimating risk differences and relative risks using weighted univariate generalized linear models

Austin PC. Stat Methods Med Res. 2024; Apr 24 [Epub ahead of print].

Journal Article

22/03/2024

Validation of case-ascertainment algorithms using health administrative data to identify people who inject drugs in Ontario, Canada

Greenwald ZR, Werb D, Feld JJ, Austin PC, Fridman D, Bayoumi AM, Gomes T, Kendall CE, Lapointe-Shaw L, Scheim AI, Bartlett SR, Benchimol EI, Bouck Z, Boucher LM, Greenaway C, Janjua NZ, Leece P, Wong WWL, Sander B, Kwong JC. J Clin Epidemiol. 2024; Mar 22 [Epub ahead of print].

Journal Article

21/03/2024

Development of the multivariate administrative data cystectomy model and its impact on misclassification bias

Ross J, Lavallee LT, Hickling D, van Walraven C. BMC Med Res Methodol. 2024; 24(1):73. Epub 2024 Mar 21.
