The failure of four bootstrap procedures for estimating confidence intervals for predicted-to-expected ratios for hospital profiling

Background — Healthcare provider profiling involves the comparison of outcomes between patients cared for by different healthcare providers. An important component of provider profiling is risk-adjustment, so that providers that care for sicker patients are not unfairly penalized. One method for provider profiling entails using random effects logistic regression models to compute provider-specific predicted-to-expected ratios. These ratios compare the predicted number of deaths at a given provider, conditional on the case-mix of its patients, with the expected number of deaths had those patients been treated at an average provider. Despite the utility of this metric in provider profiling, methods for estimating confidence intervals for these ratios have not been described. The objective of the current study was to evaluate the performance of four bootstrap procedures for estimating 95% confidence intervals for predicted-to-expected ratios.
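To make the ratio concrete, the following is a minimal sketch of how a predicted-to-expected ratio could be computed for one provider once a random-intercept logistic model has been fitted. It assumes the fixed-effect coefficients and the provider's estimated random intercept are already available, and that an "average provider" corresponds to a random effect of zero. The function and argument names (`predicted_to_expected`, `X`, `beta`, `cluster_effect`) are illustrative and do not come from the paper.

```python
import numpy as np

def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + np.exp(-z))

def predicted_to_expected(X, beta, cluster_effect):
    """Predicted-to-expected ratio for a single provider (illustrative sketch).

    X              : (n_patients, n_covariates) design matrix for the provider's
                     patients, including an intercept column (assumption)
    beta           : fixed-effect coefficients from the fitted model
    cluster_effect : the provider's estimated random intercept
    """
    linear_predictor = X @ beta
    # Predicted deaths: patient risks at this provider (random effect included)
    predicted = expit(linear_predictor + cluster_effect).sum()
    # Expected deaths: same patients at an average provider (random effect = 0)
    expected = expit(linear_predictor).sum()
    return predicted / expected

# Hypothetical inputs: three patients, an intercept and one covariate
X = np.array([[1.0, 0.2], [1.0, -0.5], [1.0, 1.0]])
beta = np.array([-2.0, 0.5])
ratio = predicted_to_expected(X, beta, cluster_effect=0.4)
```

Under these assumptions, a ratio above 1 indicates more predicted deaths than would be expected at an average provider with the same case-mix, and a ratio below 1 indicates fewer.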

Methods — We used Monte Carlo simulations to evaluate four bootstrap procedures: the naïve bootstrap, a within-cluster bootstrap, the parametric multilevel bootstrap, and a novel cluster-specific parametric bootstrap. The parameters of the data-generating process were informed by empirical analyses of patients hospitalized with acute myocardial infarction. Three factors were varied in the simulations: the number of subjects per cluster, the intraclass correlation coefficient for the binary outcome, and the prevalence of the outcome. We examined coverage rates of both normal-theory bootstrap confidence intervals and bootstrap percentile intervals.
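The two interval types and the coverage calculation can be sketched as follows. This is a generic illustration under stated assumptions, not the study's code: it takes a vector of bootstrap replicates of a provider's ratio as given (however they were generated), and the helper names `normal_theory_ci`, `percentile_ci`, and `empirical_coverage` are hypothetical.

```python
import numpy as np

def normal_theory_ci(estimate, boot_reps, z=1.959964):
    """Normal-theory bootstrap interval: estimate +/- z * bootstrap SE."""
    se = np.std(boot_reps, ddof=1)  # SD of the bootstrap replicates
    return estimate - z * se, estimate + z * se

def percentile_ci(boot_reps, level=0.95):
    """Bootstrap percentile interval from the empirical quantiles."""
    alpha = 1.0 - level
    lo, hi = np.percentile(boot_reps, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lo, hi

def empirical_coverage(intervals, true_value):
    """Share of simulated intervals (rows of [lower, upper]) containing the truth."""
    intervals = np.asarray(intervals, dtype=float)
    covered = (intervals[:, 0] <= true_value) & (true_value <= intervals[:, 1])
    return covered.mean()
```

In a Monte Carlo evaluation of this kind, one interval would be computed per simulated dataset, and `empirical_coverage` would then be compared with the nominal 95% rate; coverage well below 0.95 signals intervals that are too narrow or miscentred.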

Results — In general, all four bootstrap procedures resulted in inaccurate estimates of the standard error of cluster-specific predicted-to-expected ratios. Similarly, all four bootstrap procedures resulted in 95% confidence intervals whose empirical coverage rates were different from the advertised rate. In many scenarios the empirical coverage rates were substantially lower than the advertised rate.

Conclusion — Existing bootstrap procedures should not be used to compute confidence intervals for predicted-to-expected ratios when conducting provider profiling.

Citation

Austin PC. BMC Med Res Methodol. 2022; 22(1):271. Epub 2022 Oct 14.
