The impact of two data-generating processes for competing risk data on the discrimination and calibration of two types of competing risk regression models

Monte Carlo simulations are an important tool in modern statistical research. The data-generating process is foundational to any simulation. In survival analysis, a competing risk is an event whose occurrence precludes the occurrence of the primary event of interest. Two data-generating processes have been described for simulating competing risk data: one based on all the cause-specific hazard functions for the different types of events, and one based on a subdistribution hazard model for the primary event of interest. There is a paucity of research on the impact of the choice of data-generating process. We used a series of Monte Carlo simulations to evaluate the impact of the choice of data-generating process on the performance of prediction models when assessing discrimination using the time-dependent AUC and accuracy using the time-dependent Brier score. We also assessed the impact of the choice of competing risk regression used for computing smoothed event probabilities for use when computing the calibration metrics ICI (integrated calibration index), E50, and E90. The impact of discordance between the fitted model and the data-generating process on both the time-dependent AUC and the time-dependent Brier score was minimal. When computing the ICI, E50, and E90, we recommend that researchers use a model for computing smoothed event probabilities that is concordant with the type of model whose calibration is being assessed.
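As a rough illustration of the first data-generating process described above (the one built from all the cause-specific hazards), the sketch below simulates two competing events with constant cause-specific hazards and then computes the three calibration summaries. It is not the authors' simulation code. The function names, the single standard-normal covariate, the baseline hazards, and the coefficients are all hypothetical choices made for this example. The summaries are taken as the mean (ICI), median (E50), and 90th percentile (E90) of the absolute difference between predicted and smoothed observed probabilities; the smoothed probabilities are assumed to be supplied by a separately fitted competing risk model, which is not shown.

```python
import math
import random
import statistics


def simulate_cause_specific(n, base_hazards=(0.10, 0.05), betas=(0.5, -0.3), seed=0):
    """Simulate (time, cause, x) from two constant cause-specific hazards.

    All default parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # one standard-normal covariate (assumed)
        # Cause-specific hazards in proportional-hazards form.
        h1 = base_hazards[0] * math.exp(betas[0] * x)
        h2 = base_hazards[1] * math.exp(betas[1] * x)
        # Time to the first event of either type: exponential with the
        # all-cause hazard h1 + h2.
        t = rng.expovariate(h1 + h2)
        # Event type: cause 1 with probability h1 / (h1 + h2), else cause 2.
        cause = 1 if rng.random() < h1 / (h1 + h2) else 2
        rows.append((t, cause, x))
    return rows


def calibration_summaries(predicted, smoothed_observed):
    """Return (ICI, E50, E90) from paired probabilities.

    The 90th percentile uses the 'inclusive' interpolation rule of
    statistics.quantiles; other percentile conventions differ slightly.
    """
    errors = [abs(p - o) for p, o in zip(predicted, smoothed_observed)]
    ici = statistics.fmean(errors)
    e50 = statistics.median(errors)
    e90 = statistics.quantiles(errors, n=10, method="inclusive")[8]
    return ici, e50, e90


if __name__ == "__main__":
    data = simulate_cause_specific(1000, seed=1)
    share_cause_1 = sum(c == 1 for _, c, _ in data) / len(data)
    print(f"share of events that are cause 1: {share_cause_1:.3f}")
    # Toy probabilities only, to show the call; not results from the paper.
    print(calibration_summaries([0.1, 0.2, 0.3, 0.4], [0.1, 0.25, 0.2, 0.6]))
```

Constant hazards keep the example short: the event time can be drawn directly from an exponential distribution and the event type from a single Bernoulli draw. Time-varying hazards or a subdistribution hazard data-generating process would need a different sampling scheme, and censoring is omitted here.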

Information

Citation

Austin PC, Putter H. Stat Med. 2026; 45(6-7):e70468.
