Objectives — To determine whether studies in the urology literature that used propensity score (PS) methods provide sufficient detail to allow scientific reproducibility, and whether they used appropriate statistical tests to obtain valid measures of effect.
Materials and Methods — We searched OVID Medline and the Science Citation Index from inception to November 2016 to identify studies that used PS methods in five general urology journals. From each included article, we extracted pertinent information related to the PS methodology, such as estimation of the PS, whether balance diagnostics were performed, and the statistical analysis performed.
Results — We identified 114 articles for inclusion. Matching on the PS was the most common method (62 studies, 54.4%). Of all studies, 103 (90.4%) described which covariates were used to estimate the PS; however, only 24 provided justification for the selected covariates. Although most studies (70.2%) performed some form of diagnostic evaluation to assess balance, few (24.6%) used appropriate methods for balance assessment. Only four (6.4%) of the studies that used PS matching provided sufficient detail to replicate the matching strategy. Finally, most (77.4%) of the studies that used PS matching explicitly used inappropriate statistical methods to estimate the effect of an exposure on an outcome.
Conclusions — In the urology literature, PS methods were poorly described and implemented. We provide recommendations for improving their use so that studies are reproducible and yield valid measures of effect.