
Comparing the high-dimensional propensity score for use with administrative data with propensity scores derived from high-quality clinical data

Austin PC, Wu CF, Lee DS, Tu JV. Stat Methods Med Res. 2019; Apr 11 [Epub ahead of print]. DOI: 10.1177/0962280219842362.


Administrative healthcare databases are increasingly being used for research purposes. When used to estimate the effects of treatments and interventions, an important limitation of these databases is the lack of information on important confounding variables. The high-dimensional propensity score (hdPS) is an algorithm that generates a large number of empirically-derived covariates using administrative healthcare databases. The hdPS has been described as enabling adjustment by proxy, in which a large number of empirically-derived covariates may serve as proxies for unmeasured confounding variables. We examined the validity of this assumption using samples of patients hospitalized with acute myocardial infarction (AMI) and congestive heart failure (CHF), for whom both administrative data and detailed clinical data were available. We considered three treatments in AMI patients: angiotensin-converting enzyme inhibitors, beta-blockers, and statins, while the first two treatments were also considered in CHF patients. We considered three propensity scores: (a) one derived using detailed clinical data; (b) the hdPS derived from administrative data; and (c) one derived from administrative data using expert opinion. Using each propensity score, we estimated inverse probability of treatment (IPT) weights. For each sample and treatment combination, and for each of the two propensity scores derived using administrative data, there were clinical variables not measured in administrative data that remained imbalanced after incorporating the IPT weights. However, the propensity score derived using clinical data always resulted in all clinical variables being balanced. When estimating hazard ratios, for some samples and treatment combinations, the hazard ratios estimated using the hdPS were more similar to those obtained using the clinical propensity score than were those obtained using the expert-derived propensity score. However, for other combinations, the effects estimated using the expert-derived propensity score were more similar to those obtained using the clinical propensity score than were those derived using the hdPS.
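
The weighting and balance-checking steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say which form of IPT weights was used, so the sketch assumes weights of 1/e for treated and 1/(1 - e) for untreated subjects, where e is the propensity score. It also assumes balance is summarized by a weighted standardized difference, and the 0.1 cut-off is a commonly used convention rather than something stated in the abstract. The function names and the simulated data are hypothetical.

```python
import numpy as np


def ipt_weights(treated, propensity):
    """Assumed weight form: 1/e for treated, 1/(1 - e) for untreated subjects."""
    treated = np.asarray(treated, dtype=float)
    propensity = np.asarray(propensity, dtype=float)
    return treated / propensity + (1.0 - treated) / (1.0 - propensity)


def weighted_std_diff(x, treated, weights):
    """Weighted standardized difference of covariate x between treatment groups."""
    x = np.asarray(x, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    weights = np.asarray(weights, dtype=float)

    def mean_var(mask):
        # Weighted mean and weighted variance within one treatment group.
        w = weights[mask]
        m = np.average(x[mask], weights=w)
        v = np.average((x[mask] - m) ** 2, weights=w)
        return m, v

    m1, v1 = mean_var(treated)
    m0, v0 = mean_var(~treated)
    return (m1 - m0) / np.sqrt((v1 + v0) / 2.0)


# Simulated illustration (not study data): one confounder drives treatment.
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
true_ps = 1.0 / (1.0 + np.exp(-x))      # propensity score, known by construction
treated = rng.random(n) < true_ps

w = ipt_weights(treated, true_ps)
before = weighted_std_diff(x, treated, np.ones(n))  # unweighted comparison
after = weighted_std_diff(x, treated, w)            # after applying IPT weights

print(f"standardized difference before weighting: {before:.3f}")
print(f"standardized difference after weighting:  {after:.3f}")
```

In this simulation the propensity score includes the only confounder, so weighting should bring the standardized difference close to zero. A propensity score that omits a confounder, as can happen when a clinical variable is not measured in administrative data, gives no such guarantee for that variable.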
