
Do diagnostic and procedure codes within population-based, administrative datasets accurately identify patients with rectal cancer?


Background — Procedural and diagnostic codes may inaccurately identify specific patient populations within administrative datasets.

Purpose — To measure the accuracy of previously used coding algorithms that rely on administrative data to identify patients who underwent rectal cancer resection (RCR).

Methods — Using a previously published coding algorithm, we re-created an RCR cohort within administrative databases, limiting the search to a single institution. The accuracy of this cohort was determined against a gold standard reference population. A systematic review of the literature was then performed to identify studies that used similar coding methods to identify RCR cohorts and to determine whether they reported on accuracy.

Results — Over the study period, there were 664,075 hospitalizations at our institution. Previously used coding algorithms identified 1131 RCRs (administrative data incidence 1.70 per 1000 hospitalizations). The gold standard reference population comprised 821 RCRs over the same period (1.24 per 1000 hospitalizations). Administrative data methods yielded an RCR cohort of moderate accuracy (sensitivity 89.5%, specificity 99.9%) and poor positive predictive value (64.9%). The literature search identified 18 studies that used similar coding methods to derive an RCR cohort. Only 1 of 18 (5.6%) reported on the accuracy of its study cohort.
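The relationship between these figures can be illustrated with a 2x2 table. The sketch below is a minimal illustration, not the authors' analysis code: the function and variable names are hypothetical, the hospitalization is assumed to be the unit of analysis, and the true-positive count is not reported in the abstract, so it is back-calculated here from the stated sensitivity. Because of that rounding, the results land close to, but may not exactly match, the published values.

```python
from dataclasses import dataclass


@dataclass
class Accuracy:
    sensitivity: float
    specificity: float
    ppv: float


def accuracy_from_counts(total: int, flagged: int, reference: int, true_pos: int) -> Accuracy:
    """Derive accuracy measures from the margins of a 2x2 table.

    total     -- all hospitalizations in the study period
    flagged   -- hospitalizations the coding algorithm labeled as RCR
    reference -- hospitalizations in the gold standard RCR population
    true_pos  -- hospitalizations that appear in both groups
    """
    false_pos = flagged - true_pos      # flagged by codes, not in the gold standard
    false_neg = reference - true_pos    # in the gold standard, missed by codes
    true_neg = total - true_pos - false_pos - false_neg
    return Accuracy(
        sensitivity=true_pos / (true_pos + false_neg),
        specificity=true_neg / (true_neg + false_pos),
        ppv=true_pos / (true_pos + false_pos),
    )


# Counts reported in the abstract.
TOTAL = 664_075
FLAGGED = 1_131
REFERENCE = 821

# ASSUMPTION: the true-positive count is not given in the abstract;
# it is approximated from the reported sensitivity of 89.5%.
TRUE_POS = round(0.895 * REFERENCE)

result = accuracy_from_counts(TOTAL, FLAGGED, REFERENCE, TRUE_POS)

# Incidence per 1000 hospitalizations for each cohort definition.
admin_incidence = 1000 * FLAGGED / TOTAL
reference_incidence = 1000 * REFERENCE / TOTAL

print(f"sensitivity: {result.sensitivity:.3f}")
print(f"specificity: {result.specificity:.4f}")
print(f"PPV:         {result.ppv:.3f}")
print(f"incidence (administrative): {admin_incidence:.2f} per 1000")
print(f"incidence (gold standard):  {reference_incidence:.2f} per 1000")
```

The example shows why specificity can remain near 100% while positive predictive value is poor: true negatives number in the hundreds of thousands, so a few hundred false positives barely affect specificity, yet they make up roughly a third of the 1131 flagged cases.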

Conclusions — The use of diagnostic and procedure codes to identify RCRs within administrative datasets may be subject to misclassification bias because of low positive predictive value. This underscores the importance of reporting the accuracy of RCR cohorts derived from population-based datasets.



Musselman RP, Gomes T, Rothwell DM, Auer RC, Moloo H, Boushey RP, van Walraven C. J Gastrointest Surg. 2019; 23(2):367-76. Epub 2018 Dec 3.
