
Pilot study of the ability to probabilistically link clinical trial patients to administrative data and determine long-term outcomes


Background — Clinical trials are important but extremely costly. Utilization of routinely collected administrative data may simplify and enhance clinical trial data collection.

Purpose — The aim of this study was to test the feasibility of using administrative databases in Ontario, Canada, for long-term clinical trial follow-up, specifically to determine (a) whether limited patient identifiers held by the Canadian Cancer Trials Group can be used to probabilistically link trial participants with individuals in the Institute for Clinical Evaluative Sciences databases and, if so, (b) the level of concordance between the two data sets.

Methods — This retrospective study was conducted through collaboration between established health service (Institute for Clinical Evaluative Sciences) and clinical trial (Canadian Cancer Trials Group) research groups in the province of Ontario, Canada, where healthcare is predominantly funded by the government. Adults with pre-treated metastatic colorectal cancer previously enrolled in the Canadian Cancer Trials Group CO.17 and CO.20 randomized phase III trials were included, limited to those in Ontario. The main outcomes, both specified a priori, were the rate of successful probabilistic linkage and the concordance of survival data.
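
The abstract does not state which linkage algorithm was used. As a rough illustration of what probabilistic linkage on limited identifiers can look like, the following is a minimal sketch of Fellegi-Sunter style match weights. The field names (`birth_date`, `sex`, `initials`), the m- and u-probabilities, and the threshold are all hypothetical values chosen for the example, not values from the study.

```python
import math

# Hypothetical agreement probabilities per identifier (illustrative only).
# m = P(field agrees | the two records belong to the same person)
# u = P(field agrees | the two records belong to different people)
FIELD_PROBS = {
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
    "initials": (0.95, 0.02),
}


def match_weight(trial_rec, admin_rec, field_probs=FIELD_PROBS):
    """Sum the log2 agreement or disagreement weight of each field."""
    total = 0.0
    for field, (m, u) in field_probs.items():
        if trial_rec[field] == admin_rec[field]:
            total += math.log2(m / u)              # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


def best_link(trial_rec, candidates, threshold=8.0):
    """Return the top-scoring candidate if it meets the threshold, else None."""
    if not candidates:
        return None
    top = max(candidates, key=lambda c: match_weight(trial_rec, c))
    return top if match_weight(trial_rec, top) >= threshold else None


# Usage with made-up records
trial = {"birth_date": "1950-01-01", "sex": "M", "initials": "JD"}
admin_a = {"birth_date": "1950-01-01", "sex": "M", "initials": "JD"}
admin_b = {"birth_date": "1962-07-15", "sex": "M", "initials": "RK"}
linked = best_link(trial, [admin_a, admin_b])
```

In a sketch like this, a trial record with no candidate above the threshold stays unlinked, which is one way some participants can fail to link when only limited identifiers are available.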

Results — Probabilistic linkage was successful in 266/293 (90.8%) participants. Among patients for whom linkage was successful, the Canadian Cancer Trials Group (trial) and the Institute for Clinical Evaluative Sciences (administrative) data sets were concordant with regard to the occurrence of death during the period of clinical trial follow-up in 206/209 (98.6%). Death was recorded in the Institute for Clinical Evaluative Sciences data, but not the Canadian Cancer Trials Group data, for 57 cases in which the event occurred after the clinical trial cut-off dates. The recorded date of death matched closely between the two databases. During the period of clinical trial conduct, the administrative databases contained details of hospitalizations and emergency room visits that were not captured in the clinical trial electronic database.
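
To make the concordance comparison concrete, here is a minimal sketch of how death status might be compared within the trial follow-up window, so that a death recorded only in the administrative data after the trial cut-off does not count as a disagreement. The record layout (`trial_died`, `admin_death_date`, `cutoff`) and the sample records are hypothetical. Only the two reported fractions, 266/293 and 206/209, come from the abstract.

```python
from datetime import date


def death_status_concordant(trial_died, admin_death_date, cutoff):
    """Compare death status restricted to the trial follow-up period."""
    admin_died_in_followup = (
        admin_death_date is not None and admin_death_date <= cutoff
    )
    return trial_died == admin_died_in_followup


def concordance(records):
    """Return (agreeing, total, percentage rounded to one decimal place)."""
    agreeing = sum(
        death_status_concordant(r["trial_died"], r["admin_death_date"], r["cutoff"])
        for r in records
    )
    return agreeing, len(records), round(100 * agreeing / len(records), 1)


# Made-up linked records, for illustration only
cutoff = date(2008, 3, 1)
records = [
    # Death during follow-up, recorded in both sources: concordant
    {"trial_died": True, "admin_death_date": date(2007, 11, 2), "cutoff": cutoff},
    # Death after the cut-off, seen only in administrative data: still concordant
    {"trial_died": False, "admin_death_date": date(2009, 6, 9), "cutoff": cutoff},
    # Trial records a death that the administrative data lacks: discordant
    {"trial_died": True, "admin_death_date": None, "cutoff": cutoff},
]
agreeing, total, pct = concordance(records)

# Percentages recomputed from the fractions reported in the abstract
linkage_pct = round(100 * 266 / 293, 1)
concordance_pct = round(100 * 206 / 209, 1)
```

Restricting the comparison to the follow-up window is a design choice in this sketch: it separates genuine disagreements from deaths the trial could not have observed.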

Conclusion — Prospective use of administrative data could enhance clinical trial data collection, both for long-term follow-up and for resource utilization in economic analyses, and could do so less expensively than current primary data collection. Recording a unique identifier (e.g. health insurance number) in trial databases would allow deterministic linkage for all participants.



Hay AE, Pater JL, Corn E, Han L, Camacho X, O'Callaghan C, Chong N, Bell EN, Tu D, Earle CC. Clin Trials. 2019; 16(1):14-7. Epub 2018 Nov 22.
