
Validation of a type 1 diabetes algorithm using electronic medical records and administrative healthcare data to study the population incidence and prevalence of type 1 diabetes in Ontario, Canada

Weisman A, Tu K, Young J, Kumar M, Austin PC, Jaakkimainen L, Lipscombe L, Aronson R, Booth GL. BMJ Open Diabetes Res Care. 2020; 8(1):e001224. Epub 2020 Jun 21. DOI:

Introduction — We aimed to develop algorithms distinguishing type 1 diabetes (T1D) from type 2 diabetes in adults ≥18 years old using primary care electronic medical record (EMRPC) and administrative healthcare data from Ontario, Canada, and to estimate T1D prevalence and incidence.

Research Design and Methods — The reference population was a random sample of patients with diabetes in EMRPC whose charts were manually abstracted (n=5402). Algorithms were developed with classification trees, random forests, and rule-based methods, drawing on electronic medical record (EMR) data, administrative data, or both. Algorithm performance was assessed in EMRPC. Administrative data algorithms were additionally evaluated using a diabetes clinic registry with endocrinologist-assigned diabetes type (n=29 371). Three algorithms were applied to the Ontario population to produce minimum, moderate and maximum estimates of T1D prevalence and incidence rates between 2010 and 2017, and trends were analyzed using negative binomial regressions.

Results — Of 5402 individuals with diabetes in EMRPC, 195 had T1D. Sensitivity, specificity, positive predictive value and negative predictive value for the best performing algorithms were 80.6% (75.9–87.2), 99.8% (99.7–100), 94.9% (92.3–98.7), and 99.3% (99.1–99.5) for EMR; 51.3% (44.0–58.5), 99.5% (99.3–99.7), 79.4% (71.2–86.1), and 98.2% (97.8–98.5) for administrative data; and 87.2% (81.7–91.5), 99.9% (99.7–100), 96.6% (92.7–98.7) and 99.5% (99.3–99.7) for combined EMR and administrative data. Administrative data algorithms had similar sensitivity and specificity in the diabetes clinic registry. Of 11 499 711 adults in Ontario in 2017, there were 24 789 (0.22%, minimum estimate) to 102 140 (0.89%, maximum estimate) with T1D. Between 2010 and 2017, the age- and sex-standardized prevalence rates per 1000 person-years increased (minimum estimate 1.7 to 2.56, maximum estimate 7.48 to 9.86, p<0.0001). In contrast, incidence rates decreased (minimum estimate 0.1 to 0.04, maximum estimate 0.47 to 0.09, p<0.0001).
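
As an illustration of how the four validation measures relate to a 2×2 table of algorithm result against chart-abstracted diabetes type, the following is a minimal Python sketch. The cell counts are an assumption: the abstract does not report them, and they were chosen only because they are consistent with the reported totals (5402 individuals, 195 with T1D) and reproduce the point estimates quoted for the combined EMR and administrative data algorithm. The function names are hypothetical. The prevalence check uses the 2017 counts given above.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV as fractions."""
    return {
        "sensitivity": tp / (tp + fn),  # T1D cases correctly flagged
        "specificity": tn / (tn + fp),  # non-T1D cases correctly not flagged
        "ppv": tp / (tp + fp),          # flagged individuals who truly have T1D
        "npv": tn / (tn + fn),          # unflagged individuals who truly lack T1D
    }


def prevalence_percent(cases, population):
    """Crude prevalence expressed as a percentage of the population."""
    return 100 * cases / population


# ASSUMED counts, not reported in the abstract: picked to match
# n = 5402 with 195 T1D cases and the reported combined-data estimates.
TP, FP, FN, TN = 170, 6, 25, 5201

combined = diagnostic_metrics(TP, FP, FN, TN)
for name, value in combined.items():
    print(f"{name}: {100 * value:.1f}%")

# Counts taken from the abstract: adults in Ontario in 2017.
ONTARIO_ADULTS_2017 = 11_499_711
print(f"minimum: {prevalence_percent(24_789, ONTARIO_ADULTS_2017):.2f}%")
print(f"maximum: {prevalence_percent(102_140, ONTARIO_ADULTS_2017):.2f}%")
```

Because T1D makes up only a small share of the reference population (195 of 5402), specificity and negative predictive value stay close to 100% even when sensitivity is modest, which is why sensitivity and positive predictive value are the more informative figures when comparing the three data sources.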

Conclusions — Primary care EMR and administrative data algorithms performed well in identifying T1D and demonstrated increasing T1D prevalence in Ontario. These algorithms may permit the development of large, population-based cohort studies of T1D.
