Using big data for cardiovascular health surveillance: insights from 10.3 million individuals in the CANHEART cohort

Chu A, Hennessy DA, Johnston S, Udell JA, Lee DS, Jia J, Tu JV, Ko DT. Can J Cardiol. 2022; Jun 12 [Epub ahead of print]. DOI:

Background — The increasing availability of large electronic population-based databases offers unique opportunities to conduct cardiovascular health surveillance traditionally done using surveys. We aimed to examine cardiovascular risk factor burden, preventive care, and disease incidence among adults in Ontario, Canada using routinely collected data, and compare estimates with health survey data.

Methods — In the CArdiovascular HEalth in Ambulatory Care Research Team (CANHEART) initiative, multiple health administrative databases were linked to create a population-based cohort of 10.3 million adults without a history of cardiovascular disease. We examined cardiovascular risk factor burden, screening, and outcomes between 2016 and 2020. Risk factor burden was also compared with cycles 3-5 (2012-17) of the Canadian Health Measures Survey (CHMS), which included 9,473 participants across Canada.

Results — The mean age of our study cohort was 47.9±17.0 years and 52.0% were women. Lipid and diabetes assessment rates among individuals 40-79 years were 76.6% and 78.2%, respectively, and were lowest among men 40-49 years. Total cholesterol levels and diabetes and hypertension rates among men and women 20-79 years were similar to CHMS findings (total cholesterol: 4.80/4.98 versus 4.94/5.25 mmol/L; diabetes: 8.2%/7.1% versus 8.1%/6.0%; hypertension: 21.4%/21.6% versus 23.9%/23.1%, respectively); however, CANHEART individuals had slightly higher mean glucose (men: 5.79 versus 5.44; women: 5.39 versus 5.09 mmol/L) and systolic blood pressures (men: 126.2 versus 118.3; women: 120.6 versus 115.7 mmHg).

Conclusions — Cardiovascular health surveillance is possible through linkage of routinely collected electronic population-based datasets. However, further investigation is needed to understand differences between health administrative and survey measures cross-sectionally and over time.