Development and validation of a machine learning algorithm predicting emergency department use and unplanned hospitalization in patients with head and neck cancer


Importance — Patient-reported symptom burden was recently found to be associated with emergency department use and unplanned hospitalization (ED/Hosp) in patients with head and neck cancer. It was hypothesized that symptom scores could be combined with administrative health data to accurately risk stratify patients.

Objective — To develop and validate a machine learning approach to predict future ED/Hosp in patients with head and neck cancer.

Design, Setting, and Participants — This was a population-based predictive modeling study of patients in Ontario, Canada, diagnosed with head and neck cancer from January 2007 through March 2018. All outpatient clinical encounters were identified. Edmonton Symptom Assessment System (ESAS) scores and clinical and demographic factors were abstracted. Training and test cohorts were randomly generated in a 4:1 ratio. Various machine learning algorithms were explored, including (1) logistic regression using a least absolute shrinkage and selection operator, (2) random forest, (3) gradient boosting machine, (4) k-nearest neighbors, and (5) an artificial neural network. Data analysis was performed from September 2021 to January 2022.
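The 4:1 split described above can be sketched as follows. The patient counts reported in Results (9409 and 2352) are consistent with whole patients, rather than individual assessments, being assigned to a cohort; that reading is an assumption. The function name, data layout, and seed are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def split_by_patient(assessments, test_fraction=0.2, seed=0):
    """Randomly assign whole patients to a training or test cohort.

    `assessments` is a list of (patient_id, record) pairs. Keeping all of a
    patient's assessments in one cohort avoids the same patient appearing
    in both training and test data.
    """
    by_patient = defaultdict(list)
    for patient_id, record in assessments:
        by_patient[patient_id].append(record)

    # Shuffle the patient IDs reproducibly, then cut at the test fraction.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)
    n_test = round(len(patient_ids) * test_fraction)

    test_ids = patient_ids[:n_test]
    train_ids = patient_ids[n_test:]

    train = [(pid, r) for pid in train_ids for r in by_patient[pid]]
    test = [(pid, r) for pid in test_ids for r in by_patient[pid]]
    return train, test


# Toy data (illustrative only): 10 patients with 3 assessments each.
toy = [(pid, {"visit": v}) for pid in range(10) for v in range(3)]
train, test = split_by_patient(toy)
train_patients = {pid for pid, _ in train}
test_patients = {pid for pid, _ in test}
```

With 10 toy patients and a test fraction of 0.2, eight patients go to training and two to test, and no patient appears in both cohorts.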

Main Outcomes and Measures — The main outcome was any ED/Hosp event within 14 days following symptom assessment. The performance of each model was assessed on the test cohort using the area under the receiver operating characteristic (AUROC) curve and calibration plots. Shapley values were used to identify the variables with the greatest contribution to the model.
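As a reminder of what the AUROC measures, it equals the probability that a randomly chosen assessment followed by an event receives a higher predicted risk than a randomly chosen assessment without one, with ties counted as one half. The sketch below computes it directly from that definition. The function name and toy values are hypothetical, and this is not the study's evaluation code.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison of positive and negative scores.

    `labels` holds 0/1 outcomes and `scores` the predicted risks. The
    double loop is quadratic in sample size, so it suits small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy values (illustrative only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
result = auroc(labels, scores)
```

In the toy data, three of the four positive-negative pairs are ordered correctly, so `result` is 0.75. A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking.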

Results — The training cohort consisted of 9409 patients (mean [SD] age, 63.3 [10.9] years) undergoing 59 089 symptom assessments (80%). The remaining 2352 patients (mean [SD] age, 63.3 [11] years) and 14 193 symptom assessments were set aside as the test cohort (20%). Several models had high predictive accuracy, particularly the gradient boosting machine (validation AUROC, 0.80 [95% CI, 0.78-0.81]). A Youden-based cutoff corresponded to a validation sensitivity of 0.77 and specificity of 0.66. Patient-reported symptom scores were consistently identified as being the most predictive features within models. A second model built only with symptom severity data had an AUROC of 0.72 (95% CI, 0.70-0.74).
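A Youden-based cutoff is the risk threshold that maximizes J = sensitivity + specificity - 1. For the values reported above, J = 0.77 + 0.66 - 1 = 0.43. The sketch below shows one way to select such a cutoff from predicted risks. The function name and toy values are hypothetical, and the study's actual procedure is not described in the abstract.

```python
def youden_cutoff(labels, scores):
    """Return (cutoff, sensitivity, specificity, J) maximizing Youden's J.

    A case is called positive when its score is >= the cutoff. Each
    distinct score is tried as a candidate cutoff; on ties in J the
    lowest cutoff is kept.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= cutoff)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < cutoff)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best


# Toy values (illustrative only).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
cutoff, sensitivity, specificity, j = youden_cutoff(labels, scores)
```

Because J weights sensitivity and specificity equally, a cutoff chosen this way does not account for the differing costs of missed events and false alarms; a clinical deployment might reasonably choose a different operating point.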

Conclusions and Relevance — In this study, machine learning approaches predicted ED/Hosp in patients with head and neck cancer with a high degree of accuracy. These tools could be used to accurately risk stratify patients and may help direct targeted intervention.



Noel CW, Sutradhar R, Conn LG, Forner D, Chan WC, Fu R, Hallet J, Coburn NG, Eskander A. JAMA Otolaryngol Head Neck Surg. 2022; 148(8):764-72. Epub 2022 Jun 30.
