StrokeClassifier: Ischemic Stroke Etiology Classification by Ensemble Consensus Modeling Using Electronic Health Records.

Overview

abstract

  • Determining the etiology of an acute ischemic stroke (AIS) is fundamental to secondary stroke prevention efforts but can be diagnostically challenging. We trained and validated an automated classification machine intelligence tool, StrokeClassifier, using electronic health record (EHR) text data from 2,039 non-cryptogenic AIS patients at 2 academic hospitals to predict the 4-level outcome of stroke etiology, as determined by agreement of at least 2 board-certified vascular neurologists who reviewed the stroke hospitalization EHR. StrokeClassifier is an ensemble consensus meta-model of 9 machine learning classifiers applied to features extracted from discharge summary texts by natural language processing. StrokeClassifier was externally validated on 406 discharge summaries from the MIMIC-III dataset, which were reviewed by a vascular neurologist to ascertain stroke etiology. Compared with stroke etiologies adjudicated by vascular neurologists, the 9 base classifiers performed well, with a mean cross-validated area under the receiver operating characteristic curve (AUCROC) of 0.90. Their ensemble meta-model, StrokeClassifier, achieved a mean cross-validated accuracy of 0.74 and weighted F1 of 0.74. In the MIMIC-III cohort, the accuracy and weighted F1 of StrokeClassifier were 0.70 and 0.71, respectively. SHapley Additive exPlanation analysis revealed that the top 5 features contributing to stroke etiology prediction were atrial fibrillation, age, middle cerebral artery occlusion, internal carotid artery occlusion, and frontal stroke location. We then designed a certainty heuristic that deems a StrokeClassifier diagnosis confidently non-cryptogenic based on the degree of consensus among the 9 classifiers, and applied it to 788 cryptogenic patients. This reduced the percentage of cryptogenic strokes from 25.2% to 7.2% of all ischemic strokes.
StrokeClassifier is a validated artificial intelligence tool that rivals the performance of vascular neurologists in classifying ischemic stroke etiology for individual patients. With further training, StrokeClassifier may have downstream applications including its use as a clinical decision support system.
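
The consensus idea behind the certainty heuristic can be illustrated with a minimal sketch. The abstract does not give the actual rule, so everything here is an assumption for illustration: the function name `consensus`, the `min_agreement` threshold of 7 out of 9, the use of a simple plurality vote, and the placeholder labels "A", "B", and "C" standing in for etiology classes.

```python
from collections import Counter


def consensus(votes, min_agreement=7):
    """Return (label, is_confident) for one patient's base-classifier votes.

    votes: one predicted label per base classifier (9 in the abstract).
    min_agreement: hypothetical threshold, not taken from the paper.
    """
    counts = Counter(votes)
    # Plurality label and the number of classifiers that chose it.
    label, n_agree = counts.most_common(1)[0]
    # Treat the prediction as confident only if enough classifiers agree.
    return label, n_agree >= min_agreement


if __name__ == "__main__":
    # Placeholder labels; the abstract does not name the 4 etiology classes.
    strong = ["A"] * 8 + ["B"]
    split = ["A"] * 4 + ["B"] * 3 + ["C"] * 2
    print(consensus(strong))
    print(consensus(split))
```

Under a rule of this kind, a cryptogenic case would be relabeled only when agreement among the classifiers clears the threshold, and would otherwise stay cryptogenic.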

publication date

  • October 31, 2023

Identity

PubMed Central ID

  • PMC10635373

Digital Object Identifier (DOI)

  • 10.21203/rs.3.rs-3367169/v1

PubMed ID

  • 37961532