Urmila Chandran¹, PhD; Timothy Wolfe¹, MBA; Christine Chen¹, MD; Daniel Riskin¹, MD, FACS
Background
The vast majority of real-world evidence (RWE) generated in recent years relies solely on structured real-world data (RWD), such as structured electronic health record (EHR) fields or administrative claims. For diseases such as migraine, structured RWD may severely limit the ability to obtain clinically nuanced data, such as disease subtypes and symptoms, with high accuracy. Migraine RWE may suffer from systematic error and a lack of scientific rigor if the key variables for a research question are not extracted with the required clinical accuracy and granularity.
Objectives
To assess the accuracy and feasibility of extracting migraine-related features and symptoms using advanced technologies applied to unstructured EHR data.
Methods
Study design: The study was a retrospective analysis of EHR data of patients treated at an integrated delivery health system in the US. The advanced approach as detailed below was compared to a manual reference standard. A total of 18 pre-specified migraine-related concepts (including symptoms and subtypes) were annotated.
Analysis
Unstructured EHR data from 2,750 EHR encounters were obtained. The reference standard was randomly split into training (1,000), validation (500), and test (1,250) sets. Recall, precision, and F1-score were calculated for all features with at least 20 patient-experienced occurrences in the reference standard test set. Recall (sensitivity) was the percentage of data elements identified by manual annotation that were also identified by the advanced approach. Precision (positive predictive value) was the percentage of data elements identified by the advanced approach that were also identified by manual annotation. The F1-score, the harmonic mean of precision and recall, was used as a summary measure and calculated as 2 × (precision × recall) / (precision + recall). Inter-rater reliability was >0.85, reflecting a credible reference standard.
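The metric definitions above can be illustrated with a minimal Python sketch. The `Counts` class, the function names, and the example counts are hypothetical and chosen only for illustration; they are not the study's implementation or data.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Agreement counts for one concept (hypothetical structure)."""
    tp: int  # identified by both manual annotation and the advanced approach
    fp: int  # identified by the advanced approach only
    fn: int  # identified by manual annotation only


def recall(c: Counts) -> float:
    # Share of manually annotated elements also found by the advanced approach.
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def precision(c: Counts) -> float:
    # Share of elements found by the advanced approach that were also annotated manually.
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def f1_score(p: float, r: float) -> float:
    # Harmonic mean of precision and recall: 2 * (p * r) / (p + r).
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts only, not taken from the study.
example = Counts(tp=90, fp=10, fn=30)
p = precision(example)  # 90 / 100 = 0.90
r = recall(example)     # 90 / 120 = 0.75
f1 = f1_score(p, r)
```

Because the harmonic mean is pulled toward the lower of its two inputs, a concept cannot reach a high F1-score unless both precision and recall are high.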
Results
With the advanced approach, the average recall, precision, and F1-score across all evaluable concepts were 91.5%, 93.4%, and 92.1%, respectively, compared to the reference standard. The F1-score was greater than 80% for every concept except phonophobia.
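One plausible reading of "average across all evaluable concepts" is an unweighted mean of per-concept scores, restricted to concepts that meet the minimum-occurrence threshold. The sketch below assumes that interpretation; the function name, the dictionary layout, and the placeholder values are hypothetical and do not reproduce the study's data.

```python
from statistics import mean


def macro_average(per_concept: dict, min_occurrences: int = 20) -> dict:
    """Unweighted mean of each metric over evaluable concepts (assumed averaging scheme)."""
    # Keep only concepts with enough reference-standard occurrences to be evaluable.
    evaluable = [
        m for m in per_concept.values() if m["n_reference"] >= min_occurrences
    ]
    if not evaluable:
        raise ValueError("no concept meets the minimum-occurrence threshold")
    return {
        metric: mean(m[metric] for m in evaluable)
        for metric in ("recall", "precision", "f1")
    }


# Placeholder values for illustration only.
scores = {
    "concept_a": {"n_reference": 50, "recall": 0.90, "precision": 0.90, "f1": 0.90},
    "concept_b": {"n_reference": 30, "recall": 0.80, "precision": 0.80, "f1": 0.80},
    "concept_c": {"n_reference": 5, "recall": 0.10, "precision": 0.10, "f1": 0.10},
}
summary = macro_average(scores)  # concept_c is excluded (fewer than 20 occurrences)
```

Under this scheme the averaged F1-score is the mean of the per-concept F1-scores, so it need not equal the harmonic mean of the averaged precision and recall.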
Conclusion
Measurement of accuracy is an important step towards meeting data reliability standards for migraine RWE. The ability to capture clinically important migraine features and phenotypes by leveraging advanced technologies and narrative content in EHRs paves the way for scientifically robust RWE for migraine management.
Presented at ISPOR 2024, May 6, 2024
¹Verantos Inc., 325 Sharon Park Dr., Menlo Park, CA 94025