American Family Cohort, a data resource description
This manuscript is a research resource description and presents a large and
novel Electronic Health Records (EHR) data resource, American Family Cohort
(AFC). The AFC data is derived from the Centers for Medicare and Medicaid Services
(CMS)-certified American Board of Family Medicine (ABFM) PRIME registry. The
PRIME registry is the largest national Qualified Clinical Data Registry (QCDR)
for Primary Care. The data is converted to a popular common data model, the
Observational Health Data Sciences and Informatics (OHDSI) Observational
Medical Outcomes Partnership (OMOP) Common Data Model (CDM).
The resource presents approximately 90 million encounters for 7.5 million
patients. All patients have age, gender, and address
information, and 73% report race. Nearly 93% of patients have lab data in LOINC,
86% have medication data in RxNorm, 93% have diagnoses in SNOMED and ICD, 81%
have procedures in HCPCS or CPT, and 61% have insurance information. The
richness, breadth, and diversity of these research-accessible and research-ready
data are expected to accelerate observational studies in many diverse areas. We
expect this resource to facilitate research for many years to come.
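The coverage figures above (for example, the share of patients with lab or medication data) are the kind of statistic that can be computed directly from OMOP CDM tables. The following is a minimal sketch, not the authors' method: it uses the standard OMOP table names `person`, `measurement`, and `drug_exposure` with their `person_id` column, but the in-memory SQLite database, the toy rows, and the helper `coverage` are assumptions made purely for illustration.

```python
import sqlite3

# Toy in-memory database that borrows OMOP CDM table and column names.
# The rows are invented for illustration; they are NOT AFC data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE measurement (measurement_id INTEGER PRIMARY KEY, person_id INTEGER);
    CREATE TABLE drug_exposure (drug_exposure_id INTEGER PRIMARY KEY, person_id INTEGER);
    INSERT INTO person VALUES (1), (2), (3), (4);
    INSERT INTO measurement VALUES (1, 1), (2, 1), (3, 2), (4, 3);
    INSERT INTO drug_exposure VALUES (1, 1), (2, 4);
""")

# Clinical tables this hypothetical helper is allowed to query.
ALLOWED_TABLES = {"measurement", "drug_exposure",
                  "condition_occurrence", "procedure_occurrence"}


def coverage(conn, table):
    """Percent of persons with at least one row in the given clinical table."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table}")
    total = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    covered = conn.execute(
        f"SELECT COUNT(DISTINCT person_id) FROM {table}"
    ).fetchone()[0]
    return 100.0 * covered / total


for name in ("measurement", "drug_exposure"):
    print(f"{name}: {coverage(conn, name):.1f}% of persons")
```

The sketch assumes every `person_id` in a clinical table also appears in `person`; a real analysis would likely join against `person` and restrict to the vocabularies of interest (for instance, LOINC-coded measurements).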
Automatic Determination of the Need for Intravenous Contrast in Musculoskeletal MRI Examinations Using IBM Watson’s Natural Language Processing Algorithm
Magnetic resonance imaging (MRI) protocoling can be time- and resource-intensive, and protocols can often be suboptimal, depending on the expertise or preferences of the protocoling radiologist. Providing a best-practice recommendation for an MRI protocol has the potential to improve efficiency and decrease the likelihood of a suboptimal or erroneous study. The goal of this study was to develop and validate a machine learning-based natural language classifier that can automatically assign the use of intravenous contrast for musculoskeletal MRI protocols based upon the free-text clinical indication of the study, thereby improving efficiency of the protocoling radiologist and potentially decreasing errors. We utilized a deep learning-based natural language classification system from IBM Watson, a question-answering supercomputer that gained fame after challenging the best human players on Jeopardy! in 2011. We compared this solution to a series of traditional machine learning-based natural language processing techniques that utilize a term-document frequency matrix. Each classifier was trained with 1240 MRI protocols plus their respective clinical indications and validated with a test set of 280. Ground truth of contrast assignment was obtained from the clinical record. For evaluation of inter-reader agreement, a blinded second reader radiologist analyzed all cases and determined contrast assignment based on only the free-text clinical indication. In the test set, Watson demonstrated overall accuracy of 83.2% when compared to the original protocol. This was similar to the overall accuracy of 80.2% achieved by an ensemble of eight traditional machine learning algorithms based on a term-document matrix. When compared to the second reader's contrast assignment, Watson achieved 88.6% agreement. When evaluating only the subset of cases where the original protocol and second reader were concordant (n = 251), agreement climbed further to 90.0%.
The classifier was relatively robust to spelling and grammatical errors, which were frequent. Implementation of this automated MR contrast determination system as a clinical decision support tool may save considerable time and effort of the radiologist while potentially decreasing error rates, and would require no change in order entry or workflow.
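To make the "term-document matrix" baseline concrete, here is a minimal sketch of a bag-of-words classifier in plain Python. It is an illustration only: the abstract does not say which eight algorithms formed the ensemble, so multinomial Naive Bayes is swapped in as one common choice, and the four clinical indications, the labels, and all function names are invented for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term j in document i."""
    vocab = sorted({t for d in docs for t in tokenize(d)})
    index = {t: j for j, t in enumerate(vocab)}
    matrix = []
    for d in docs:
        row = [0] * len(vocab)
        for t in tokenize(d):
            row[index[t]] += 1
        matrix.append(row)
    return vocab, matrix


def train_nb(matrix, labels):
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""
    n_terms = len(matrix[0])
    model = {}
    for c in sorted(set(labels)):
        rows = [r for r, y in zip(matrix, labels) if y == c]
        counts = [sum(col) for col in zip(*rows)]  # per-term totals in class c
        total = sum(counts)
        log_prior = math.log(len(rows) / len(labels))
        log_lik = [math.log((k + 1) / (total + n_terms)) for k in counts]
        model[c] = (log_prior, log_lik)
    return model


def predict(model, vocab, text):
    """Pick the class with the highest log-posterior; unseen words are ignored."""
    index = {t: j for j, t in enumerate(vocab)}
    counts = Counter(t for t in tokenize(text) if t in index)

    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(n * log_lik[index[t]] for t, n in counts.items())

    return max(model, key=score)


# Invented toy indications and labels, for illustration only.
docs = ["evaluate for infection or abscess", "soft tissue mass",
        "knee pain after injury", "meniscal tear"]
labels = ["contrast", "contrast", "no_contrast", "no_contrast"]

vocab, matrix = term_document_matrix(docs)
model = train_nb(matrix, labels)
print(predict(model, vocab, "possible abscess"))
print(predict(model, vocab, "knee injury"))
```

Accuracy against the original protocol, or agreement with a second reader, would then be the fraction of test indications for which the predicted label matches the reference label.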