3,162 research outputs found
PadChest: A large chest x-ray image dataset with multi-label annotated reports
We present a labeled large-scale, high resolution chest x-ray dataset for the
automated exploration of medical images along with their associated reports.
This dataset includes more than 160,000 images obtained from 67,000 patients
that were interpreted and reported by radiologists at Hospital San Juan
Hospital (Spain) from 2009 to 2017, covering six different position views and
additional information on image acquisition and patient demography. The reports
were labeled with 174 different radiographic findings, 19 differential
diagnoses and 104 anatomic locations organized as a hierarchical taxonomy and
mapped onto standard Unified Medical Language System (UMLS) terminology. Of
these reports, 27% were manually annotated by trained physicians and the
remaining set was labeled using a supervised method based on a recurrent neural
network with attention mechanisms. The labels generated were then validated in
an independent test set achieving a 0.93 Micro-F1 score. To the best of our
knowledge, this is one of the largest public chest x-ray database suitable for
training supervised models concerning radiographs, and the first to contain
radiographic reports in Spanish. The PadChest dataset can be downloaded from
http://bimcv.cipf.es/bimcv-projects/padchest/
Recommended from our members
Using the Electronic Medical Record to Identify Community-Acquired Pneumonia: Toward a Replicable Automated Strategy
Background: Timely information about disease severity can be central to the detection and management of outbreaks of acute respiratory infections (ARI), including influenza. We asked if two resources: 1) free text, and 2) structured data from an electronic medical record (EMR) could complement each other to identify patients with pneumonia, an ARI severity landmark. Methods: A manual EMR review of 2747 outpatient ARI visits with associated chest imaging identified x-ray reports that could support the diagnosis of pneumonia (kappa score = 0.88 (95% CI 0.82∶0.93)), along with attendant cases with Possible Pneumonia (adds either cough, sputum, fever/chills/night sweats, dyspnea or pleuritic chest pain) or with Pneumonia-in-Plan (adds pneumonia stated as a likely diagnosis by the provider). The x-ray reports served as a reference to develop a text classifier using machine-learning software that did not require custom coding. To identify pneumonia cases, the classifier was combined with EMR-based structured data and with text analyses aimed at ARI symptoms in clinical notes. Results: 370 reference cases with Possible Pneumonia and 250 with Pneumonia-in-Plan were identified. The x-ray report text classifier increased the positive predictive value of otherwise identical EMR-based case-detection algorithms by 20–70%, while retaining sensitivities of 58–75%. These performance gains were independent of the case definitions and of whether patients were admitted to the hospital or sent home. Text analyses seeking ARI symptoms in clinical notes did not add further value. Conclusion: Specialized software development is not required for automated text analyses to help identify pneumonia patients. These results begin to map an efficient, replicable strategy through which EMR data can be used to stratify ARI severity
- …