11 research outputs found

    Automated Labeling of German Chest X-Ray Radiology Reports using Deep Learning

    Radiologists are in short supply globally, and deep learning models offer a promising solution to address this shortage as part of clinical decision-support systems. However, training such models often requires expensive and time-consuming manual labeling of large datasets. Automatic label extraction from radiology reports can reduce the time required to obtain labeled datasets, but this task is challenging due to semantically similar words and missing annotated data. In this work, we explore the potential of weak supervision of a deep learning-based label prediction model, using a rule-based labeler. We propose a deep learning-based CheXpert label prediction model, pre-trained on reports labeled by a rule-based German CheXpert model and fine-tuned on a small dataset of manually labeled reports. Our results demonstrate the effectiveness of our approach, which significantly outperformed the rule-based model on all three tasks. Our findings highlight the benefits of employing deep learning-based models even in scenarios with sparse data and the use of the rule-based labeler as a tool for weak supervision.

    Evaluation of patients with respiratory infections during the first pandemic wave in Germany: characteristics of COVID-19 versus non-COVID-19 patients

    BACKGROUND Characteristics of COVID-19 patients have mainly been reported within confirmed COVID-19 cohorts. By analyzing patients with respiratory infections in the emergency department during the first pandemic wave, we aim to assess differences in the characteristics of COVID-19 vs. non-COVID-19 patients. This is particularly important regarding the second COVID-19 wave and the approaching influenza season. METHODS We prospectively included 219 patients with suspected COVID-19 who received radiological imaging and RT-PCR for SARS-CoV-2. Demographic, clinical and laboratory parameters as well as RT-PCR results were used for subgroup analysis. Imaging data were reassessed using the following scoring system: 0 - not typical, 1 - possible, 2 - highly suspicious for COVID-19. RESULTS COVID-19 was diagnosed in 72 (32.9%) patients. In three of them (4.2%) the initial RT-PCR was negative while the initial CT scan revealed pneumonic findings. 111 (50.7%) patients, 61 of them (55.0%) COVID-19 positive, had evidence of pneumonia. Patients with COVID-19 pneumonia showed higher body temperature (37.7 ± 0.1 vs. 37.1 ± 0.1 °C; p = 0.0001) and LDH values (386.3 ± 27.1 vs. 310.4 ± 17.5 U/l; p = 0.012) as well as lower leukocytes (7.6 ± 0.5 vs. 10.1 ± 0.6 G/l; p = 0.0003) than patients with other pneumonia. Among abnormal CT findings in COVID-19 patients, 57 (93.4%) were evaluated as highly suspicious or possible for COVID-19. In patients with negative RT-PCR and pneumonia, another third was evaluated as highly suspicious or possible for COVID-19 (14 out of 50; 28.0%). The sensitivity in the detection of patients requiring isolation was higher with initial chest CT than with initial RT-PCR (90.4% vs. 79.5%). CONCLUSIONS COVID-19 patients show typical clinical, laboratory and imaging parameters which enable a sensitive detection of patients who demand isolation measures due to COVID-19.
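The proportions quoted in this abstract can be recomputed from the stated counts. The following is a minimal sketch for that arithmetic only; the helper name `pct` and the dictionary labels are our own and do not come from the paper.

```python
def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# (numerator, denominator, percentage stated in the abstract)
reported = {
    "COVID-19 among all included patients": (72, 219, 32.9),
    "initially RT-PCR negative among COVID-19 patients": (3, 72, 4.2),
    "pneumonia among all included patients": (111, 219, 50.7),
    "COVID-19 positive among pneumonia patients": (61, 111, 55.0),
    "CT rated possible/highly suspicious among COVID-19 pneumonia": (57, 61, 93.4),
    "CT rated possible/highly suspicious among RT-PCR-negative pneumonia": (14, 50, 28.0),
}

for label, (num, den, stated) in reported.items():
    print(f"{label}: {num}/{den} = {pct(num, den)}% (stated: {stated}%)")
```

The sensitivity figures (90.4% vs. 79.5%) are not recomputed here because the abstract does not give the underlying counts.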

    Pneumothorax detection in chest radiographs: optimizing artificial intelligence system for accuracy and confounding bias reduction using in-image annotations in algorithm training

    OBJECTIVES Diagnostic accuracy of artificial intelligence (AI) pneumothorax (PTX) detection in chest radiographs (CXR) is limited by the noisy annotation quality of public training data and confounding thoracic tubes (TT). We hypothesize that in-image annotations of the dehiscent visceral pleura for algorithm training boost the algorithm's performance and suppress confounders. METHODS Our single-center evaluation cohort of 3062 supine CXRs includes 760 PTX-positive cases with radiological annotations of PTX size and inserted TTs. Three step-by-step improved algorithms (differing in algorithm architecture, training data from public datasets/clinical sites, and in-image annotations included in algorithm training) were characterized by area under the receiver operating characteristic curve (AUROC) in detailed subgroup analyses and referenced to the well-established "CheXNet" algorithm. RESULTS Performances of established algorithms exclusively trained on publicly available data without in-image annotations are limited to AUROCs of 0.778 and strongly biased towards TTs that can completely eliminate the algorithm's discriminative power in individual subgroups. Contrarily, our final "algorithm 2", which was trained on a lower number of images but additionally with in-image annotations of the dehiscent pleura, achieved an overall AUROC of 0.877 for unilateral PTX detection with a significantly reduced TT-related confounding bias. CONCLUSIONS We demonstrated strong limitations of an established PTX-detecting AI algorithm that can be significantly reduced by designing an AI system capable of learning to both classify and localize PTX. Our results are aimed at drawing attention to the necessity of high-quality in-image localization in training data to reduce the risks of unintentionally biasing the training process of pathology-detecting AI algorithms.
KEY POINTS • Established pneumothorax-detecting artificial intelligence algorithms trained on public training data are strongly limited and biased by confounding thoracic tubes. • We used high-quality in-image annotated training data to effectively boost algorithm performance and suppress the impact of confounding thoracic tubes. • Based on our results, we hypothesize that even hidden confounders might be effectively addressed by in-image annotations of pathology-related image features.
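To illustrate why subgroup AUROCs can expose a confounder that the overall AUROC hides, here is a minimal sketch on made-up data. It is not the study's evaluation code: the function names, the scores and the tube flags are all hypothetical, and the toy classifier is constructed so that its output mostly tracks the presence of a thoracic tube.

```python
def auroc(labels, scores):
    """Rank-based AUROC: probability that a positive case scores higher
    than a negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_subgroup(labels, scores, groups):
    """Compute the AUROC separately for each value found in `groups`."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, value in enumerate(groups) if value == g]
        result[g] = auroc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


# Hypothetical cases: 1 = pneumothorax present, 0 = absent.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
# Hypothetical classifier output that is high whenever a tube is visible.
scores = [0.90, 0.80, 0.70, 0.85, 0.20, 0.30, 0.10, 0.25]
# Hypothetical confounder flag: thoracic tube inserted or not.
tube = ["tube", "tube", "tube", "tube", "no tube", "no tube", "no tube", "no tube"]

print("overall:", auroc(labels, scores))
print("by subgroup:", auroc_by_subgroup(labels, scores, tube))
```

In this toy setup the overall AUROC is above chance only because tubes are more common in positive cases; within each tube subgroup the scores separate positives from negatives worse than chance.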

    Clinically focused multi-cohort benchmarking as a tool for external validation of artificial intelligence algorithm performance in basic chest radiography analysis

    Artificial intelligence (AI) algorithms evaluating [supine] chest radiographs ([S]CXRs) have remarkably increased in number recently. Since training and validation are often performed on subsets of the same overall dataset, external validation is mandatory to reproduce results and reveal potential training errors. We applied a multicohort benchmarking to the publicly accessible (S)CXR analyzing AI algorithm CheXNet, comprising three clinically relevant study cohorts which differ in patient positioning ([S]CXRs), the applied reference standards (CT-/[S]CXR-based) and the possibility to also compare algorithm classification with different medical experts’ reading performance. The study cohorts include [1] a cohort, characterized by 563 CXRs acquired in the emergency unit that were evaluated by 9 readers (radiologists and non-radiologists) in terms of 4 common pathologies, [2] a collection of 6,248 SCXRs annotated by radiologists in terms of pneumothorax presence, its size and presence of inserted thoracic tube material which allowed for subgroup and confounding bias analysis and [3] a cohort consisting of 166 patients with SCXRs that were evaluated by radiologists for underlying causes of basal lung opacities, all of those cases having been correlated to a timely acquired computed tomography scan (SCXR and CT within < 90 min). CheXNet non-significantly exceeded the radiology resident (RR) consensus in the detection of suspicious lung nodules (cohort [1], AUC AI/RR: 0.851/0.839, p = 0.793) and the radiological readers in the detection of basal pneumonia (cohort [3], AUC AI/reader consensus: 0.825/0.782, p = 0.390) and basal pleural effusion (cohort [3], AUC AI/reader consensus: 0.762/0.710, p = 0.336) in SCXR, partly with AUC values higher than originally published (“Nodule”: 0.780, “Infiltration”: 0.735, “Effusion”: 0.864). The classifier “Infiltration” turned out to be very dependent on patient positioning (best in CXR, worst in SCXR). 
The pneumothorax SCXR cohort [2] revealed poor algorithm performance in CXRs without inserted thoracic material and in the detection of small pneumothoraces, which can be explained by a known systematic confounding error in the algorithm training process. The benefit of clinically relevant external validation is demonstrated by the differences in algorithm performance as compared to the original publication. Our multi-cohort benchmarking finally enables the consideration of confounders, different reference standards and patient positioning as well as the AI performance comparison with differentially qualified medical readers.

    COVID-19 Pandemic and Upcoming Influenza Season—Does an Expert’s Computed Tomography Assessment Differentially Identify COVID-19, Influenza and Pneumonias of Other Origin?

    (1) Background: Time-consuming SARS-CoV-2 RT-PCR suffers from limited sensitivity in early infection stages whereas rapidly available chest CT can already raise COVID-19 suspicion. Nevertheless, radiologists’ performance in differentiating COVID-19, especially from influenza pneumonia, is not sufficiently characterized. (2) Methods: A total of 201 pneumonia CTs were identified and divided into subgroups based on RT-PCR: 78 COVID-19 CTs, 65 influenza CTs and 62 Non-COVID-19-Non-influenza (NCNI) CTs. Three radiology experts (blinded to RT-PCR results) raised pathogen-specific suspicion (separately for COVID-19, influenza, bacterial pneumonia and fungal pneumonia) according to the following reading scores: 0 - not typical / 1 - possible / 2 - highly suspected. Diagnostic performances were calculated with RT-PCR as a reference standard. Dependencies of radiologists’ pathogen suspicion scores were characterized by Pearson’s chi-squared test for independence. (3) Results: Depending on whether the intermediate reading score 1 was considered as positive or negative, radiologists correctly classified 83–85% (vs. NCNI)/79–82% (vs. influenza) of COVID-19 cases (sensitivity up to 94%). Contrarily, radiologists correctly classified only 52–56% (vs. NCNI)/50–60% (vs. COVID-19) of influenza cases. The COVID-19 scoring was more specific than the influenza scoring compared with suspected bacterial or fungal infection. (4) Conclusions: High-accuracy COVID-19 detection by CT might expedite patient management even during the upcoming influenza season.
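The abstract reports ranges because the intermediate reading score 1 can be counted as either positive or negative. A minimal sketch of that dichotomization follows; the reading scores below are invented for illustration and are not the study data, and the function names are our own.

```python
def sensitivity(scores_of_positives, threshold):
    """Share of reference-positive cases with a reading score >= threshold."""
    if not scores_of_positives:
        raise ValueError("no reference-positive cases given")
    hits = sum(s >= threshold for s in scores_of_positives)
    return hits / len(scores_of_positives)


def specificity(scores_of_negatives, threshold):
    """Share of reference-negative cases with a reading score < threshold."""
    if not scores_of_negatives:
        raise ValueError("no reference-negative cases given")
    rejections = sum(s < threshold for s in scores_of_negatives)
    return rejections / len(scores_of_negatives)


# Hypothetical reading scores (0 = not typical, 1 = possible, 2 = highly suspected).
rtpcr_positive = [2, 2, 2, 1, 1, 0, 2, 2, 1, 2]
rtpcr_negative = [0, 0, 1, 0, 2, 0, 1, 0]

for threshold, rule in [(1, "score 1 counted as positive"),
                        (2, "score 1 counted as negative")]:
    sens = sensitivity(rtpcr_positive, threshold)
    spec = specificity(rtpcr_negative, threshold)
    print(f"{rule}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Counting score 1 as positive raises sensitivity at the cost of specificity, which is why both readings of the intermediate score are worth reporting.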

    Advanced Fusion Imaging and Contrast-Enhanced Imaging (CT/MRI-CEUS) in Oncology

    Simple Summary: Fusion imaging is an innovative technique by which previously performed computed tomography/magnetic resonance imaging can be integrated and reconstructed with advanced contrast-enhanced ultrasound using modern ultrasound devices in real time. Fusion imaging allows for complementing strengths and reducing restrictions of the combined imaging modalities. The visualization of parenchymal and tumoral microperfusion by contrast-enhanced ultrasound can be dynamically fused and assessed with images from previous cross-sectional studies and may help to decipher underlying entities of indeterminate lesions or validate suspicious morphology. The findings from our study demonstrate the benefits of fusion imaging for evaluating focal hepatic and renal lesions. The excellent safety profile, accessibility, repeatability and cost-effectiveness are advantages of fusion imaging which make it a powerful diagnostic tool for the modern radiologist. Fusion imaging is an innovative technique that facilitates combining assets and reducing restrictions of advanced ultrasound and cross-sectional imaging. The purpose of the present retrospective study was to evaluate the role of fusion imaging for assessing hepatic and renal lesions. Between 02/2011 and 08/2020, 92 patients in total were included in the study, of which 32 patients had hepatic lesions and 60 patients had renal lesions. Fusion imaging was technically successful in all patients. No adverse side effects upon intravenous (i.v.) application of SonoVue® (Bracco, Milan, Italy) were registered. Fusion imaging could clarify all 11 (100%) hepatic lesions initially described as indeterminate by computed tomography/magnetic resonance imaging (CT/MRI). Moreover, 5/14 (36%) initially suspicious hepatic lesions could be validated by fusion imaging, whereas in 8/14 (57%), malignant morphology was disproved.
Moreover, fusion imaging allowed for the clarification of 29/30 (97%) renal lesions initially characterized as suspicious by CT/MRI, of which 19/30 (63%) underwent renal surgery; histopathology revealed malignancy in 16/19 (84%) and benignity in 3/19 (16%). Indeterminate findings could be elucidated by fusion imaging in 20/20 (100%) renal lesions. Its accessibility and repeatability, even during pregnancy and in childhood, its cost-effectiveness, and its excellent safety profile make fusion imaging a promising instrument for the thorough evaluation of hepatic and renal lesions in the future.

    Prognostic Value of Admission Chest CT Findings for Invasive Ventilation Therapy in COVID-19 Pneumonia

    (1) Background: To assess the value of chest CT imaging features of COVID-19 disease upon hospital admission for risk stratification of invasive ventilation (IV) versus no or non-invasive ventilation (non-IV) during hospital stay. (2) Methods: A retrospective single-center study was conducted including all patients admitted during the first three months of the pandemic at our hospital with PCR-confirmed COVID-19 disease and admission chest CT scans (n = 69). Using clinical information and CT imaging features, a 10-point ordinal risk score was developed and its diagnostic potential to differentiate a severe (IV-group) from a more moderate course (non-IV-group) of the disease was tested. (3) Results: Frequent imaging findings of COVID-19 pneumonia in both groups were ground glass opacities (91.3%), consolidations (53.6%) and crazy paving patterns (31.9%). Characteristics of later stages such as subpleural bands were observed significantly more often in the IV-group (52.2% versus 26.1%, p = 0.032). Using information directly accessible during a radiologist’s reporting, a simple risk score proved to reliably differentiate between IV- and non-IV-groups (AUC: 0.89 (95% CI 0.81–0.96), p < 0.001). (4) Conclusions: Information accessible from admission CT scans can effectively and reliably be used in a scoring model to support risk stratification of COVID-19 patients to improve resource and allocation management of hospitals.

    Bi-Centric Independent Validation of Outcome Prediction after Radioembolization of Primary and Secondary Liver Cancer

    Background: Yttrium-90 radioembolization (RE) plays an important role in the treatment of liver malignancies. Optimal patient selection is crucial for an effective and safe treatment. In this study, we aim to validate the prognostic performance of a previously established random survival forest (RSF) with an external validation cohort from a different national center. Furthermore, we compare outcome prediction models with different established metrics. Methods: A previously established RSF model, trained on a consecutive cohort of 366 patients who had received RE due to primary or secondary liver tumor at a national center (center 1), was used to predict the outcome of an independent consecutive cohort of 202 patients from a different national center (center 2) and vice versa. Prognostic performance was evaluated using the concordance index (C-index) and the integrated Brier score (IBS). The prognostic importance of designated baseline parameters was measured with the minimal depth concept, and the influence on the predicted outcome was analyzed with accumulated local effects plots. RSF results were compared with conventional Cox proportional hazards (CPH) models in terms of C-index and IBS. Results: The established RSF model achieved a C-index of 0.67 for center 2, comparable to the results obtained for center 1, which it was trained on (0.66). The RSF model trained on center 2 achieved a C-index of 0.68 on center 2 data and 0.66 on center 1 data. CPH models showed comparable results on both cohorts, with C-index ranging from 0.68 to 0.72. IBS validation showed more differentiated results depending on which cohort was trained on and which cohort was predicted (range: 0.08 to 0.20). Baseline cholinesterase was the most important variable for survival prediction. Conclusion: The previously developed predictive RSF model was successfully validated with an independent external cohort.
C-index and IBS are suitable metrics to compare outcome prediction models, with IBS showing more differentiated results. The findings corroborate that survival after RE is critically determined by functional hepatic reserve and thus baseline liver function should play a key role in patient selection.
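For readers unfamiliar with the concordance index used above, the following is a minimal sketch of Harrell's C-index for right-censored data. It is a textbook formulation, not the authors' implementation; the function name, the tie handling for equal risk scores and the four-patient toy cohort are our own assumptions.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up time per patient
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk score (higher means shorter expected survival)

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. It is concordant when
    patient i also has the higher predicted risk; ties in risk count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort (not study data).
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]

print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which puts the reported range of 0.66 to 0.72 in context.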

    Impact of Pharmaceutical Prophylaxis on Radiation-Induced Liver Disease Following Radioembolization

    Background: Radioembolization (RE) with yttrium-90 (90Y) resin microspheres yields heterogeneous response rates in patients with primary or secondary liver cancer. Radiation-induced liver disease (RILD) is a potentially life-threatening complication with higher prevalence in cirrhotics or patients exposed to previous chemotherapies. Advances in RILD prevention may help increase tolerable radiation doses to improve patient outcomes. This study aimed to evaluate the impact of post-therapeutic RILD prophylaxis in a cohort of intensely pretreated liver metastatic breast cancer patients. Methods: Ninety-three patients with liver metastases of breast cancer received RE between 2007 and 2016. All patients received RILD prophylaxis for 8 weeks post-RE. From January 2014, RILD prophylaxis was changed from ursodeoxycholic acid (UDCA) and prednisolone (standard prophylaxis (SP); n = 59) to pentoxifylline (PTX), UDCA and low-dose low molecular weight heparin (LMWH) (modified prophylaxis (MP); n = 34). The primary endpoint was toxicity including symptoms of RILD. Results: Dose exposure of normal liver parenchyma was higher in the modified vs. standard prophylaxis group (47.2 Gy (17.8–86.8) vs. 40.2 Gy (12.5–83.5), p = 0.017). All-grade RILD events (mild: bilirubin ≥ 21 µmol/L but < 30 µmol/L; severe: bilirubin ≥ 30 µmol/L and ascites) were observed more frequently in the SP group than in the MP group, albeit without significance (7/59 vs. 1/34; p = 0.140). Severe RILD occurred in the SP group only (n = 2; p > 0.1). ALBI grade increased in 16.7% of patients in the MP group and in 27.1% of patients in the SP group (group difference not significant). Conclusions: At established dose levels, mild or severe RILD events proved rare in our cohort. RILD prophylaxis with PTX, UDCA and LMWH appears to have an independent positive impact on overall survival (OS) in patients with metastatic breast cancer and may reduce the frequency and severity of RILD.
Results of this study as well as pathophysiological considerations warrant further investigations of RILD prophylaxis presumably targeting combinations of anticoagulation (MP) and antiinflammation (SP) to increase dose prescriptions in radioembolization.

    Decomposing Acute Symptom Severity in Large Vessel Occlusion Stroke: Association With Multiparametric CT Imaging and Clinical Parameters

    Background and Purpose: Acute ischemic stroke of the anterior circulation due to large vessel occlusion (LVO) is a multifactorial process, which causes neurologic symptoms of varying degree. Our aim was to examine the impact of neuromorphologic and vascular correlates as well as clinical factors on acute symptom severity in LVO stroke. Methods: We selected LVO stroke patients with known onset time from a consecutive cohort that underwent multiparametric CT including non-contrast CT, CT angiography and CT perfusion (CTP) before thrombectomy. Software-based quantification was used to calculate CTP total ischemic and ischemic core volume. Symptom severity was assessed using the National Institutes of Health Stroke Scale (NIHSS) upon admission. Multivariable regression analysis was performed to determine independent associations of admission NIHSS with imaging and clinical parameters. Receiver operating characteristic (ROC) analyses were used to examine the performance of imaging parameters in classifying symptom severity. Results: We included 142 patients. Linear and ordinal regression analyses for NIHSS and NIHSS severity groups identified significant associations for total ischemic volume [β = 0.31, p = 0.01; odds ratio (OR) = 1.11, 95% confidence interval (CI): 1.02–1.19], clot burden score (β = −0.28, p = 0.01; OR = 0.76, 95% CI: 0.64–0.90) and age (β = 0.17, p = 0.04). No association was found for ischemic core volume, stroke side, collaterals and time from onset. Stroke topography according to the Alberta Stroke Program CT Score template did not display significant influence after correction for multiple comparisons. AUC for classification of the NIHSS threshold ≥6 by total ischemic volume was 0.81 (p < 0.001). Conclusions: We determined total ischemic volume, clot burden and age as relevant drivers for baseline NIHSS in acute LVO stroke. This suggests that not only mere volume but also degree of occlusion influences symptom severity.
Use of imaging parameters as a surrogate for baseline NIHSS reached limited performance, underlining the need for combined clinical and imaging assessment in acute stroke management.