10 research outputs found

    Automatic and standardized quality assurance of digital mammography and tomosynthesis with deep convolutional neural networks

    OBJECTIVES The aim of this study was to develop and validate a commercially available AI platform for the automatic determination of image quality in mammography and tomosynthesis, considering a standardized set of features. MATERIALS AND METHODS In this retrospective study, 11,733 mammograms and synthetic 2D reconstructions from tomosynthesis of 4200 patients from two institutions were analyzed by assessing the presence of seven features which impact image quality with regard to breast positioning. Deep learning was applied to train five dCNN models on features detecting the presence of anatomical landmarks and three dCNN models for localization features. The validity of the models was assessed by calculating the mean squared error on a test dataset and comparing the results to readings by experienced radiologists. RESULTS Accuracies of the dCNN models ranged between 93.0% for nipple visualization and 98.5% for depiction of the pectoralis muscle in the CC view. Calculations based on regression models allow for precise measurements of distances and angles of breast positioning on mammograms and synthetic 2D reconstructions from tomosynthesis. All models showed almost perfect agreement with human reading, with Cohen's kappa scores above 0.9. CONCLUSIONS An AI-based quality assessment system using a dCNN allows for precise, consistent and observer-independent rating of digital mammography and synthetic 2D reconstructions from tomosynthesis. Automation and standardization of quality assessment enable real-time feedback to technicians and radiologists that should reduce the number of inadequate examinations according to the PGMI (Perfect, Good, Moderate, Inadequate) criteria, reduce the number of recalls, and provide a dependable training platform for inexperienced technicians.
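The near-perfect model-versus-reader agreement reported above (Cohen's kappa above 0.9) can be computed from paired categorical ratings; a minimal Python sketch (function and variable names are illustrative, not from the study):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings,
    e.g. a dCNN's feature calls versus a radiologist's reading."""
    assert len(rater_a) == len(rater_b) and len(rater_a) > 0
    n = len(rater_a)
    # Observed proportion of agreement
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement expected from each rater's marginal frequencies
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

Values above roughly 0.8 are conventionally read as "almost perfect" agreement, which is the interpretation the abstract relies on.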

    Data extraction sheet


    Time Is Money: Considerations for Measuring the Radiological Reading Time

    Timestamps in the Radiology Information System (RIS) are a readily available and valuable source of information with increasing significance, among others, due to the current focus on the clinical impact of artificial intelligence applications. We aimed to evaluate timestamp-based radiological dictation time, introduce timestamp modeling techniques, and compare those with prospectively measured reporting. Dictation time was calculated from RIS timestamps between 05/2010 and 01/2021 at our institution (n = 108,310). We minimized contextual outliers by simulating the raw data by iteration (1000, vector size (µ/sd/λ) = 100/loop), assuming normally distributed reporting times. In addition, 329 reporting times were prospectively measured by two radiologists (1 and 4 years of experience). Altogether, 106,127 of 108,310 exams were included after simulation, with a mean dictation time of 16.62 min. Mean dictation time was 16.05 min for head CT (44,743/45,596), 15.84 min for chest CT (32,797/33,381), 17.92 min for abdominal CT (22,805/23,483), 10.96 min for foot CT (937/958), 9.14 min for lumbar spine CT (881/892), 8.83 min for shoulder CT (409/436), 8.83 min for wrist CT (1201/1322), and 39.20 min for polytrauma patients (2127/2242), without a significant difference from the prospective reporting times. In conclusion, timestamp analysis is useful for measuring current reporting practice, although body region and radiologist experience are confounders. This could aid cost–benefit assessments of workflow changes (e.g., AI implementation).
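A timestamp-derived dictation time with simple outlier trimming could be sketched as follows (a hedged Python sketch: the study's iterative simulation of normally distributed reporting times is replaced here by a plain z-score filter, and all names are illustrative):

```python
from datetime import datetime
from statistics import mean, stdev

def dictation_minutes(start_iso, end_iso):
    """Minutes between two RIS timestamps given as ISO-8601 strings."""
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    return (end - start).total_seconds() / 60.0

def trim_outliers(minutes, z=3.0):
    """Drop contextual outliers beyond z sample standard deviations;
    a simplified stand-in for the paper's iterative simulation step."""
    mu, sd = mean(minutes), stdev(minutes)
    return [m for m in minutes if abs(m - mu) <= z * sd]
```

A single-pass z-score filter is crude for heavily skewed data; iterative trimming (recomputing mean and SD after each pass) behaves closer to the simulation approach the abstract describes.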

    Considerations on Baseline Generation for Imaging AI Studies Illustrated on the CT-Based Prediction of Empyema and Outcome Assessment

    For AI-based classification tasks in computed tomography (CT), a reference standard for evaluating the clinical diagnostic accuracy of individual classes is essential. To enable the implementation of an AI tool in clinical practice, the raw data should be drawn from clinical routine data using state-of-the-art scanners, evaluated in a blinded manner, and verified with a reference test. Three hundred and thirty-five consecutive CTs, performed between 1 January 2016 and 1 January 2021, with reported pleural effusion and pathology reports from thoracocentesis or biopsy within 7 days of the CT, were retrospectively included. Two radiologists (4 and 10 PGY) blindly assessed the chest CTs for pleural CT features. If needed, consensus was achieved using an experienced radiologist’s opinion (29 PGY). In addition, diagnoses were extracted from written radiological reports. We analyzed these findings for a possible correlation with the following patient outcomes: mortality and median hospital stay. For AI prediction, we used an approach consisting of nnU-Net segmentation, PyRadiomics features, and a random forest model. Specificity and sensitivity for CT-based detection of empyema (n = 81 of n = 335 patients) were 90.94% (95% CI: 86.55–94.05%) and 72.84% (95% CI: 61.63–81.85%) across all effusions, with moderate to almost perfect interrater agreement for all pleural findings associated with empyema (Cohen’s kappa = 0.41–0.82). The highest accuracies were found for pleural enhancement and thickening, with 87.02% and 81.49%, respectively. For empyema prediction, AI achieved a specificity and sensitivity of 74.41% (95% CI: 68.50–79.57%) and 77.78% (95% CI: 66.91–85.96%), respectively. Empyema was associated with a longer hospital stay (median = 20 versus 14 days), and findings consistent with pleural carcinomatosis impacted mortality.
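Accuracy figures of the kind reported above (sensitivity, specificity, and their 95% confidence intervals) follow directly from a 2×2 confusion table; a minimal Python sketch using toy counts (the counts below are not those of the study):

```python
from math import sqrt

def sens_spec(tp, fp, tn, fn):
    """Sensitivity and specificity from confusion-table counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

def wilson_ci(k, n, z=1.96):
    """Wilson score 95% confidence interval for a proportion k/n,
    one common choice for binomial CIs on accuracy metrics."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half
```

Whether the paper used Wilson, Clopper–Pearson, or another interval is not stated in the abstract; the Wilson interval here is an assumption chosen for illustration.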

    Diagnostic Accuracy of Imaging Findings in Pleural Empyema: Systematic Review and Meta-Analysis

    Computed tomography (CT) diagnosis of empyema is challenging because the current literature features multiple overlapping pleural findings. We aimed to identify informative findings for structured reporting. Screening according to inclusion criteria (P: pleural empyema, I: CT, C: culture/Gram stain/pathology/pus, O: diagnostic accuracy measures), data extraction, and risk of bias assessment of studies published between 01-1980 and 10-2021 on PubMed, Embase, and Web of Science (WOS) were performed independently by two reviewers. CT findings with pooled diagnostic odds ratios (DOR) whose 95% confidence intervals did not include 1 were considered informative. Summary estimates of diagnostic accuracy for CT findings were calculated using a bivariate random-effects model, and sources of heterogeneity were evaluated. Ten studies with a total of 252 patients with and 846 without empyema were included. From 119 overlapping descriptors, five informative CT findings were identified: pleural enhancement, thickening, loculation, fat thickening, and fat stranding, with an AUC of 0.80 (hierarchical summary receiver operating characteristic, HSROC). Potential sources of heterogeneity were different thresholds, empyema prevalence, and study year.
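A study-level diagnostic odds ratio with a log-normal 95% CI, the quantity used above to flag informative findings, could be computed like this (a sketch with a 0.5 continuity correction; the pooling step via the bivariate random-effects model is not shown, and all names are illustrative):

```python
from math import exp, log, sqrt

def diagnostic_odds_ratio(tp, fp, fn, tn, z=1.96):
    """DOR = (tp * tn) / (fp * fn) with a 0.5 continuity correction
    and a log-normal 95% CI. A finding counts as 'informative' when
    the interval excludes 1."""
    # Continuity correction guards against zero cells
    tp, fp, fn, tn = (x + 0.5 for x in (tp, fp, fn, tn))
    dor = (tp * tn) / (fp * fn)
    # Standard error of log(DOR) from Woolf's formula
    se = sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lo, hi = exp(log(dor) - z * se), exp(log(dor) + z * se)
    return dor, lo, hi
```
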

    Towards More Structure: Comparing TNM Staging Completeness and Processing Time of Text-Based Reports versus Fully Segmented and Annotated PET/CT Data of Non-Small-Cell Lung Cancer

    Results of PET/CT examinations are communicated as text-based reports, which are frequently not fully structured. Incomplete or missing staging information can be a significant source of staging and treatment errors. We compared standard text-based reports to a manual full 3D-segmentation-based approach with respect to TNM completeness and processing time. TNM information was extracted retrospectively from 395 reports, and the RIS time stamps of these reports were analyzed. A total of 2995 lesions were manually segmented on the corresponding image data using a set of 41 classification labels (TNM features + location). Information content and processing time of reports and segmentations were compared using descriptive statistics and modelling. The TNM/UICC stage was mentioned explicitly in only 6% (n=22) of the text-based reports. In 22% (n=86), information was incomplete, most frequently affecting T stage (19%, n=74), followed by N stage (6%, n=22) and M stage (2%, n=9). Full NSCLC-lesion segmentation required a median time of 13.3 min, while the median of the shortest estimator of the text-based reporting time (R1) was 18.1 min (p=0.01). Tumor stage (UICC I/II: 5.2 min, UICC III/IV: 20.3 min, p<0.001), lesion size (p<0.001), and lesion count (n=1: 4.4 min, n=12: 37.2 min, p<0.001) correlated significantly with the segmentation time, but not with the estimators of text-based reporting time. Numerous text-based reports lack staging information. A segmentation-based reporting approach tailored to the staging task improves report quality with manageable processing time and helps to avoid erroneous therapy decisions based on incomplete reports. Furthermore, segmented data may be used for multimedia enhancement and automation.
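Checking a free-text report for explicit T, N, and M descriptors, the completeness notion used above, might look like this (a deliberately simplistic sketch; the regular expression and helper name are illustrative assumptions, and real staging extraction is far more involved):

```python
import re

def tnm_completeness(report_text):
    """Flag whether a free-text report mentions an explicit T, N, and M
    descriptor (e.g. 'cT2 N1 M0'); returns per-axis hits and the list
    of missing axes."""
    found = {
        axis: bool(re.search(rf"\b[cpyr]?{axis}(?:[0-4]|is|x)\b",
                             report_text, re.I))
        for axis in ("T", "N", "M")
    }
    missing = [axis for axis, ok in found.items() if not ok]
    return found, missing
```
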

    Automated Detection, Segmentation, and Classification of Pericardial Effusions on Chest CT Using a Deep Convolutional Neural Network

    Pericardial effusions (PEFs) are often missed on computed tomography (CT), which particularly affects the outcome of patients presenting with hemodynamic compromise. An automatic PEF detection, segmentation, and classification tool would expedite and improve CT-based PEF diagnosis. In total, 258 CTs with PEF (206 with simple PEF, 52 with hemopericardium) and 258 without PEF (134 contrast-enhanced, 124 non-enhanced) were identified using the radiology report (01/2016–01/2021). PEFs were manually 3D-segmented. A deep convolutional neural network (nnU-Net) was trained on 316 cases and separately tested on the remaining 200 and on 22 external post-mortem CTs. Inter-reader variability was tested on 40 CTs. PEF classification utilized the median Hounsfield unit from each prediction. Sensitivity and specificity for PEF detection were 97% (95% CI: 91.48–99.38%) and 100.00% (95% CI: 96.38–100.00%), and 89.74% and 83.61% for diagnosing hemopericardium (AUC 0.944, 95% CI: 0.904–0.984). Model performance (Dice coefficient: 0.75 ± 0.01) was non-inferior to inter-reader agreement (0.69 ± 0.02) and was unaffected by contrast administration or alternative chest pathology (p > 0.05). External dataset testing yielded similar results. Our model reliably detects, segments, and classifies PEF on CT in a complex dataset, potentially serving as an alert tool whilst enhancing report quality. The model and corresponding datasets are publicly available.
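The classification step described above, which uses the median Hounsfield unit of each predicted segmentation, could be sketched as follows (the 30 HU cut-off and the function name are illustrative assumptions, not the paper's actual threshold):

```python
import statistics

def classify_pef(hu_values, threshold=30.0):
    """Classify a predicted pericardial effusion from the Hounsfield
    units of its segmented voxels: blood products (hemopericardium)
    are denser than simple serous fluid, so a median HU above the
    cut-off suggests hemopericardium."""
    if not hu_values:
        return "no effusion"
    median_hu = statistics.median(hu_values)
    return "hemopericardium" if median_hu > threshold else "simple effusion"
```

Using the median rather than the mean makes the decision robust to a few mis-segmented voxels at the pericardial margin.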