Examination of fully automated mammographic density measures using LIBRA and breast cancer risk in a cohort of 21,000 non-Hispanic white women
BACKGROUND: Breast density is strongly associated with breast cancer risk. Fully automated quantitative density assessment methods have recently been developed that could facilitate large-scale studies, although data on associations with long-term breast cancer risk are limited. We examined LIBRA assessments and breast cancer risk and compared results to prior assessments using Cumulus, an established computer-assisted method requiring manual thresholding.
METHODS: We conducted a cohort study among 21,150 non-Hispanic white female participants of the Research Program in Genes, Environment and Health of Kaiser Permanente Northern California who were 40-74 years at enrollment, followed for up to 10 years, and had archived processed screening mammograms acquired on Hologic or General Electric full-field digital mammography (FFDM) machines and prior Cumulus density assessments available for analysis. Dense area (DA), non-dense area (NDA), and percent density (PD) were assessed using LIBRA software. Cox regression was used to estimate hazard ratios (HRs) for breast cancer associated with DA, NDA and PD modeled continuously in standard deviation (SD) increments, adjusting for age, mammogram year, body mass index, parity, first-degree family history of breast cancer, and menopausal hormone use. We also examined differences by machine type and breast view.
RESULTS: The adjusted HRs for breast cancer associated with each SD increment of DA, NDA and PD were 1.36 (95% confidence interval, 1.18-1.57), 0.85 (0.77-0.93) and 1.44 (1.26-1.66) for LIBRA and 1.44 (1.33-1.55), 0.81 (0.74-0.89) and 1.54 (1.34-1.77) for Cumulus, respectively. LIBRA results were generally similar by machine type and breast view, although associations were strongest for Hologic machines and mediolateral oblique views. Results were also similar during the first 2 years, 2-5 years and 5-10 years after the baseline mammogram.
CONCLUSION: Associations with breast cancer risk were generally similar for LIBRA and Cumulus density measures and were sustained for up to 10 years. These findings support the suitability of fully automated LIBRA assessments on processed FFDM images for large-scale research on breast density and cancer risk.
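The quantities named in the methods above can be illustrated with a minimal sketch: percent density computed from dense and non-dense area, and a per-unit Cox coefficient rescaled to a hazard ratio per SD increment. The function names, the sample areas, and the coefficient are hypothetical and chosen only for illustration; this is not LIBRA's code or the authors' analysis.

```python
import math
import statistics


def percent_density(dense_area, non_dense_area):
    """Percent density = dense area / (dense + non-dense area) * 100."""
    total = dense_area + non_dense_area
    if total <= 0:
        raise ValueError("total breast area must be positive")
    return 100.0 * dense_area / total


def hr_per_sd(beta_per_unit, values):
    """Rescale a per-unit log-hazard coefficient to a hazard ratio per SD."""
    sd = statistics.stdev(values)  # sample standard deviation of the exposure
    return math.exp(beta_per_unit * sd)


# Hypothetical (dense area, non-dense area) pairs in cm^2.
areas = [(30.0, 120.0), (45.0, 90.0), (20.0, 180.0), (60.0, 60.0)]
pd_values = [percent_density(da, nda) for da, nda in areas]

# Hypothetical coefficient per percentage point of density.
print(hr_per_sd(0.02, pd_values))
```

Reporting the hazard ratio per SD, as the abstract does, puts DA, NDA and PD on a common scale so that their associations can be compared directly.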
Artificial intelligence in mammographic phenotyping of breast cancer risk: A narrative review
BACKGROUND: Improved breast cancer risk assessment models are needed to enable personalized screening strategies that achieve better harm-to-benefit ratio based on earlier detection and better breast cancer outcomes than existing screening guidelines. Computational mammographic phenotypes have demonstrated a promising role in breast cancer risk prediction. With the recent exponential growth of computational efficiency, the artificial intelligence (AI) revolution, driven by the introduction of deep learning, has expanded the utility of imaging in predictive models. Consequently, AI-based imaging-derived data has led to some of the most promising tools for precision breast cancer screening.
MAIN BODY: This review aims to synthesize the current state-of-the-art applications of AI in mammographic phenotyping of breast cancer risk. We discuss the fundamentals of AI and explore the computing advancements that have made AI-based image analysis essential in refining breast cancer risk assessment. Specifically, we discuss the use of data derived from digital mammography as well as digital breast tomosynthesis. Different aspects of breast cancer risk assessment are targeted including (a) robust and reproducible evaluations of breast density, a well-established breast cancer risk factor, (b) assessment of a woman's inherent breast cancer risk, and (c) identification of women who are likely to be diagnosed with breast cancers after a negative or routine screen due to masking or the rapid and aggressive growth of a tumor. Lastly, we discuss AI challenges unique to the computational analysis of mammographic imaging as well as future directions for this promising research field.
CONCLUSIONS: We provide a useful reference for AI researchers investigating image-based breast cancer risk assessment while indicating key priorities and challenges that, if properly addressed, could accelerate the implementation of AI-assisted risk stratification to further refine and individualize breast cancer screening strategies.
External validation of a mammography-derived AI-based risk model in a U.S. breast cancer screening cohort of White and Black women
Despite the demonstrated potential of artificial intelligence (AI) in breast cancer risk assessment for personalizing screening recommendations, further validation is required regarding AI model bias and generalizability. We performed external validation on a U.S. screening cohort of a mammography-derived AI breast cancer risk model originally developed for European screening cohorts. We retrospectively identified 176 breast cancers with exams 3 months to 2 years prior to cancer diagnosis and a random sample of 4963 controls from women with at least one-year negative follow-up. A risk score for each woman was calculated via the AI risk model. Age-adjusted areas under the ROC curves (AUCs) were estimated for the entire cohort and separately for White and Black women. The Gail 5-year risk model was also evaluated for comparison. The overall AUC was 0.68 (95% CIs 0.64-0.72) for all women, 0.67 (0.61-0.72) for White women, and 0.70 (0.65-0.76) for Black women. The AI risk model significantly outperformed the Gail risk model for all wome
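For readers unfamiliar with the AUC reported above, the following sketch computes an unadjusted AUC as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. It is a simplification: the study reports age-adjusted AUCs, which this sketch does not implement, and the function name and scores are hypothetical.

```python
def auc(case_scores, control_scores):
    """Unadjusted AUC: P(case score > control score), ties counted as 0.5."""
    if not case_scores or not control_scores:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical risk scores, not data from the study.
cases = [0.8, 0.6, 0.4]
controls = [0.5, 0.3, 0.2, 0.4]
print(auc(cases, controls))  # 10.5 winning pairs out of 12 -> 0.875
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.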
Genome-wide association study of breast density among women of African ancestry
Breast density, the amount of fibroglandular versus fatty tissue in the breast, is a strong breast cancer risk factor. Understanding genetic factors associated with breast density may help in clarifying mechanisms by which breast density increases cancer risk. To date, 50 genetic loci have been associated with breast density; however, these studies were performed among predominantly European ancestry populations. We utilized a cohort of women aged 40-85 years who underwent screening mammography and had genetic information available from the Penn Medicine BioBank to conduct a Genome-Wide Association Study (GWAS) of breast density among 1323 women of African ancestry. For each mammogram, the publicly available LIBRA software was used to quantify dense area and area percent density. We identified 34 significant loci associated with dense area and area percent density, with the strongest signals i
Performance Gaps of Artificial Intelligence Models Screening Mammography -- Towards Fair and Interpretable Models
Even though deep learning models for abnormality classification can perform
well in screening mammography, the demographic and imaging characteristics
associated with increased risk of failure for abnormality classification in
screening mammograms remain unclear. This retrospective study used data from
the Emory BrEast Imaging Dataset (EMBED) including mammograms from 115,931
patients imaged at Emory University Healthcare between 2013 and 2020. Clinical
and imaging data includes Breast Imaging Reporting and Data System (BI-RADS)
assessment, region of interest coordinates for abnormalities, imaging features,
pathologic outcomes, and patient demographics. Deep learning models including
InceptionV3, VGG16, ResNet50V2, and ResNet152V2 were developed to distinguish
between patches of abnormal tissue and randomly selected patches of normal
tissue from the screening mammograms. The distributions of the training,
validation and test sets are 29,144 (55.6%) patches of 10,678 (54.2%) patients,
9,910 (18.9%) patches of 3,609 (18.3%) patients, and 13,390 (25.5%) patches of
5,404 (27.5%) patients. We assessed model performance overall and within
subgroups defined by age, race, pathologic outcome, and imaging characteristics
to evaluate reasons for misclassifications. On the test set, a ResNet152V2
model trained to classify normal versus abnormal tissue patches achieved an
accuracy of 92.6% (95% CI=92.0-93.2%) and an area under the receiver operating
characteristic curve of 0.975 (95% CI=0.972-0.978). Imaging characteristics
associated with higher misclassifications of images include higher tissue
densities (risk ratio [RR]=1.649; p=.010, BI-RADS density C and RR=2.026;
p=.003, BI-RADS density D), and presence of architectural distortion (RR=1.026;
p<.001). Small but statistically significant differences in performance were
observed by age, race, pathologic outcome, and other imaging features (p<.001).
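The risk ratios quoted in this abstract compare the misclassification rate in one subgroup with the rate in a reference subgroup. A minimal sketch of that calculation follows; the counts are hypothetical and the function is illustrative only, since the abstract does not describe how the authors computed their estimates or p-values.

```python
def risk_ratio(errors_group, n_group, errors_reference, n_reference):
    """Ratio of the misclassification rate in a subgroup to that in a reference group."""
    if n_group <= 0 or n_reference <= 0:
        raise ValueError("group sizes must be positive")
    if errors_reference == 0:
        raise ValueError("reference group has no errors; ratio is undefined")
    risk_group = errors_group / n_group
    risk_reference = errors_reference / n_reference
    return risk_group / risk_reference


# Hypothetical counts: 60 of 500 patches misclassified in a higher-density
# subgroup versus 40 of 500 in the reference subgroup.
rr = risk_ratio(60, 500, 40, 500)
print(round(rr, 3))
```

A ratio above 1 means misclassification is more frequent in the subgroup than in the reference group.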
Impact of multi-source data augmentation on performance of convolutional neural networks for abnormality classification in mammography
INTRODUCTION: To date, most mammography-related AI models have been trained using either film or digital mammogram datasets with little overlap. We investigated whether combining film and digital mammography during training helps or hinders modern models designed for use on digital mammograms.
METHODS: To this end, a total of six binary classifiers were trained for comparison. The first three classifiers were trained using images only from the Emory Breast Imaging Dataset (EMBED) using ResNet50, ResNet101, and ResNet152 architectures. The next three classifiers were trained using images from the EMBED, Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM), and Digital Database for Screening Mammography (DDSM) datasets. All six models were tested only on digital mammograms from EMBED.
RESULTS: The performance degradation of the customized ResNet models was statistically significant overall when the EMBED dataset was augmented with CBIS-DDSM/DDSM. While performance degradation was observed in all racial subgroups, some subgroups showed a more severe drop than others.
DISCUSSION: The degradation may potentially be due to (1) a mismatch in features between film-based and digital mammograms; (2) a mismatch in pathologic and radiological information. In conclusion, use of both film and digital mammography during training may hinder modern models designed for breast cancer screening. Caution is required when combining film-based and digital mammograms or when utilizing pathologic and radiological information simultaneously.
GaNDLF: A Generally Nuanced Deep Learning Framework for Scalable End-to-End Clinical Workflows in Medical Imaging
Deep Learning (DL) has greatly highlighted the potential impact of optimized machine learning in both the scientific and clinical communities. The advent of open-source DL libraries from major industrial entities, such as TensorFlow (Google), PyTorch (Facebook), and MXNet (Apache), further contributes to DL promises on the democratization of computational analytics. However, increased technical and specialized background is required to develop DL algorithms, and the variability of implementation details hinders their reproducibility. Towards lowering the barrier and making the mechanism of DL development, training, and inference more stable, reproducible, and scalable, without requiring an extensive technical background, this manuscript proposes the Generally Nuanced Deep Learning Framework (GaNDLF). With built-in support for k-fold cross-validation, data augmentation, multiple modalities and output classes, and multi-GPU training, as well as the ability to work with both radiographic and histologic imaging, GaNDLF aims to provide an end-to-end solution for all DL-related tasks, to tackle problems in medical imaging and provide a robust application framework for deployment in clinical workflows.
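To make the k-fold cross-validation mentioned above concrete, here is a generic sketch of how sample indices can be partitioned into k folds, with each fold serving once as the validation set. This is plain Python written for illustration; it does not use or reproduce GaNDLF's API, and the function name and parameters are hypothetical.

```python
import random


def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


# Each sample appears in exactly one validation fold across the k rounds.
for train, validation in k_fold_splits(n_samples=10, k=5):
    print(len(train), len(validation))
```

In medical imaging, splits are usually made at the patient level rather than the image level, so that images from one patient never appear in both training and validation sets; the indices here would then stand for patients.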