GaNDLF: A Generally Nuanced Deep Learning Framework for Scalable End-to-End Clinical Workflows in Medical Imaging
Deep Learning (DL) has greatly highlighted the potential impact of optimized machine learning in both the scientific and clinical communities. The advent of open-source DL libraries from major industrial entities, such as TensorFlow (Google), PyTorch (Facebook), and MXNet (Apache), further contributes to DL's promise of democratizing computational analytics. However, increased technical and specialized background is required to develop DL algorithms, and the variability of implementation details hinders their reproducibility. To lower this barrier and make DL development, training, and inference more stable, reproducible, and scalable without requiring an extensive technical background, this manuscript proposes the Generally Nuanced Deep Learning Framework (GaNDLF). With built-in support for k-fold cross-validation, data augmentation, multiple modalities and output classes, and multi-GPU training, as well as the ability to work with both radiographic and histologic imaging, GaNDLF aims to provide an end-to-end solution for all DL-related tasks, to tackle problems in medical imaging, and to provide a robust application framework for deployment in clinical workflows.
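For readers unfamiliar with k-fold cross-validation, the splitting step can be sketched in a few lines of plain Python. This is a generic illustration only, not GaNDLF's implementation or API; the function name `k_fold_indices` and its parameters are hypothetical, and the sketch omits shuffling and stratification.

```python
def k_fold_indices(n_samples, k):
    """Split sample indices 0..n_samples-1 into k (train, validation) pairs.

    Hypothetical helper for illustration; not part of GaNDLF.
    Folds are contiguous and differ in size by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]

    folds = []
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        training = indices[:start] + indices[start + size:]
        folds.append((training, validation))
        start += size
    return folds


if __name__ == "__main__":
    for fold_id, (train, val) in enumerate(k_fold_indices(10, 3)):
        print(f"fold {fold_id}: train={train} validation={val}")
```

Each sample lands in exactly one validation fold, so every sample is used for validation once and for training k-1 times.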
Robust Image Population Based Stain Color Normalization: How Many Reference Slides Are Enough?
Histopathologic evaluation of Hematoxylin & Eosin (H&E) stained slides is essential for disease diagnosis, revealing tissue morphology, structure, and cellular composition. Variations in staining protocols and equipment result in images with color nonconformity. Although pathologists compensate for color variations, these disparities introduce inaccuracies in computational whole slide image (WSI) analysis, accentuating data domain shift and degrading generalization. Current state-of-the-art normalization methods employ a single WSI as reference, but selecting a single WSI representative of a complete WSI-cohort is infeasible, inadvertently introducing normalization bias. We seek the optimal number of slides to construct a more representative reference based on composite/aggregate of multiple H&E density histograms and stain-vectors, obtained from a randomly selected WSI population (WSI-Cohort-Subset). We utilized 1,864 IvyGAP WSIs as a WSI-cohort, and built 200 WSI-Cohort-Subsets varying in size (from 1 to 200 WSI-pairs) using randomly selected WSIs. The WSI-pairs' mean Wasserstein Distances and WSI-Cohort-Subsets' standard deviations were calculated. The Pareto Principle defined the optimal WSI-Cohort-Subset size. The WSI-cohort underwent structure-preserving color normalization using the optimal WSI-Cohort-Subset histogram and stain-vector aggregates. Numerous normalization permutations support WSI-Cohort-Subset aggregates as representative of a WSI-cohort through WSI-cohort CIELAB color space swift convergence, as a result of the law of large numbers and shown as a power law distribution. We show normalization at the optimal (Pareto Principle) WSI-Cohort-Subset size and corresponding CIELAB convergence: a) Quantitatively, using 500 WSI-cohorts; b) Quantitatively, using 8,100 WSI-regions; c) Qualitatively, using 30 cellular tumor normalization permutations. 
Aggregate-based stain normalization may contribute to increasing the robustness, reproducibility, and integrity of computational pathology.
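The abstract above compares density histograms via Wasserstein distances averaged over WSI pairs. As a minimal sketch of that idea, assuming two one-dimensional histograms defined over the same equally spaced bins, the 1-Wasserstein distance equals the bin width times the summed absolute difference of their cumulative distributions. The function names and the toy histograms below are hypothetical; the paper's exact histogram construction and distance formulation are not given in this listing.

```python
from itertools import accumulate


def normalize(hist):
    """Scale a histogram of non-negative counts so it sums to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total mass")
    return [h / total for h in hist]


def wasserstein_1d(hist_a, hist_b, bin_width=1.0):
    """1-Wasserstein distance between two histograms on identical bins.

    Assumption: both histograms share the same equally spaced bin edges.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    cdf_a = accumulate(normalize(hist_a))
    cdf_b = accumulate(normalize(hist_b))
    return bin_width * sum(abs(a - b) for a, b in zip(cdf_a, cdf_b))


def mean_pair_distance(pairs, bin_width=1.0):
    """Mean Wasserstein distance over a list of (hist_a, hist_b) pairs."""
    distances = [wasserstein_1d(a, b, bin_width) for a, b in pairs]
    return sum(distances) / len(distances)


if __name__ == "__main__":
    # Toy data only; not values from the study.
    pairs = [([1, 0, 0], [0, 0, 1]), ([1, 2, 3], [1, 2, 3])]
    print(mean_pair_distance(pairs))
```

For real data, `scipy.stats.wasserstein_distance` offers an equivalent computation on sample values with optional weights.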
Human pancreatic islet microRNAs implicated in diabetes and related traits by large-scale genetic analysis.
Genetic studies have identified ≥240 loci associated with the risk of type 2 diabetes (T2D), yet most of these loci lie in non-coding regions, masking the underlying molecular mechanisms. Recent studies investigating mRNA expression in human pancreatic islets have yielded important insights into the molecular drivers of normal islet function and T2D pathophysiology. However, similar studies investigating microRNA (miRNA) expression remain limited. Here, we present data from 63 individuals, the largest sequencing-based analysis of miRNA expression in human islets to date. We characterized the genetic regulation of miRNA expression by decomposing the expression of highly heritable miRNAs into cis- and trans-acting genetic components and mapping cis-acting loci associated with miRNA expression [miRNA-expression quantitative trait loci (eQTLs)]. We found i) 84 heritable miRNAs, primarily regulated by trans-acting genetic effects, and ii) 5 miRNA-eQTLs. We also used several different strategies to identify T2D-associated miRNAs. First, we colocalized miRNA-eQTLs with genetic loci associated with T2D and multiple glycemic traits, identifying one miRNA, miR-1908, that shares genetic signals for blood glucose and glycated hemoglobin (HbA1c). Next, we intersected miRNA seed regions and predicted target sites with credible set SNPs associated with T2D and glycemic traits and found 32 miRNAs that may have altered binding and function due to disrupted seed regions. Finally, we performed differential expression analysis and identified 14 miRNAs associated with T2D status (including miR-187-3p, miR-21-5p, miR-668, and miR-199b-5p) and 4 miRNAs associated with a polygenic score for HbA1c levels (miR-216a, miR-25, miR-30a-3p, and miR-30a-5p).
Single-cell transcriptomic profiling of human pancreatic islets reveals genes responsive to glucose exposure over 24 hours
Aims/hypothesis: Disruption of pancreatic islet function and glucose homeostasis can lead to the development of sustained hyperglycemia, beta cell glucotoxicity, and subsequently type 2 diabetes. In this study, we explored the effects of in vitro hyperglycemic conditions on human pancreatic islet gene expression across 24 hours in six pancreatic cell types: alpha, beta, gamma, delta, ductal, and acinar cells. We hypothesized that genes associated with hyperglycemic conditions may be relevant to the onset and progression of diabetes.
Methods: We exposed human pancreatic islets from two donors to low (2.8 mmol/l) and high (15.0 mmol/l) glucose concentrations over 24 hours in vitro. To assess the transcriptome, we performed single-cell RNA sequencing (scRNA-seq) at seven time points. We modeled time as both a discrete and continuous variable to determine momentary and longitudinal changes in transcription associated with islet time in culture or glucose exposure. Additionally, we integrated genomic features and genetic summary statistics to nominate candidate effector genes. For three of these genes, we functionally characterized the effect on insulin production and secretion using CRISPR interference to knock down gene expression in EndoC-βH1 cells, followed by a glucose-stimulated insulin secretion assay.
Results: Across all cell types, we identified 1,447 genes associated with time, 680 genes associated with glucose exposure, and 418 genes associated with interaction effects between time and glucose. By integrating these expression profiles with summary statistics from genetic association studies, we identified 2,449 candidate effector genes for type 2 diabetes, HbA1c, random blood glucose, and fasting blood glucose. Of these candidate effector genes, we showed that three—ERO1B, HNRNPA2B1, and RHOBTB3—exhibited an effect on glucose-stimulated insulin secretion and production in EndoC-βH1 cells.
Conclusions/interpretation: Our findings provide an in-depth characterization of the 24-hour transcriptomic response of human pancreatic islets to glucose exposure at single-cell resolution. By integrating differentially expressed genes with genetic signals for type 2 diabetes and glucose-related traits, we provide insights into the molecular mechanisms underlying glucose homeostasis. Finally, we provide functional evidence to support the role of three candidate effector genes in insulin secretion and production.
GaNDLF: the generally nuanced deep learning framework for scalable end-to-end clinical workflows
<h2>What's Changed</h2>
<ul>
<li>Version update for development by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/648</li>
<li>Added citation file by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/654</li>
<li>Added new optimizers by @AdiSir05 in https://github.com/mlcommons/GaNDLF/pull/646</li>
<li>Allow histology patches to be extracted without ground truth labels by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/657</li>
<li>Added metric calculation from CLI by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/663</li>
<li>Added a few segmentation metrics by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/661</li>
<li>Repository badges have been updated by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/667</li>
<li>Added instructions on creating new tutorials by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/664</li>
<li>Ensure parameters are built into the model dictionary by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/673</li>
<li>Calculating penalty after all compute objects are initialized by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/675</li>
<li>Add image similarity metrics by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/669</li>
<li>Allow the penalty and class weights in the config to be used by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/677</li>
<li>Added documentation related to OpenFL by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/683</li>
<li>Add MLCube wrapper for metrics API by @hasan7n in https://github.com/mlcommons/GaNDLF/pull/681</li>
<li>Adding mechanism to curate each extracted patch by @shubhaminnani in https://github.com/mlcommons/GaNDLF/pull/653</li>
<li>Added mask to SSIM function call by @FelixSteinbauer in https://github.com/mlcommons/GaNDLF/pull/685</li>
<li>Removed history file by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/690</li>
<li>Updated the metrics output by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/687</li>
<li>Update docker image name in workflow by @hasan7n in https://github.com/mlcommons/GaNDLF/pull/692</li>
<li>Fixed plotting function for final stats by @Geeks-Sid in https://github.com/mlcommons/GaNDLF/pull/691</li>
<li>Fixed import for collect stats by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/694</li>
<li>HED augmentations for digital pathology image by @Geeks-Sid in https://github.com/mlcommons/GaNDLF/pull/649</li>
<li>Added focal loss by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/696</li>
<li>Added a temporary fix for protobuf by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/702</li>
<li>Use torchmetric PSNR implementation and argument ordering by @FelixSteinbauer in https://github.com/mlcommons/GaNDLF/pull/693</li>
<li>Introduced percentile normalization for synthesis challenge metrics by @FelixSteinbauer in https://github.com/mlcommons/GaNDLF/pull/700</li>
<li>Upgrade openvino version to latest by @Geeks-Sid in https://github.com/mlcommons/GaNDLF/pull/699</li>
<li>Additional PSNR evaluations for the normalized synthesis case by @FelixSteinbauer in https://github.com/mlcommons/GaNDLF/pull/703</li>
<li>Improved formatting by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/707</li>
<li>Updated checkout version and test names for clarity by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/708</li>
<li>Updated default options for sgd by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/709</li>
<li>Added matthews correlation coefficient loss by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/706</li>
<li>Using tuples for PSNR datarange by @FelixSteinbauer in https://github.com/mlcommons/GaNDLF/pull/712</li>
<li>Deploy model entrypoint by @hasan7n in https://github.com/mlcommons/GaNDLF/pull/711</li>
<li>Added parameter to toggle NCC computation by @FelixSteinbauer in https://github.com/mlcommons/GaNDLF/pull/717</li>
<li>Adding second classification tutorial by @vavali08 in https://github.com/mlcommons/GaNDLF/pull/698</li>
<li>Minor code refactoring by @tosemml in https://github.com/mlcommons/GaNDLF/pull/719</li>
<li>Combined writing and temp file creation in a single step by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/720</li>
<li>Update usage information for anonymizer by @sanashah007 in https://github.com/mlcommons/GaNDLF/pull/716</li>
<li>Move unit testing data to the mlcommons storage by @sarthakpati in https://github.com/mlcommons/GaNDLF/pull/722</li>
<li>Fixed model saving when git repo not found by @scap3yvt in https://github.com/mlcommons/GaNDLF/pull/729</li>
<li>Removing dev from version for tagging by @scap3yvt in https://github.com/mlcommons/GaNDLF/pull/731</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li>@AdiSir05 made their first contribution in https://github.com/mlcommons/GaNDLF/pull/646</li>
<li>@shubhaminnani made their first contribution in https://github.com/mlcommons/GaNDLF/pull/653</li>
<li>@FelixSteinbauer made their first contribution in https://github.com/mlcommons/GaNDLF/pull/685</li>
<li>@vavali08 made their first contribution in https://github.com/mlcommons/GaNDLF/pull/698</li>
<li>@tosemml made their first contribution in https://github.com/mlcommons/GaNDLF/pull/719</li>
<li>@sanashah007 made their first contribution in https://github.com/mlcommons/GaNDLF/pull/716</li>
<li>@scap3yvt made their first contribution in https://github.com/mlcommons/GaNDLF/pull/729</li>
</ul>
<p><strong>Full Changelog</strong>: https://github.com/mlcommons/GaNDLF/compare/0.0.16...0.0.17</p>
<p>If you use this software, please cite it using this manuscript.</p>