Regression Concept Vectors for Bidirectional Explanations in Histopathology
Explanations for deep neural network predictions in terms of domain-related
concepts can be valuable in medical applications, where justifications are
important for confidence in decision-making. In this work, we propose a
methodology to exploit continuous concept measures as Regression Concept
Vectors (RCVs) in the activation space of a layer. The directional derivative
of the decision function along the RCVs represents the network sensitivity to
increasing values of a given concept measure. When applied to breast cancer
grading, nuclei texture emerges as a relevant concept in the detection of tumor
tissue in breast lymph node samples. We evaluate score robustness and
consistency by statistical analysis.
Comment: 9 pages, 3 figures, 3 tables
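The RCV construction described in this abstract can be illustrated with a minimal sketch: regress the continuous concept measure on the flattened activations of a layer, take the (unit-normalized) regression weights as the concept direction, and score sensitivity as the directional derivative of the decision function along it. This is an assumption-laden illustration, not the authors' code; the function names are hypothetical.

```python
import numpy as np

def regression_concept_vector(activations, concept_measures):
    """Fit a least-squares regression from layer activations
    (n_samples, n_features) to a continuous concept measure;
    the RCV is the weight vector, taken as a unit direction."""
    A = np.hstack([activations, np.ones((activations.shape[0], 1))])  # bias column
    w, *_ = np.linalg.lstsq(A, concept_measures, rcond=None)
    rcv = w[:-1]                                  # drop the bias weight
    return rcv / np.linalg.norm(rcv)

def sensitivity_score(decision_gradient, rcv):
    """Directional derivative of the decision function along the RCV:
    positive values mean the prediction grows with the concept measure."""
    return float(np.dot(decision_gradient, rcv))
```

A positive score for, say, a nuclei-texture measure would indicate the network's tumor score increases with that concept, matching the bidirectional reading described above.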
Disentangling Neuron Representations with Concept Vectors
Mechanistic interpretability aims to understand how models store
representations by breaking down neural networks into interpretable units.
However, the occurrence of polysemantic neurons, or neurons that respond to
multiple unrelated features, makes interpreting individual neurons challenging.
This has led to the search for meaningful vectors, known as concept vectors, in
activation space instead of individual neurons. The main contribution of this
paper is a method to disentangle polysemantic neurons into concept vectors
encapsulating distinct features. Our method can search for fine-grained
concepts according to the user's desired level of concept separation. The
analysis shows that polysemantic neurons can be disentangled into directions
consisting of linear combinations of neurons. Our evaluations show that the
concept vectors found encode coherent, human-understandable features.
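One simple way to make the disentangling idea concrete (a hypothetical sketch, not the paper's method): cluster the activation patterns of inputs that strongly excite a polysemantic neuron; each cluster centroid is then a candidate concept vector expressed as a linear combination of neurons. The desired level of concept separation maps to the number of clusters.

```python
import numpy as np

def disentangle_neuron(activations, n_concepts, n_iter=25):
    """Toy disentangling: spherical k-means over the activation patterns of a
    neuron's top-activating inputs. Centroids are unit vectors in activation
    space, i.e. linear combinations of neurons. Illustrative only."""
    X = activations / np.linalg.norm(activations, axis=1, keepdims=True)
    # farthest-point initialisation: start from the first sample, then
    # repeatedly add the sample least similar to the chosen centroids
    centroids = [X[0]]
    for _ in range(n_concepts - 1):
        sims = np.max(np.stack([X @ c for c in centroids]), axis=0)
        centroids.append(X[np.argmin(sims)])
    centroids = np.stack(centroids)
    for _ in range(n_iter):
        labels = np.argmax(X @ centroids.T, axis=1)   # cosine assignment
        for k in range(n_concepts):
            members = X[labels == k]
            if len(members):
                c = members.mean(axis=0)
                centroids[k] = c / np.linalg.norm(c)
    return centroids
```

Increasing `n_concepts` yields finer-grained candidate directions, mirroring the user-controlled separation level mentioned in the abstract.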
Learning Interpretable Microscopic Features of Tumor by Multi-task Adversarial CNNs Improves Generalization
Adopting Convolutional Neural Networks (CNNs) in the daily routine of primary
diagnosis requires not only near-perfect precision, but also a sufficient
degree of generalization to data acquisition shifts and transparency. Existing
CNN models act as black boxes, giving physicians no assurance that important
diagnostic features are used by the model. Building on successful existing
techniques such as multi-task learning, domain adversarial training
and concept-based interpretability, this paper addresses the challenge of
introducing diagnostic factors in the training objectives. Here we show that
our architecture, by learning end-to-end an uncertainty-based weighting
combination of multi-task and adversarial losses, is encouraged to focus on
pathology features such as density and pleomorphism of nuclei, e.g. variations
in size and appearance, while discarding misleading features such as staining
differences. Our results on breast lymph node tissue show significantly
improved generalization in the detection of tumorous tissue, with best average
AUC 0.89 (0.01) against the baseline AUC 0.86 (0.005). By applying the
interpretability technique of linearly probing intermediate representations, we
also demonstrate that interpretable pathology features such as nuclei density
are learned by the proposed CNN architecture, confirming the increased
transparency of this model. This result is a starting point towards building
interpretable multi-task architectures that are robust to data heterogeneity.
Our code is available at https://bit.ly/356yQ2u.
Comment: 21 pages, 4 figures
Breast Histopathology with High-Performance Computing and Deep Learning
The increasingly intensive collection of digitized images of tumor tissue over the last decade has made histopathology a demanding application in terms of computational and storage resources. With images containing billions of pixels, the need for optimizing and adapting histopathology to large-scale data analysis is compelling. This paper presents a modular pipeline with three independent layers for the detection of tumorous regions in digital specimens of breast lymph nodes with deep learning models. Our pipeline can be deployed either on local machines or on high-performance computing resources with a containerized approach. The need for expertise in high-performance computing is removed by the self-sufficient structure of Docker containers, while ample room for customization remains in terms of deep learning models and hyperparameter optimization. We show that by deploying the software layers on different infrastructures we optimize both the data preprocessing and the network training times, further increasing the scalability of the application to datasets of approximately 43 million images. The code is open source and available on GitHub.
Uncovering Unique Concept Vectors through Latent Space Decomposition
Interpreting the inner workings of deep learning models is crucial for
establishing trust and ensuring model safety. Concept-based explanations have
emerged as an approach more interpretable than feature attribution estimates
such as pixel saliency. However, defining the concepts for the
interpretability analysis biases the explanations toward the user's
expectations of those concepts. To address this, we propose a novel post-hoc
unsupervised method that automatically uncovers the concepts learned by deep
models during training. By decomposing the latent space of a layer into singular
vectors and refining them by unsupervised clustering, we uncover concept
vectors aligned with directions of high variance that are relevant to the model
prediction, and that point to semantically distinct concepts. Our extensive
experiments reveal that the majority of our concepts are readily understandable
to humans, exhibit coherency, and bear relevance to the task at hand. Moreover,
we showcase the practical utility of our method in dataset exploration, where
our concept vectors successfully identify outlier training samples affected by
various confounding factors. This exploration technique is remarkably
versatile across data types and model architectures and will facilitate the
identification of biases and the discovery of sources of error within training
data.
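The first step of the decomposition described above can be sketched directly: take the singular value decomposition of the (centered) activation matrix, so the top right singular vectors give directions of highest variance in the layer's latent space. The subsequent clustering-based refinement is omitted here, and the function name is hypothetical.

```python
import numpy as np

def candidate_concept_vectors(activations, k):
    """Top-k right singular vectors of the centered activation matrix
    (n_samples, n_features): the highest-variance directions in the
    latent space, serving as candidate concept vectors."""
    X = activations - activations.mean(axis=0, keepdims=True)
    _, s, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:k], s[:k]   # (k, n_features) directions and their scales
```

Projecting training samples onto these directions is also how the dataset-exploration use case works: samples with extreme projections along a concept direction are candidate outliers.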
Novel structural-scale uncertainty measures and error retention curves: application to multiple sclerosis
This paper focuses on the uncertainty estimation for white matter lesions
(WML) segmentation in magnetic resonance imaging (MRI). On one side,
voxel-scale segmentation errors cause the erroneous delineation of the lesions;
on the other side, lesion-scale detection errors lead to wrong lesion counts.
Both of these factors are clinically relevant for the assessment of multiple
sclerosis patients. This work aims to compare the ability of different voxel-
and lesion-scale uncertainty measures to capture errors related to segmentation
and lesion detection, respectively. Our main contributions are (i) proposing
new measures of lesion-scale uncertainty that do not utilise voxel-scale
uncertainties; (ii) extending an error retention curves analysis framework for
evaluation of lesion-scale uncertainty measures. Our results obtained on the
multi-center testing set of 58 patients demonstrate that the proposed
lesion-scale measure achieves the best performance among the analysed measures.
All code implementations are provided at
https://github.com/NataliiaMolch/MS_WML_uncs
Comment: 4 pages, 2 figures, 3 tables, ISBI preprint
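An error retention curve, the evaluation framework extended in this work, can be sketched as follows (an illustrative form, assuming a generic per-sample error; the paper's lesion-scale variant is more involved): sort predictions by uncertainty, keep only the most-certain fraction, and track the average error as that fraction grows.

```python
import numpy as np

def error_retention_curve(errors, uncertainties, fractions):
    """At each retention fraction, keep the most-certain predictions and
    average their per-sample error. An informative uncertainty measure
    concentrates errors among the discarded (uncertain) predictions,
    giving a low curve at small fractions."""
    order = np.argsort(uncertainties)              # most certain first
    errors = np.asarray(errors, dtype=float)[order]
    curve = []
    for f in fractions:
        n = max(1, int(round(f * len(errors))))
        curve.append(errors[:n].mean())
    return np.array(curve)
```

The area under such a curve is a common scalar summary for comparing uncertainty measures, which is how voxel- and lesion-scale measures can be ranked against each other.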
PROCESS Data Infrastructure and Data Services
Due to energy limitations and high operational costs, it is likely that exascale computing will not be achieved by one or two datacentres but will require many more. A simple calculation aggregating the computation power of the 2017 Top500 supercomputers reaches only 418 petaflops. Rescale, for example, claims 1.4 exaflops of peak computing power from an infrastructure of 8 million servers spread across 30 datacentres. Any proposed solution to the exascale computing challenge has to take these facts into consideration and by design should aim to support the use of geographically distributed, and likely independent, datacentres. It should also consider, whenever possible, co-allocating storage with computation, as it would take about 3 years to transfer 1 exabyte over a dedicated 100 Gb Ethernet connection. This means we have to be smart about managing data that is increasingly geographically dispersed and spread across different administrative domains. As the natural setting of the PROCESS project is to operate within the European Research Infrastructure and serve the European research communities facing exascale challenges, it is important that the PROCESS architecture and solutions are well positioned within the European computing and data management landscape, namely PRACE, EGI, and EUDAT. In this paper we propose a scalable and programmable data infrastructure that is easy to deploy and can be tuned to support various data-intensive scientific applications.
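The transfer-time claim above is easy to check with back-of-envelope arithmetic (assuming decimal prefixes and a fully saturated link, with no protocol overhead):

```python
def transfer_years(bytes_total, link_bits_per_s):
    """Years needed to move `bytes_total` at a sustained `link_bits_per_s`."""
    seconds = bytes_total * 8 / link_bits_per_s
    return seconds / (365 * 24 * 3600)

# 1 exabyte (1e18 bytes) over a dedicated 100 Gb/s Ethernet link
years = transfer_years(1e18, 100e9)   # about 2.5 years at the theoretical peak
```

At a sustained 100 Gb/s this comes to roughly two and a half years; with realistic overheads and contention, the "3 years" figure in the abstract is a reasonable round number.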
Structural-Based Uncertainty in Deep Learning Across Anatomical Scales: Analysis in White Matter Lesion Segmentation
This paper explores uncertainty quantification (UQ) as an indicator of the
trustworthiness of automated deep-learning (DL) tools in the context of white
matter lesion (WML) segmentation from magnetic resonance imaging (MRI) scans of
multiple sclerosis (MS) patients. Our study focuses on two principal aspects of
uncertainty in structured output segmentation tasks. Firstly, we postulate that
a good uncertainty measure should indicate predictions likely to be incorrect
with high uncertainty values. Second, we investigate the merit of quantifying
uncertainty at different anatomical scales (voxel, lesion, or patient). We
hypothesize that uncertainty at each scale is related to specific types of
errors. Our study aims to confirm this relationship by conducting separate
analyses for in-domain and out-of-domain settings. Our primary methodological
contributions are (i) the development of novel measures for quantifying
uncertainty at lesion and patient scales, derived from structural prediction
discrepancies, and (ii) the extension of an error retention curve analysis
framework to facilitate the evaluation of UQ performance at both lesion and
patient scales. The results from a multi-centric MRI dataset of 172 patients
demonstrate that our proposed measures more effectively capture model errors at
the lesion and patient scales compared to measures that average voxel-scale
uncertainty values. We provide the UQ protocols code at
https://github.com/Medical-Image-Analysis-Laboratory/MS_WML_uncs.
Comment: Preprint submitted to the journal
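One plausible instantiation of a lesion-scale measure derived from structural prediction discrepancies (hypothetical, not necessarily the paper's definition): given segmentations from several ensemble members or MC samples, score a lesion by the fraction of members that fail to reproduce it, using an IoU threshold as the detection criterion.

```python
import numpy as np

def lesion_detection_uncertainty(member_masks, lesion_mask, iou_thr=0.25):
    """Lesion-scale uncertainty from structural disagreement: the fraction of
    ensemble segmentations whose overlap with a given lesion region falls
    below an IoU threshold. 0 = all members agree the lesion exists,
    1 = no member reproduces it. Threshold choice is an assumption."""
    detections = 0
    for m in member_masks:
        inter = np.logical_and(m, lesion_mask).sum()
        union = np.logical_or(m, lesion_mask).sum()
        if union and inter / union >= iou_thr:
            detections += 1
    return 1.0 - detections / len(member_masks)
```

Unlike averaging voxel-scale uncertainties inside the lesion, this kind of measure directly targets detection errors (false positive lesions, wrong lesion counts), which is the behaviour the abstract reports at the lesion scale.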
Reference Exascale Architecture (Extended Version)
While political commitments to building exascale systems have been made, turning these systems into platforms for a wide range of exascale applications faces several technical, organisational and skills-related challenges. The key technical challenges relate to the availability of data. While the first exascale machines are likely to be built within a single site, the input data is in many cases impossible to store within a single site. Alongside handling extremely large amounts of data, an exascale system has to process data from different sources, support accelerated computing, handle a high volume of requests per day, minimize the size of data flows, and be extensible as both data volumes and parallel requests continue to grow. These technical challenges are addressed by the general reference exascale architecture. It is divided into three main blocks: a virtualization layer, a distributed virtual file system, and a manager of computing resources. Its main property is modularity, achieved by containerization at two levels: 1) application containers - containerization of scientific workflows; 2) micro-infrastructure - containerization of the service-oriented infrastructure for extremely large data. The paper also presents an instantiation of the reference architecture - the architecture of the PROCESS project (PROviding Computing solutions for ExaScale ChallengeS) - and discusses its relation to the reference exascale architecture. The PROCESS architecture has been used as an exascale platform within various exascale pilot applications. This paper also presents performance modelling of the exascale platform, together with its validation.