Towards domain-invariant Self-Supervised Learning with Batch Styles Standardization
In Self-Supervised Learning (SSL), models are typically pretrained,
fine-tuned, and evaluated on the same domains. However, they tend to perform
poorly when evaluated on unseen domains, a challenge that Unsupervised Domain
Generalization (UDG) seeks to address. Current UDG methods rely on domain
labels, which are often challenging to collect, and domain-specific
architectures that lack scalability when confronted with numerous domains,
making the current methodology impractical and rigid. Inspired by
contrastive-based UDG methods that mitigate spurious correlations by
restricting comparisons to examples from the same domain, we hypothesize that
eliminating style variability within a batch could provide a more convenient
and flexible way to reduce spurious correlations without requiring domain
labels. To verify this hypothesis, we introduce Batch Styles Standardization
(BSS), a relatively simple yet powerful Fourier-based method to standardize the
style of images in a batch, specifically designed for integration with SSL
methods to tackle UDG. Combining BSS with existing SSL methods offers serious
advantages over prior UDG methods: (1) It eliminates the need for domain labels
or domain-specific network components to enhance domain-invariance in SSL
representations, and (2) it offers flexibility, as BSS can be seamlessly integrated with both contrastive and non-contrastive SSL methods.
Experiments on several UDG datasets demonstrate that it significantly improves downstream task performance on unseen domains, often outperforming or rivaling UDG methods. Finally, this work clarifies the underlying mechanisms contributing to BSS's effectiveness in improving domain-invariance in SSL representations and performance on unseen domains.
Comment: Accepted at ICLR 202
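The abstract does not spell out the Fourier operation, but one common way such a standardization can be realized is to replace the low-frequency amplitude spectrum of every image in a batch with a shared reference while keeping each image's phase. The NumPy sketch below illustrates that idea; the choice of the batch-mean amplitude as reference and the beta window size are assumptions for illustration, not BSS's exact procedure:

import numpy as np

def standardize_batch_styles(batch, beta=0.1):
    # Hedged sketch: align the low-frequency Fourier amplitudes of all images
    # in a batch to a single reference, keeping each image's phase (content).
    # batch: float array of shape (N, H, W, C); beta sets the size of the
    # low-frequency window whose amplitude is standardized (an assumption).
    n, h, w, c = batch.shape
    fft = np.fft.fft2(batch, axes=(1, 2))
    amplitude, phase = np.abs(fft), np.angle(fft)

    # Reference style: here the batch-mean amplitude (one possible choice).
    ref_amplitude = amplitude.mean(axis=0, keepdims=True)

    # Replace a centered low-frequency square of each amplitude spectrum.
    amp_shift = np.fft.fftshift(amplitude, axes=(1, 2))
    ref_shift = np.fft.fftshift(ref_amplitude, axes=(1, 2))
    bh, bw = int(h * beta), int(w * beta)
    ch, cw = h // 2, w // 2
    amp_shift[:, ch - bh:ch + bh, cw - bw:cw + bw, :] = \
        ref_shift[:, ch - bh:ch + bh, cw - bw:cw + bw, :]
    amplitude = np.fft.ifftshift(amp_shift, axes=(1, 2))

    # Recombine the standardized amplitude with the original phase.
    standardized = np.fft.ifft2(amplitude * np.exp(1j * phase), axes=(1, 2))
    return np.real(standardized)

Because the phase spectrum carries most of the semantic content, replacing only the amplitude alters image style while leaving structure largely intact.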
Building detection in very high resolution multispectral data with deep learning features
Automated man-made object detection and building extraction from single satellite images is still one of the most challenging tasks for various urban planning and monitoring engineering applications. To this end, in this paper we propose an automated building detection framework for very high resolution remote sensing data based on deep convolutional neural networks. The core of the developed method is a supervised classification procedure employing a very large training dataset. An MRF model is then responsible for obtaining the optimal labels for the detection of scene buildings. The experimental results and the performed quantitative validation indicate the quite promising potential of the developed approach.
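The abstract leaves the MRF formulation unspecified; as a rough illustration of how per-pixel CNN class probabilities can be regularized with an MRF prior, the sketch below applies a simple Potts model optimized with iterated conditional modes (ICM). The unary definition, pairwise weight, and use of ICM are assumptions, not the authors' exact model:

import numpy as np

def icm_smooth_labels(class_probs, pairwise_weight=1.0, n_iters=5):
    # Hedged sketch: refine per-pixel class probabilities with a Potts-model
    # MRF prior, optimized by iterated conditional modes (ICM).
    # class_probs: array (H, W, K) of per-pixel class probabilities.
    # Returns an (H, W) integer label map.
    h, w, k = class_probs.shape
    unary = -np.log(class_probs + 1e-8)      # data term
    labels = unary.argmin(axis=-1)           # initial labels = most probable class

    for _ in range(n_iters):
        for i in range(h):
            for j in range(w):
                # Collect 4-connected neighbour labels.
                neigh = []
                if i > 0: neigh.append(labels[i - 1, j])
                if i < h - 1: neigh.append(labels[i + 1, j])
                if j > 0: neigh.append(labels[i, j - 1])
                if j < w - 1: neigh.append(labels[i, j + 1])
                neigh = np.array(neigh)
                # Potts pairwise term: penalty for disagreeing with neighbours.
                pairwise = np.array([(neigh != c).sum() for c in range(k)])
                labels[i, j] = np.argmin(unary[i, j] + pairwise_weight * pairwise)
    return labels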
Multi-center anatomical segmentation with heterogeneous labels via landmark-based models
Learning anatomical segmentation from heterogeneous labels in multi-center
datasets is a common situation encountered in clinical scenarios, where certain
anatomical structures are only annotated in images coming from particular
medical centers, but not in the full database. Here we first show how state-of-the-art pixel-level segmentation models fail when naively trained on this task due to domain memorization issues and conflicting labels. We then propose
to adopt HybridGNet, a landmark-based segmentation model which learns the
available anatomical structures using graph-based representations. By analyzing
the latent space learned by both models, we show that HybridGNet naturally
learns more domain-invariant feature representations, and provide empirical
evidence in the context of chest X-ray multiclass segmentation. We hope these
insights will shed light on the training of deep learning models with
heterogeneous labels from public and multi-center datasets.
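To make the heterogeneous-label setting concrete, the sketch below shows one way a segmentation loss can be restricted to the structures actually annotated for each sample. It illustrates the training scenario described above, not HybridGNet itself, and all tensor shapes and names are assumptions:

import torch
import torch.nn.functional as F

def heterogeneous_label_loss(logits, targets, annotated_mask):
    # Hedged sketch of the multi-center, heterogeneous-label setting: each
    # sample only contributes to the loss for the anatomical structures that
    # were actually annotated at its medical center.
    # logits:         (B, K, H, W) per-structure segmentation logits.
    # targets:        (B, K, H, W) float binary masks (zeros where unannotated).
    # annotated_mask: (B, K) float, 1 if structure k is annotated for sample b.
    per_pixel = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    per_structure = per_pixel.mean(dim=(2, 3))          # (B, K)
    masked = per_structure * annotated_mask             # drop unannotated structures
    return masked.sum() / annotated_mask.sum().clamp(min=1)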
Structured State Space Models for Multiple Instance Learning in Digital Pathology
Multiple instance learning is an ideal mode of analysis for histopathology
data, where vast whole slide images are typically annotated with a single
global label. In such cases, a whole slide image is modelled as a collection of
tissue patches to be aggregated and classified. Common models for performing
this classification include recurrent neural networks and transformers.
Although powerful compression algorithms, such as deep pre-trained neural
networks, are used to reduce the dimensionality of each patch, the sequences
arising from whole slide images remain excessively long, routinely containing
tens of thousands of patches. Structured state space models are an emerging
alternative for sequence modelling, specifically designed for the efficient
modelling of long sequences. These models invoke an optimal projection of an
input sequence into memory units that compress the entire sequence. In this
paper, we propose the use of state space models as a multiple instance learner for a variety of problems in digital pathology. Across experiments in metastasis detection, cancer subtyping, mutation classification, and multitask learning, we demonstrate the competitiveness of this new class of models with existing state-of-the-art approaches. Our code is available at
https://github.com/MICS-Lab/s4_digital_pathology
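The repository above contains the actual S4 implementation; purely as a toy illustration of the underlying idea, the sketch below compresses a long sequence of pre-extracted patch embeddings into a fixed-size state with a diagonal linear recurrence and classifies the final state. It omits the structured (HiPPO-based) parameterization and convolutional evaluation that make S4 efficient, and all layer sizes are assumptions:

import torch
import torch.nn as nn

class ToyStateSpaceMIL(nn.Module):
    # Hedged sketch: compress a long sequence of patch embeddings into a
    # fixed-size state with a simple diagonal linear recurrence, then classify.
    # This is a toy stand-in for structured state space (S4) layers.

    def __init__(self, embed_dim, state_dim, n_classes):
        super().__init__()
        self.log_decay = nn.Parameter(torch.randn(state_dim))  # diagonal A (via decay)
        self.input_proj = nn.Linear(embed_dim, state_dim)      # B
        self.classifier = nn.Linear(state_dim, n_classes)      # output head

    def forward(self, patches):
        # patches: (batch, seq_len, embed_dim) pre-extracted patch embeddings.
        batch, seq_len, _ = patches.shape
        decay = torch.sigmoid(self.log_decay)                  # keep |A| < 1 for stability
        u = self.input_proj(patches)                           # (batch, seq_len, state_dim)
        state = torch.zeros(batch, u.shape[-1], device=patches.device)
        for t in range(seq_len):                                # x_t = A x_{t-1} + B u_t
            state = decay * state + u[:, t]
        return self.classifier(state)                           # bag-level logits

# Example: logits = ToyStateSpaceMIL(1024, 128, 2)(torch.randn(1, 5000, 1024))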
On the detection of Out-Of-Distribution samples in Multiple Instance Learning
The deployment of machine learning solutions in real-world scenarios often
involves addressing the challenge of out-of-distribution (OOD) detection. While
significant efforts have been devoted to OOD detection in classical supervised
settings, the context of weakly supervised learning, particularly the Multiple
Instance Learning (MIL) framework, remains under-explored. In this study, we
tackle this challenge by adapting post-hoc OOD detection methods to the MIL
setting while introducing a novel benchmark specifically designed to assess OOD
detection performance in weakly supervised scenarios. Extensive experiments
based on diverse public datasets do not reveal a single method with a clear
advantage over the others. Although DICE emerges as the best-performing method
overall, it exhibits significant shortcomings on some datasets, emphasizing the
complexity of this under-explored and challenging topic. Our findings shed
light on the complex nature of OOD detection under the MIL framework,
emphasizing the importance of developing novel, robust, and reliable methods
that can generalize effectively in a weakly supervised context. The code for
the paper is available here: https://github.com/loic-lb/OOD_MIL
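As a concrete illustration of post-hoc OOD scoring on MIL outputs, the sketch below computes two standard scores (maximum softmax probability and the energy score) from bag-level logits produced by a trained model. DICE, the best-performing method in the study, additionally involves classifier-weight sparsification and is not reproduced here:

import torch

def ood_scores(bag_logits, temperature=1.0):
    # Hedged sketch of two standard post-hoc OOD scores computed from
    # bag-level logits (higher = more in-distribution).
    # Maximum softmax probability (MSP).
    msp = torch.softmax(bag_logits / temperature, dim=-1).max(dim=-1).values
    # Energy score: temperature-scaled log-sum-exp of the logits.
    energy = temperature * torch.logsumexp(bag_logits / temperature, dim=-1)
    return msp, energy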
SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology
Introducing interpretability and reasoning into Multiple Instance Learning
(MIL) methods for Whole Slide Image (WSI) analysis is challenging, given the
complexity of gigapixel slides. Traditionally, MIL interpretability is limited
to identifying salient regions deemed pertinent for downstream tasks, offering
little insight to the end-user (pathologist) regarding the rationale behind
these selections. To address this, we propose Self-Interpretable MIL (SI-MIL),
a method intrinsically designed for interpretability from the very outset.
SI-MIL employs a deep MIL framework to guide an interpretable branch grounded
on handcrafted pathological features, facilitating linear predictions. Beyond
identifying salient regions, SI-MIL uniquely provides feature-level
interpretations rooted in pathological insights for WSIs. Notably, SI-MIL, with
its linear prediction constraints, challenges the prevalent myth of an
inevitable trade-off between model interpretability and performance,
demonstrating competitive results compared to state-of-the-art methods on
WSI-level prediction tasks across three cancer types. In addition, we thoroughly benchmark the local- and global-interpretability of SI-MIL in terms of statistical analysis, a domain-expert study, and desiderata of interpretability, namely user-friendliness and faithfulness.
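A rough sketch of the kind of design described above is given below: a deep attention branch weights patches, the prediction itself is a linear model over attention-pooled handcrafted features, and per-feature contributions can be read off directly. The specific layers and shapes are assumptions, not SI-MIL's exact architecture:

import torch
import torch.nn as nn

class LinearInterpretableMIL(nn.Module):
    # Hedged sketch: a deep MIL attention branch weights patches, while the
    # prediction is a linear model over handcrafted patch features, so each
    # feature's contribution (weight x pooled value) is directly readable.

    def __init__(self, deep_dim, handcrafted_dim, n_classes):
        super().__init__()
        self.attention = nn.Sequential(                 # deep branch -> patch attention
            nn.Linear(deep_dim, 128), nn.Tanh(), nn.Linear(128, 1)
        )
        self.linear_head = nn.Linear(handcrafted_dim, n_classes)  # interpretable branch

    def forward(self, deep_feats, handcrafted_feats):
        # deep_feats: (n_patches, deep_dim); handcrafted_feats: (n_patches, handcrafted_dim)
        attn = torch.softmax(self.attention(deep_feats), dim=0)   # (n_patches, 1)
        pooled = (attn * handcrafted_feats).sum(dim=0)            # attention-weighted pooling
        logits = self.linear_head(pooled)
        contributions = self.linear_head.weight * pooled          # per-feature, per-class contributions
        return logits, contributions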
Learn2Reg: comprehensive multi-task medical image registration challenge, dataset and evaluation in the era of deep learning
Image registration is a fundamental medical image analysis task, and a wide variety of approaches have been proposed. However, only a few studies have comprehensively compared medical image registration approaches on a wide range of clinically relevant tasks. This limits the development of registration methods, the adoption of research advances into practice, and fair benchmarking across competing approaches. The Learn2Reg challenge addresses these limitations by providing a multi-task medical image registration dataset for comprehensive characterisation of deformable registration algorithms. A continuous evaluation will be possible at https://learn2reg.grand-challenge.org. Learn2Reg covers a wide range of anatomies (brain, abdomen, and thorax), modalities (ultrasound, CT, MR), availability of annotations, as well as intra- and inter-patient registration evaluation. We established an easily accessible framework for training and validation of 3D registration methods, which enabled the compilation of results of over 65 individual method submissions from more than 20 unique teams. We used a complementary set of metrics, including robustness, accuracy, plausibility, and runtime, enabling unique insight into the current state of the art of medical image registration. This paper describes datasets, tasks, evaluation methods and results of the challenge, as well as results of further analysis of transferability to new datasets, the importance of label supervision, and resulting bias. While no single approach worked best across all tasks, many methodological aspects could be identified that push medical image registration to a new state of the art. Furthermore, we demystified the common belief that conventional registration methods have to be much slower than deep-learning-based methods.
Automatic Descriptor-Based Co-Registration of Frame Hyperspectral Data
Frame hyperspectral sensors, in contrast to push-broom or line-scanning ones, produce hyperspectral datasets with, in general, better geometry but with unregistered spectral bands. Because the bands are acquired at different instants, and due to platform motion and movements (UAVs, aircraft, etc.), every spectral band is displaced and acquired with a different geometry. The automatic and accurate registration of hyperspectral datasets from frame sensors therefore remains a challenge. Powerful local feature descriptors, when computed over the spectrum, fail to extract enough correspondences to successfully complete the registration procedure. To this end, we propose a generic and automated framework which decomposes the problem and enables the efficient computation of a sufficient number of accurate correspondences over the given spectrum, without using any ancillary data (e.g., from GPS/IMU). First, the spectral bands are divided into spectral groups according to their wavelength. The spectral borders of each group are not strict, and their formulation allows certain overlaps. Spectral variance and proximity determine the suitability of every spectral band to act as a reference during the registration procedure. The proposed decomposition allows the descriptor and the robust estimation process to deliver numerous inliers. The search space of possible solutions is effectively narrowed by sorting and selecting the optimal spectral bands, which, in an unsupervised manner, can quickly recover the hypercube's geometry. The developed approach has been qualitatively and quantitatively evaluated on six different datasets obtained by frame sensors onboard aerial platforms and UAVs. Experimental results appear promising.
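For a single pair of bands, the descriptor-plus-robust-estimation step can be illustrated with a standard OpenCV pipeline (ORB features, ratio test, RANSAC homography), as sketched below; the paper's spectral grouping and reference-band selection logic are not reproduced, and parameter values are assumptions:

import cv2
import numpy as np

def register_band(ref_band, moving_band, min_matches=10):
    # Hedged sketch: descriptor-based co-registration of one spectral band to a
    # reference band (ORB features + ratio test + RANSAC homography).
    # ref_band, moving_band: uint8 single-channel images of the same scene.
    orb = cv2.ORB_create(nfeatures=5000)
    kp_ref, des_ref = orb.detectAndCompute(ref_band, None)
    kp_mov, des_mov = orb.detectAndCompute(moving_band, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des_mov, des_ref, k=2)
    # Lowe ratio test to keep distinctive correspondences only.
    good = [m[0] for m in matches if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]
    if len(good) < min_matches:
        raise RuntimeError("Not enough correspondences between bands")

    src = np.float32([kp_mov[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_ref[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

    h, w = ref_band.shape
    return cv2.warpPerspective(moving_band, H, (w, h)), H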