Bilevel Fast Scene Adaptation for Low-Light Image Enhancement
Enhancing images captured in low-light scenes is a challenging task that has
attracted wide attention in computer vision. Mainstream learning-based methods
acquire an enhancement model by learning the data distribution of specific
scenes, which leads to poor adaptability (or even failure) on real-world
scenarios that were never encountered during training. The main obstacle lies
in modeling the distribution discrepancy across different scenes.
To remedy this, we first explore relationships between diverse low-light scenes
based on statistical analysis, i.e., the network parameters of the encoder
trained in different data distributions are close. We introduce the bilevel
paradigm to model the above latent correspondence from the perspective of
hyperparameter optimization. A bilevel learning framework is constructed to
endow the scene-irrelevant generality of the encoder towards diverse scenes
(i.e., freezing the encoder in the adaptation and testing phases). Further, we
define a reinforced bilevel learning framework to provide a meta-initialization
for the scene-specific decoder and further improve visual quality. Moreover, to
improve practicability, we establish a Retinex-induced architecture with
adaptive denoising and apply the proposed learning framework to learn its
parameters with both supervised and unsupervised training losses. Extensive
experiments on multiple datasets verify the adaptability and competitive
performance of our method against existing state-of-the-art
works. The code and datasets will be available at
https://github.com/vis-opt-group/BL
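The frozen-encoder bilevel scheme can be pictured with a short first-order
sketch; the toy modules, L1 loss, and learning rates below are illustrative
assumptions rather than the authors' released code:

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy encoder/decoder standing in for the paper's Retinex-induced
# architecture; layer sizes here are illustrative assumptions.
encoder = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
decoder = nn.Sequential(nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid())
enc_opt = torch.optim.Adam(encoder.parameters(), lr=1e-4)
dec_opt = torch.optim.Adam(decoder.parameters(), lr=1e-4)

def meta_step(scene_batches, inner_lr=1e-2):
    """One first-order bilevel iteration over a batch of scenes, each
    given as ((support_in, support_gt), (query_in, query_gt))."""
    enc_opt.zero_grad()
    dec_opt.zero_grad()
    for (x_s, y_s), (x_q, y_q) in scene_batches:
        # Lower level: adapt a scene-specific decoder copy on support
        # data while the shared encoder stays frozen (features detached).
        fast = copy.deepcopy(decoder)
        inner = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        loss_s = F.l1_loss(fast(encoder(x_s).detach()), y_s)
        inner.zero_grad()
        loss_s.backward()
        inner.step()
        fast.zero_grad()
        # Upper level: the adapted decoder's query loss trains the shared
        # encoder and the decoder's meta-initialization.
        loss_q = F.l1_loss(fast(encoder(x_q)), y_q)
        loss_q.backward()
        for p, fp in zip(decoder.parameters(), fast.parameters()):
            p.grad = fp.grad if p.grad is None else p.grad + fp.grad
    enc_opt.step()
    dec_opt.step()
```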
A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts
Machine learning methods strive to acquire a robust model during training
that can generalize well to test samples, even under distribution shifts.
However, these methods often suffer from a performance drop due to unknown test
distributions. Test-time adaptation (TTA), an emerging paradigm, has the
potential to adapt a pre-trained model to unlabeled data during testing, before
making predictions. Recent progress in this paradigm highlights the significant
benefits of utilizing unlabeled data for training self-adapted models prior to
inference. In this survey, we divide TTA into several distinct categories,
namely, test-time (source-free) domain adaptation, test-time batch adaptation,
online test-time adaptation, and test-time prior adaptation. For each category,
we provide a comprehensive taxonomy of advanced algorithms, followed by a
discussion of different learning scenarios. Furthermore, we analyze relevant
applications of TTA and discuss open challenges and promising areas for future
research. A comprehensive list of TTA methods can be found at
\url{https://github.com/tim-learn/awesome-test-time-adaptation}.
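As a concrete flavor of the online test-time adaptation category, the sketch
below adapts only normalization parameters by entropy minimization, in the
spirit of Tent; the function name and hyperparameters are assumptions for
illustration, not an API from the surveyed repository:

```python
import torch
import torch.nn as nn

def tta_entropy_step(model, x, lr=1e-3):
    """One online TTA step: minimize prediction entropy on an unlabeled
    test batch, updating only normalization affine parameters."""
    params = [p for m in model.modules()
              if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d))
              for p in (m.weight, m.bias) if p is not None]
    opt = torch.optim.SGD(params, lr=lr)
    probs = model(x).softmax(dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1).mean()
    opt.zero_grad()
    entropy.backward()
    opt.step()
    with torch.no_grad():
        return model(x).argmax(dim=1)  # predict after adaptation
```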
Efficient Unified Demosaicing for Bayer and Non-Bayer Patterned Image Sensors
As the physical size of recent CMOS image sensors (CIS) gets smaller, the
latest mobile cameras are adopting unique non-Bayer color filter array (CFA)
patterns (e.g., Quad, Nona, QxQ), which consist of homogeneous color units with
adjacent pixels. These non-Bayer sensors are superior to conventional Bayer CFA
thanks to their changeable pixel-bin sizes for different light conditions but
may introduce visual artifacts during demosaicing due to their inherent pixel
pattern structures and sensor hardware characteristics. Previous demosaicing
methods have primarily focused on Bayer CFA, necessitating distinct
reconstruction methods for non-Bayer patterned CIS with various CFA modes under
different lighting conditions. In this work, we propose an efficient unified
demosaicing method that can be applied to both conventional Bayer RAW and
various non-Bayer CFAs' RAW data in different operation modes. Our Knowledge
Learning-based demosaicing model for Adaptive Patterns, namely KLAP, makes only
about 1% of the network's filters CFA-adaptive key filters for each CFA, yet
still effectively demosaics all the CFAs, yielding performance comparable to
large-scale models. Furthermore, by employing meta-learning
during inference (KLAP-M), our model is able to eliminate unknown
sensor-generic artifacts in real RAW data, effectively bridging the gap between
synthetic images and real sensor RAW. Our KLAP and KLAP-M methods achieved
state-of-the-art demosaicing performance in both synthetic and real RAW data of
Bayer and non-Bayer CFAs.
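The shared-versus-key-filter idea can be sketched as a layer in which most
channels are shared across CFAs while a small CFA-specific set is switched in
at run time; the class name, channel split, and CFA indexing below are
assumptions for illustration, not the KLAP implementation:

```python
import torch
import torch.nn as nn

class CFAAdaptiveConv(nn.Module):
    """Illustrative layer: most filters are shared across all CFAs,
    while a tiny set of 'key' filters is specific to each CFA mode."""
    def __init__(self, in_ch=64, out_ch=64, n_cfas=4, n_key=1):
        super().__init__()
        self.shared = nn.Conv2d(in_ch, out_ch - n_key, 3, padding=1)
        # One small CFA-specific conv per supported pattern.
        self.key = nn.ModuleList(
            nn.Conv2d(in_ch, n_key, 3, padding=1) for _ in range(n_cfas))

    def forward(self, x, cfa_id):
        # Concatenate shared features with the selected key filters.
        return torch.cat([self.shared(x), self.key[cfa_id](x)], dim=1)

# Usage: route Bayer (0), Quad (1), Nona (2), QxQ (3) through one model.
layer = CFAAdaptiveConv()
y = layer(torch.randn(1, 64, 32, 32), cfa_id=1)
```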
Deep learning based domain adaptation for mitochondria segmentation on EM volumes.
BACKGROUND AND OBJECTIVE: Accurate segmentation of electron microscopy (EM) volumes of the brain is essential to characterize neuronal structures at the cell or organelle level. While supervised deep learning methods have led to major breakthroughs in that direction in recent years, they usually require large amounts of annotated training data and perform poorly on data acquired under different experimental and imaging conditions. This is known as the domain adaptation problem: models that learned from one sample distribution (the source domain) struggle to maintain their performance on samples drawn from a different distribution (the target domain). In this work, we address the complex case of deep learning based domain adaptation for mitochondria segmentation across EM datasets from different tissues and species.
METHODS: We present three unsupervised domain adaptation strategies to improve mitochondria segmentation in the target domain based on (1) state-of-the-art style transfer between images of both domains; (2) self-supervised learning to pre-train a model using unlabeled source and target images, and then fine-tune it only with the source labels; and (3) multi-task neural network architectures trained end-to-end with both labeled and unlabeled images. Additionally, to ensure good generalization in our models, we propose a new training stopping criterion based on morphological priors obtained exclusively in the source domain. The code and its documentation are publicly available at https://github.com/danifranco/EM_domain_adaptation.
RESULTS: We carried out all possible cross-dataset experiments using three publicly available EM datasets. We evaluated our proposed strategies and those of others based on the mitochondria semantic labels predicted on the target datasets.
CONCLUSIONS: The methods introduced here outperform the baseline methods and compare favorably to the state of the art. In the absence of validation labels, monitoring our proposed morphology-based metric is an intuitive and effective way to stop the training process and select, on average, optimal models.
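A morphology-prior stopping signal of this kind might look like the sketch
below, which compares connected-component areas of target predictions against
a mean area measured on source-domain labels; the specific statistic and
tolerance are assumptions, not the metric defined in the paper:

```python
import numpy as np
from scipy import ndimage

def morphology_agreement(pred_mask, prior_mean_area, tol=0.5):
    """Fraction of predicted components whose area lies within a
    tolerance band around the source-domain mean component area."""
    labeled, n = ndimage.label(pred_mask)
    if n == 0:
        return 0.0
    areas = ndimage.sum(pred_mask, labeled, index=np.arange(1, n + 1))
    within = np.abs(areas - prior_mean_area) <= tol * prior_mean_area
    return float(within.mean())

# Example: one 100-pixel component against a 100-pixel source prior.
mask = np.zeros((64, 64), dtype=bool)
mask[10:20, 10:20] = True
print(morphology_agreement(mask, prior_mean_area=100.0))  # -> 1.0
# In training, one would stop when this score plateaus or degrades
# across checkpoints, since no target validation labels exist.
```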
Learning Distortion Invariant Representation for Image Restoration from A Causality Perspective
In recent years, we have witnessed the great advancement of deep neural
networks (DNNs) in image restoration. However, a critical limitation is that
they cannot generalize well to real-world degradations with different degrees
or types. In this paper, we are the first to propose a novel training strategy
for image restoration from the causality perspective, to improve the
generalization ability of DNNs for unknown degradations. Our method, termed
Distortion Invariant representation Learning (DIL), treats each distortion type
and degree as one specific confounder, and learns the distortion-invariant
representation by eliminating the harmful confounding effect of each
degradation. We derive our DIL with the back-door criterion in causality by
modeling the interventions of different distortions from the optimization
perspective. Particularly, we introduce counterfactual distortion augmentation
to simulate the virtual distortion types and degrees as the confounders. Then,
we instantiate the intervention of each distortion with a virtual model
updating based on corresponding distorted images, and eliminate them from the
meta-learning perspective. Extensive experiments demonstrate the effectiveness
of our DIL on the generalization capability for unseen distortion types and
degrees. Our code will be available at
https://github.com/lixinustc/Causal-IR-DIL.
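The intervention-by-virtual-update idea can be made concrete with a heavily
simplified first-order sketch: adapt a virtual copy under one counterfactual
distortion, then penalize its loss under a different distortion. The pairing
of distortions, L1 loss, and update rule are assumptions for illustration,
not the released DIL code:

```python
import copy
import random
import torch
import torch.nn.functional as F

def dil_step(model, opt, clean, distortions, inner_lr=1e-3):
    """One simplified DIL-flavored step. `distortions` is a list of
    callables that synthesize degraded inputs from clean images."""
    opt.zero_grad()
    d_train, d_eval = random.sample(distortions, 2)
    # Intervention: a virtual model update under one distortion.
    virtual = copy.deepcopy(model)
    v_opt = torch.optim.SGD(virtual.parameters(), lr=inner_lr)
    v_loss = F.l1_loss(virtual(d_train(clean)), clean)
    v_opt.zero_grad()
    v_loss.backward()
    v_opt.step()
    virtual.zero_grad()
    # Evaluate the updated model under a different distortion, so the
    # aggregate gradient favors distortion-invariant representations.
    q_loss = F.l1_loss(virtual(d_eval(clean)), clean)
    q_loss.backward()
    # First-order: copy the virtual model's query gradients back.
    for p, vp in zip(model.parameters(), virtual.parameters()):
        p.grad = vp.grad.clone() if vp.grad is not None else None
    opt.step()
```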
Meta-Processing: A robust framework for multi-task seismic processing
Machine learning-based seismic processing models are typically trained
separately to perform specific seismic processing tasks (SPTs) and, as a
result, require large amounts of training data. However, preparing training
data sets is not trivial, especially for supervised learning (SL).
Nevertheless, seismic data of different types and from different regions
generally share common features, such as their sinusoidal nature and
geometric texture. To learn the
shared features, and thus, quickly adapt to various SPTs, we develop a unified
paradigm for neural network-based seismic processing, called Meta-Processing,
that uses limited training data for meta learning a common network
initialization, which offers universal adaptability features. The proposed
Meta-Processing framework consists of two stages: meta-training and
meta-testing. In the meta-training stage, each SPT is treated as a separate
task and the training dataset is divided into support and query sets. Unlike
conventional SL methods, here, the neural network (NN) parameters are updated
by a bilevel gradient descent from the support set to the query set, iterating
through all tasks. In the meta-testing stage, we also utilize limited data to
fine-tune the optimized NN parameters in an SL fashion to conduct various SPTs,
such as denoising, interpolation, ground-roll attenuation, image enhancement,
and velocity estimation, aiming to converge quickly to ideal performance.
Comprehensive numerical examples are performed to evaluate the performance of
Meta-Processing on both synthetic and field data. The results demonstrate that
our method significantly improves the convergence speed and prediction accuracy
of the NN.
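The support-to-query bilevel update described above can be sketched as a
first-order MAML-style meta-training loop; the task format, MSE loss, and
optimizers below are assumptions rather than the authors' implementation:

```python
import copy
import torch
import torch.nn.functional as F

def meta_train_step(net, meta_opt, spt_tasks, inner_lr=1e-3):
    """One Meta-Processing-style meta-training iteration: each seismic
    processing task (denoising, interpolation, ...) contributes a
    support-set update followed by a query-set loss on the adapted net."""
    meta_opt.zero_grad()
    for (x_s, y_s), (x_q, y_q) in spt_tasks:
        fast = copy.deepcopy(net)                  # task-specific copy
        inner = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        inner.zero_grad()
        F.mse_loss(fast(x_s), y_s).backward()
        inner.step()                               # lower-level step
        fast.zero_grad()
        F.mse_loss(fast(x_q), y_q).backward()      # upper-level loss
        for p, fp in zip(net.parameters(), fast.parameters()):
            p.grad = fp.grad if p.grad is None else p.grad + fp.grad
    meta_opt.step()  # update the shared meta-initialization
# At meta-testing time, the meta-initialization is fine-tuned on limited
# labeled data of the target SPT in a standard supervised fashion.
```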