Pre-train, Adapt and Detect: Multi-Task Adapter Tuning for Camouflaged Object Detection
Camouflaged object detection (COD), which aims to segment camouflaged objects
that exhibit patterns similar to the background, is a challenging task. Most
existing works are dedicated to establishing specialized modules to identify
camouflaged objects with complete and fine details, while the boundary cannot
be well located owing to the lack of object-related semantics. In this paper, we
propose a novel "pre-train, adapt and detect" paradigm to detect camouflaged
objects. By introducing a large pre-trained model, abundant knowledge learned
from massive multi-modal data can be directly transferred to COD. A lightweight
parallel adapter is inserted to adjust the features to suit the downstream
COD task. Extensive experiments on four challenging benchmark datasets
demonstrate that our method outperforms existing state-of-the-art COD models by
large margins. Moreover, we design a multi-task learning scheme for tuning the
adapter to exploit the shareable knowledge across different semantic classes.
Comprehensive experimental results show that the generalization ability of
our model can be substantially improved with multi-task adapter initialization
on source tasks and multi-task adaptation on target tasks.
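The abstract does not specify the adapter's internals, so the following is only a minimal sketch of what a generic parallel adapter might look like: a small bottleneck branch whose output is added to that of a frozen block. The names `ParallelAdapter`, `frozen_block`, and `adapted_block`, the `tanh` stand-in for the pre-trained layer, and the zero initialization of the up-projection are all assumptions made for illustration, not the authors' design.

```python
import numpy as np


class ParallelAdapter:
    """Hypothetical bottleneck adapter: down-project, ReLU, up-project."""

    def __init__(self, dim, bottleneck, scale=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Small random down-projection (trainable in a real setup).
        self.w_down = rng.normal(0.0, 0.02, size=(dim, bottleneck))
        # Zero-initialized up-projection (an assumption): the adapter
        # starts as a no-op, so the frozen model's behavior is preserved.
        self.w_up = np.zeros((bottleneck, dim))
        self.scale = scale

    def __call__(self, x):
        hidden = np.maximum(x @ self.w_down, 0.0)  # ReLU bottleneck
        return self.scale * (hidden @ self.w_up)


def frozen_block(x, w_frozen):
    """Stand-in for one frozen pre-trained layer (weights never updated)."""
    return np.tanh(x @ w_frozen)


def adapted_block(x, w_frozen, adapter):
    """Parallel placement: the adapter branch is added to the frozen output."""
    return frozen_block(x, w_frozen) + adapter(x)
```

Under these assumptions only `w_down` and `w_up` would be updated during tuning, which is what keeps such an adapter lightweight relative to the pre-trained model.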
Deep Learning for Single Image Super-Resolution: A Brief Review
Single image super-resolution (SISR) is a notoriously challenging ill-posed
problem, which aims to obtain a high-resolution (HR) output from one of its
low-resolution (LR) versions. To solve the SISR problem, powerful deep
learning algorithms have recently been employed and have achieved state-of-the-art
performance. In this survey, we review representative deep learning-based SISR
methods, and group them into two categories according to their major
contributions to two essential aspects of SISR: the exploration of efficient
neural network architectures for SISR, and the development of effective
optimization objectives for deep SISR learning. For each category, a baseline
is first established and several critical limitations of the baseline are
summarized. Then representative works on overcoming these limitations are
presented based on their original contents as well as our critical
understandings and analyses, and relevant comparisons are conducted from a
variety of perspectives. Finally, we conclude this review with some vital
current challenges and future trends in SISR leveraging deep learning
algorithms. Comment: Accepted by IEEE Transactions on Multimedia (TMM)
Scrutinizing and De-Biasing Intuitive Physics with Neural Stethoscopes
Visually predicting the stability of block towers is a popular task in the
domain of intuitive physics. While previous work focusses on prediction
accuracy, a one-dimensional performance measure, we provide a broader analysis
of the learned physical understanding of the final model and how the learning
process can be guided. To this end, we introduce neural stethoscopes as a
general purpose framework for quantifying the degree of importance of specific
factors of influence in deep neural networks as well as for actively promoting
and suppressing information as appropriate. In doing so, we unify concepts from
multitask learning as well as training with auxiliary and adversarial losses.
We apply neural stethoscopes to analyse the state-of-the-art neural network for
stability prediction. We show that the baseline model is susceptible to being
misled by incorrect visual cues. This leads to a performance breakdown to the
level of random guessing when training on scenarios where visual cues are
inversely correlated with stability. Using stethoscopes to promote meaningful
feature extraction increases performance from 51% to 90% prediction accuracy.
Conversely, when trained on an easy dataset where visual cues are positively
correlated with stability, the baseline model learns a bias leading to poor
performance on a harder dataset. Using an adversarial stethoscope, the network
is successfully de-biased, leading to a performance increase from 66% to 88%.
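One way to picture how a single framework could both promote and suppress information is a signed weight on the stethoscope's loss when its gradient reaches the shared encoder. The sketch below assumes such a weight, here called `lam`, with a positive value acting as an auxiliary loss and a negative value as an adversarial one. The function names and the sign convention are illustrative assumptions and are not stated in the abstract.

```python
import numpy as np


def encoder_update_direction(grad_main, grad_stethoscope, lam):
    """Combine gradients with respect to the shared encoder weights.

    grad_main:        gradient of the main-task loss
    grad_stethoscope: gradient of the stethoscope (side-task) loss
    lam:              hypothetical signed weight on the stethoscope loss
    """
    return np.asarray(grad_main, dtype=float) + lam * np.asarray(
        grad_stethoscope, dtype=float
    )


def stethoscope_mode(lam):
    """Name the regime implied by the sign of lam (assumed convention)."""
    if lam > 0:
        return "auxiliary"    # encoder is pushed to keep the side information
    if lam < 0:
        return "adversarial"  # encoder is pushed to discard it
    return "analytic"         # stethoscope only probes, encoder is unaffected
```

With `lam = 0` the encoder receives only the main-task gradient, so the stethoscope acts as a passive probe of what the encoder already represents.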
On orthogonal projections for dimension reduction and applications in augmented target loss functions for learning problems
The use of orthogonal projections on high-dimensional input and target data
in learning frameworks is studied. First, we investigate the relations between
two standard objectives in dimension reduction, preservation of variance and of
pairwise relative distances. Investigations of their asymptotic correlation as
well as numerical experiments show that a projection usually does not satisfy
both objectives at once. In a standard classification problem we determine
projections on the input data that balance the objectives and compare
subsequent results. Next, we extend our application of orthogonal projections
to deep learning tasks and introduce a general framework of augmented target
loss functions. These loss functions integrate additional information via
transformations and projections of the target data. In two supervised learning
problems, clinical image segmentation and music information classification, the
application of our proposed augmented target loss functions increases the
accuracy.
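As a rough illustration of the two ingredients named in this abstract, the sketch below builds a random orthogonal projector, measures how much variance it retains, and adds projected error terms to a base mean-squared error. The function names, the use of mean-squared error, the per-projector weights, and the restriction to linear projections of the targets are assumptions for this example and may differ from the paper's formulation.

```python
import numpy as np


def random_orthogonal_projector(dim, k, seed=0):
    """Return a dim x dim orthogonal projector onto a random k-dim subspace."""
    rng = np.random.default_rng(seed)
    # Reduced QR yields a dim x k matrix with orthonormal columns.
    q, _ = np.linalg.qr(rng.normal(size=(dim, k)))
    return q @ q.T  # symmetric and idempotent


def retained_variance(x, projector):
    """Fraction of the total variance of x that survives the projection."""
    centered = x - x.mean(axis=0)
    return np.sum((centered @ projector) ** 2) / np.sum(centered ** 2)


def augmented_target_loss(y_true, y_pred, projectors, weights):
    """Base MSE plus weighted MSE terms measured after projecting the error.

    Because the projections are linear, projecting the error equals the
    difference of the projected targets and projected predictions.
    """
    error = y_true - y_pred
    base = np.mean(error ** 2)
    extra = sum(
        w * np.mean((error @ p) ** 2) for p, w in zip(projectors, weights)
    )
    return base + extra
```

Setting every weight to zero recovers the plain base loss, which makes it easy to compare training with and without the additional projected terms.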
7T-guided super-resolution of 3T MRI
High-resolution MR images can depict rich details of brain anatomical structures and show subtle changes in longitudinal data. 7T MRI scanners can acquire MR images with higher resolution and better tissue contrast than the routine 3T MRI scanners. However, 7T MRI scanners are currently more expensive and less available in clinical and research centers. To this end, we propose a method to generate super-resolution 3T MRI that resembles 7T MRI, which we call a 7T-like MR image in this paper.