436 research outputs found
RestoreFormer++: Towards Real-World Blind Face Restoration from Undegraded Key-Value Pairs
Blind face restoration aims at recovering high-quality face images from those
with unknown degradations. Current algorithms mainly introduce priors to
complement high-quality details and achieve impressive progress. However, most
of these algorithms ignore abundant contextual information in the face and its
interplay with the priors, leading to sub-optimal performance. Moreover, they
pay less attention to the gap between the synthetic and real-world scenarios,
limiting the robustness and generalization to real-world applications. In this
work, we propose RestoreFormer++, which on the one hand introduces
fully-spatial attention mechanisms to model the contextual information and the
interplay with the priors, and on the other hand, explores an extending
degrading model to help generate more realistic degraded face images to
alleviate the synthetic-to-real-world gap. Compared with current algorithms,
RestoreFormer++ has several crucial benefits. First, instead of using a
multi-head self-attention mechanism like the traditional visual transformer, we
introduce multi-head cross-attention over multi-scale features to fully explore
spatial interactions between corrupted information and high-quality priors. In
this way, it can facilitate RestoreFormer++ to restore face images with higher
realness and fidelity. Second, in contrast to the recognition-oriented
dictionary, we learn a reconstruction-oriented dictionary as priors, which
contains more diverse high-quality facial details and better accords with the
restoration target. Third, we introduce an extending degrading model that
contains more realistic degraded scenarios for training data synthesizing, and
thus helps to enhance the robustness and generalization of our RestoreFormer++
model. Extensive experiments show that RestoreFormer++ outperforms
state-of-the-art algorithms on both synthetic and real-world datasets.Comment: Submitted to TPAMI. An extension of RestoreForme
Hierarchical Disentanglement-Alignment Network for Robust SAR Vehicle Recognition
Vehicle recognition is a fundamental problem in SAR image interpretation.
However, robustly recognizing vehicle targets is a challenging task in SAR due
to the large intraclass variations and small interclass variations.
Additionally, the lack of large datasets further complicates the task. Inspired
by the analysis of target signature variations and deep learning
explainability, this paper proposes a novel domain alignment framework named
the Hierarchical Disentanglement-Alignment Network (HDANet) to achieve
robustness under various operating conditions. Concisely, HDANet integrates
feature disentanglement and alignment into a unified framework with three
modules: domain data generation, multitask-assisted mask disentanglement, and
domain alignment of target features. The first module generates diverse data
for alignment, and three simple but effective data augmentation methods are
designed to simulate target signature variations. The second module
disentangles the target features from background clutter using the
multitask-assisted mask to prevent clutter from interfering with subsequent
alignment. The third module employs a contrastive loss for domain alignment to
extract robust target features from generated diverse data and disentangled
features. Lastly, the proposed method demonstrates impressive robustness across
nine operating conditions in the MSTAR dataset, and extensive qualitative and
quantitative analyses validate the effectiveness of our framework
Very High Resolution (VHR) Satellite Imagery: Processing and Applications
Recently, growing interest in the use of remote sensing imagery has appeared to provide synoptic maps of water quality parameters in coastal and inner water ecosystems;, monitoring of complex land ecosystems for biodiversity conservation; precision agriculture for the management of soils, crops, and pests; urban planning; disaster monitoring, etc. However, for these maps to achieve their full potential, it is important to engage in periodic monitoring and analysis of multi-temporal changes. In this context, very high resolution (VHR) satellite-based optical, infrared, and radar imaging instruments provide reliable information to implement spatially-based conservation actions. Moreover, they enable observations of parameters of our environment at greater broader spatial and finer temporal scales than those allowed through field observation alone. In this sense, recent very high resolution satellite technologies and image processing algorithms present the opportunity to develop quantitative techniques that have the potential to improve upon traditional techniques in terms of cost, mapping fidelity, and objectivity. Typical applications include multi-temporal classification, recognition and tracking of specific patterns, multisensor data fusion, analysis of land/marine ecosystem processes and environment monitoring, etc. This book aims to collect new developments, methodologies, and applications of very high resolution satellite data for remote sensing. The works selected provide to the research community the most recent advances on all aspects of VHR satellite remote sensing
Active shape models with focus on overlapping problems applied to plant detection and soil pore analysis
[no abstract
SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge
Surgical tool segmentation and action recognition are fundamental building
blocks in many computer-assisted intervention applications, ranging from
surgical skills assessment to decision support systems. Nowadays,
learning-based action recognition and segmentation approaches outperform
classical methods, relying, however, on large, annotated datasets. Furthermore,
action recognition and tool segmentation algorithms are often trained and make
predictions in isolation from each other, without exploiting potential
cross-task relationships. With the EndoVis 2022 SAR-RARP50 challenge, we
release the first multimodal, publicly available, in-vivo, dataset for surgical
action recognition and semantic instrumentation segmentation, containing 50
suturing video segments of Robotic Assisted Radical Prostatectomy (RARP). The
aim of the challenge is twofold. First, to enable researchers to leverage the
scale of the provided dataset and develop robust and highly accurate
single-task action recognition and tool segmentation approaches in the surgical
domain. Second, to further explore the potential of multitask-based learning
approaches and determine their comparative advantage against their single-task
counterparts. A total of 12 teams participated in the challenge, contributing 7
action recognition methods, 9 instrument segmentation techniques, and 4
multitask approaches that integrated both action recognition and instrument
segmentation. The complete SAR-RARP50 dataset is available at:
https://rdr.ucl.ac.uk/projects/SARRARP50_Segmentation_of_surgical_instrumentation_and_Action_Recognition_on_Robot-Assisted_Radical_Prostatectomy_Challenge/19109
- …