F3Net: Fusion, Feedback and Focus for Salient Object Detection
Most existing salient object detection models have achieved great progress
by aggregating multi-level features extracted from convolutional neural
networks. However, because different convolutional layers have different
receptive fields, there are significant differences between the features they
generate. Common feature fusion strategies (addition or concatenation) ignore
these differences and may lead to suboptimal solutions. In this paper, we
propose F3Net to solve the above problem; it mainly consists of a cross
feature module (CFM) and a cascaded feedback decoder (CFD), trained by minimizing
a new pixel position aware loss (PPA). Specifically, CFM aims to selectively
aggregate multi-level features. Different from addition and concatenation, CFM
adaptively selects complementary components from input features before fusion,
which can effectively avoid introducing too much redundant information that may
corrupt the original features. In addition, CFD adopts a multi-stage feedback
mechanism, in which features close to the supervision are fed back to the
outputs of earlier layers to supplement them and eliminate the differences
between features. These refined features go through multiple similar
iterations before the final saliency maps are generated. Furthermore, unlike
binary cross-entropy, the proposed PPA loss does not treat all pixels equally:
it incorporates the local structure information of each pixel to guide the
network to focus more on local details. Hard pixels from boundaries or
error-prone regions are given more attention to emphasize their importance.
F3Net is able to segment salient object regions accurately and provide clear
local details. Comprehensive experiments on five benchmark datasets demonstrate
that F3Net outperforms state-of-the-art approaches on six evaluation metrics.
Comment: Accepted by AAAI 2020, https://github.com/weijun88/F3Ne
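For a concrete picture of the loss described above, the following is a minimal PyTorch sketch of one plausible pixel position aware weighting: each pixel is weighted by how much the ground truth varies in its neighborhood, so boundary and error-prone pixels dominate the loss. The window size and weighting constant are illustrative assumptions, not values taken from the abstract.

```python
import torch
import torch.nn.functional as F

def ppa_like_loss(pred_logits, gt, window=31, alpha=5.0):
    """Sketch of a pixel position aware loss: BCE and IoU terms weighted by
    local ground-truth structure so boundary/hard pixels count more.
    pred_logits and gt are (B, 1, H, W); window and alpha are assumptions."""
    # Weight is large where the ground truth differs from its local mean (edges).
    local_mean = F.avg_pool2d(gt, kernel_size=window, stride=1, padding=window // 2)
    weight = 1.0 + alpha * (local_mean - gt).abs()

    # Weighted binary cross-entropy.
    bce = F.binary_cross_entropy_with_logits(pred_logits, gt, reduction='none')
    wbce = (weight * bce).sum(dim=(2, 3)) / weight.sum(dim=(2, 3))

    # Weighted IoU on predicted probabilities.
    prob = torch.sigmoid(pred_logits)
    inter = (weight * prob * gt).sum(dim=(2, 3))
    union = (weight * (prob + gt)).sum(dim=(2, 3))
    wiou = 1.0 - (inter + 1.0) / (union - inter + 1.0)

    return (wbce + wiou).mean()
```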
The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation
Few-shot image generation is a challenging task since it aims to generate
diverse new images for an unseen category with only a few images. Existing
methods suffer from the trade-off between the quality and diversity of
generated images. To tackle this problem, we propose Hyperbolic Attribute
Editing (HAE), a simple yet effective method. Unlike other methods that work in
Euclidean space, HAE captures the hierarchy among images using data from seen
categories in hyperbolic space. Given a well-trained HAE, images of unseen
categories can be generated by moving the latent code of a given image toward
any meaningful direction in the Poincaré disk with a fixed radius. Most
importantly, the hyperbolic space allows us to control the semantic diversity
of the generated images by setting different radii in the disk. Extensive
experiments and visualizations demonstrate that HAE is capable of not only
generating images with promising quality and diversity using limited data but
also achieving a highly controllable and interpretable editing process.
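As a rough illustration of the editing step described above (moving a latent code along a direction in the Poincaré disk, with a radius-like step controlling diversity), here is a small self-contained PyTorch sketch using standard Poincaré-ball operations; the function names and the exact way the radius enters are assumptions, not the authors' implementation.

```python
import torch

def mobius_add(x, y, eps=1e-6):
    """Möbius addition in the Poincaré ball (curvature -1)."""
    xy = (x * y).sum(-1, keepdim=True)
    x2 = (x * x).sum(-1, keepdim=True)
    y2 = (y * y).sum(-1, keepdim=True)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    den = 1 + 2 * xy + x2 * y2
    return num / den.clamp_min(eps)

def expmap(x, v, eps=1e-6):
    """Exponential map at x: move from x along tangent vector v."""
    v_norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    lam = 2.0 / (1 - (x * x).sum(-1, keepdim=True)).clamp_min(eps)
    return mobius_add(x, torch.tanh(lam * v_norm / 2) * v / v_norm)

# Hypothetical editing step: z is a latent code already mapped into the
# Poincaré disk, `direction` a unit attribute direction, and `step` the
# radius-like knob trading diversity for specificity (an assumption based
# on the abstract, not the paper's exact procedure).
def hyperbolic_edit(z, direction, step):
    return expmap(z, step * direction)
```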
Development of Metamaterial EBG Absorbers for Application of Wireless Inter/Intrachip Communication Systems
First, the chapter presents a novel electromagnetic bandgap (EBG) absorber design with broad bandwidth, a low profile, and polarization independence for a normally incident electromagnetic wave. The absorber is composed of three consecutive octagonal or decagonal loops and highly resistive frequency selective surface (FSS) layers. Second, based on the features of the designed absorber unit, a broadband, metamaterial-absorber-bounded wireless inter/intrachip (WIIC) communication channel is constructed at a center frequency of 60 GHz. Third, to validate the methodology developed for the WIIC analysis, a wired channel on a conventional PCB has been measured, simulated, and analyzed. Fourth, from the extracted S-parameters of the WIIC system and the wired PCB channel, the impulse responses and transfer functions of the investigated channels have been further extracted and used for validation and BER analysis of the WIIC system. Finally, the derived BER results show that the performance of the designed WIIC channel is close to that of an additive white Gaussian noise (AWGN) channel when the WIIC transceivers include forward error control (FEC), channel estimation, and equalization.
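The channel-analysis workflow sketched above (S-parameters to impulse responses and transfer functions, then BER comparison against an AWGN reference) can be illustrated in a few lines of Python; the plain inverse-FFT recovery and the BPSK reference curve below are simplifying assumptions for illustration, not the chapter's actual signal-processing chain.

```python
import numpy as np
from scipy.special import erfc

def impulse_response_from_s21(s21):
    """Rough sketch: treat uniformly sampled S21(f) as the channel transfer
    function and recover a time-domain impulse response via inverse FFT.
    (Real workflows also window the data and enforce conjugate symmetry.)"""
    return np.fft.ifft(s21)

def awgn_bpsk_ber(ebn0_db):
    """Theoretical BPSK bit-error rate over an AWGN channel, the kind of
    reference curve a measured WIIC channel would be compared against.
    BPSK is an assumed modulation for illustration only."""
    ebn0 = 10 ** (np.asarray(ebn0_db) / 10)
    return 0.5 * erfc(np.sqrt(ebn0))
```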
Atmospheric hydroxyl radical (OH) abundances from ground-based ultraviolet solar spectra: an improved retrieval method
The Fourier Transform Ultraviolet Spectrometer (FTUVS) instrument has produced a long-term record of the atmospheric column abundance of the hydroxyl radical (OH) using high-resolution solar absorption spectroscopy. We report new efforts to improve the precision of the OH measurements in order to better model the diurnal, seasonal, and interannual variability of odd hydrogen (HOx) chemistry in the stratosphere, which, in turn, will improve our understanding of ozone chemistry and its long-term changes. Until now, the retrieval method has used a single strong OH absorption line, P1(1), in the near-ultraviolet at 32,341 cm⁻¹. We describe a new method that uses an average based on spectral fits to multiple lines, weighted by line strength and fitting precision. We have also made a number of improvements to the fitting of a model to the spectral feature, which substantially reduce the scatter in the measured OH abundances.
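One plausible reading of the multi-line averaging described above is a weighted mean in which each line's retrieved column is weighted by its line strength and the inverse variance of its spectral fit; the sketch below is illustrative only, and the exact weighting used by the authors is not given in the abstract.

```python
import numpy as np

def weighted_column_average(columns, line_strengths, fit_sigmas):
    """Combine per-line OH column retrievals into one estimate, weighting
    each line by its strength and fitting precision (assumed form: weight
    proportional to strength / sigma^2). Not the authors' exact scheme."""
    columns = np.asarray(columns, dtype=float)
    sigmas = np.asarray(fit_sigmas, dtype=float)
    w = np.asarray(line_strengths, dtype=float) / sigmas**2
    mean = np.sum(w * columns) / np.sum(w)
    # Standard error propagation for a weighted mean of independent estimates.
    sigma_mean = np.sqrt(np.sum((w * sigmas) ** 2)) / np.sum(w)
    return mean, sigma_mean
```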
Dual-view Curricular Optimal Transport for Cross-lingual Cross-modal Retrieval
Current research on cross-modal retrieval is mostly English-oriented, owing to
the availability of a large number of English-oriented human-labeled
vision-language corpora. To move beyond the limits of labeled non-English
data, cross-lingual cross-modal retrieval (CCR) has attracted increasing
attention. Most CCR methods construct pseudo-parallel vision-language corpora
via Machine Translation (MT) to achieve cross-lingual transfer. However, the
translated sentences from MT are generally imperfect in describing the
corresponding visual contents. Improperly assuming that the pseudo-parallel
data are correctly correlated makes the networks overfit to the noisy
correspondence. Therefore, we propose Dual-view Curricular Optimal Transport
(DCOT) to learn with noisy correspondence in CCR. In particular, we quantify
the confidence of the sample pair correlation with optimal transport theory
from both the cross-lingual and cross-modal views, and design dual-view
curriculum learning to dynamically model the transportation costs according to
the learning stage of the two views. Extensive experiments are conducted on two
multilingual image-text datasets and one video-text dataset, and the results
demonstrate the effectiveness and robustness of the proposed method. Moreover,
the proposed method extends well to cross-lingual image-text
baselines and generalizes decently to out-of-domain data.
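To make the optimal-transport confidence idea concrete, the following PyTorch sketch runs a standard Sinkhorn iteration over a batch of image and text embeddings and reads the diagonal of the transport plan as a per-pair confidence; this is a generic illustration of entropic OT, not the authors' dual-view, curriculum-scheduled formulation.

```python
import torch
import torch.nn.functional as F

def sinkhorn_confidence(img_emb, txt_emb, eps=0.05, n_iter=50):
    """Estimate per-pair correspondence confidence within a batch via
    entropic optimal transport. Cost is cosine distance; the diagonal of
    the transport plan (mass pair i keeps for itself) is read as the
    confidence of that pseudo-parallel pair."""
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(txt_emb, dim=-1)
    cost = 1.0 - img @ txt.t()                      # (B, B) cosine distance
    K = torch.exp(-cost / eps)                      # Gibbs kernel
    B = cost.size(0)
    a = torch.full((B,), 1.0 / B, device=cost.device)
    b = torch.full((B,), 1.0 / B, device=cost.device)
    u = torch.ones_like(a)
    for _ in range(n_iter):                         # Sinkhorn iterations
        v = b / (K.t() @ u)
        u = a / (K @ v)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)      # transport plan
    return plan.diagonal() * B                      # ~1 for a clean pair
```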
General Greedy De-bias Learning
Neural networks often make predictions by relying on spurious correlations
in the datasets rather than the intrinsic properties of the task of interest,
and thus degrade sharply on out-of-distribution (OOD) test data. Existing
de-bias learning frameworks try to capture specific dataset biases through
annotations, but they fail to handle complicated OOD scenarios. Others
implicitly identify the dataset bias with specially designed low-capability
biased models or losses, but they degrade when the training and testing data
come from the same distribution.
In this paper, we propose a General Greedy De-bias learning framework (GGD),
which greedily trains the biased models and the base model. The base model is
encouraged to focus on examples that are hard to solve with biased models, thus
remaining robust against spurious correlations in the test stage. GGD largely
improves models' OOD generalization ability on various tasks, but sometimes
overestimates the bias level and degrades on in-distribution tests. We
further re-analyze the ensemble process of GGD and introduce the Curriculum
Regularization inspired by curriculum learning, which achieves a good trade-off
between in-distribution and out-of-distribution performance. Extensive
experiments on image classification, adversarial question answering, and visual
question answering demonstrate the effectiveness of our method. GGD can learn a
more robust base model under both settings: task-specific biased models
with prior knowledge and a self-ensemble biased model without prior knowledge.
Comment: This work has been submitted to IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
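A minimal PyTorch sketch of the greedy idea described above (the base model is pushed toward examples the biased model cannot solve) is given below; the specific down-weighting by the biased model's confidence is an assumption for illustration, and the actual GGD ensembling and curriculum regularization may differ.

```python
import torch
import torch.nn.functional as F

def debiased_base_loss(base_logits, biased_logits, labels):
    """Down-weight examples the biased model already finds easy, so the
    base model focuses on examples that spurious correlations cannot
    explain. Illustrative only; not the exact GGD objective."""
    with torch.no_grad():
        # Biased model's confidence in the correct label for each example.
        p_bias = F.softmax(biased_logits, dim=-1)
        p_bias = p_bias.gather(1, labels.unsqueeze(1)).squeeze(1)
    ce = F.cross_entropy(base_logits, labels, reduction='none')
    # Examples easy for the biased model contribute little to the base loss.
    return ((1.0 - p_bias) * ce).mean()
```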