835 research outputs found

    F3Net: Fusion, Feedback and Focus for Salient Object Detection

    Full text link
    Most existing salient object detection models have achieved great progress by aggregating multi-level features extracted from convolutional neural networks. However, because different convolutional layers have different receptive fields, there are large differences between the features they generate. Common feature fusion strategies (addition or concatenation) ignore these differences and may lead to suboptimal solutions. In this paper, we propose F3Net to solve the above problem; it mainly consists of a cross feature module (CFM) and a cascaded feedback decoder (CFD), trained by minimizing a new pixel position aware (PPA) loss. Specifically, the CFM aims to selectively aggregate multi-level features. Unlike addition and concatenation, the CFM adaptively selects complementary components from the input features before fusion, which effectively avoids introducing too much redundant information that may destroy the original features. Besides, the CFD adopts a multi-stage feedback mechanism in which features close to the supervision are fed back to the outputs of previous layers to supplement them and eliminate the differences between features. These refined features go through multiple similar iterations before the final saliency maps are generated. Furthermore, unlike binary cross entropy, the proposed PPA loss does not treat all pixels equally: it synthesizes the local structure information around a pixel to guide the network to focus more on local details, and hard pixels from boundaries or error-prone regions are given more attention to emphasize their importance. F3Net is able to segment salient object regions accurately and provide clear local details. Comprehensive experiments on five benchmark datasets demonstrate that F3Net outperforms state-of-the-art approaches on six evaluation metrics. (Comment: Accepted by AAAI 2020, https://github.com/weijun88/F3Net)
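    A minimal PyTorch sketch of the kind of structure-aware pixel weighting the PPA loss describes: pixels whose ground-truth neighborhood disagrees with their own label (boundaries, error-prone regions) are up-weighted in both a BCE and an IoU term. The 31x31 window and the factor 5 are illustrative assumptions, not necessarily the authors' exact settings.

```python
import torch
import torch.nn.functional as F

def structure_weighted_loss(pred, mask):
    """pred: raw logits (B, 1, H, W); mask: binary ground truth (B, 1, H, W)."""
    # Weight is large where a pixel differs from its local neighborhood mean,
    # i.e. near boundaries and fine structures.
    weight = 1 + 5 * torch.abs(
        F.avg_pool2d(mask, kernel_size=31, stride=1, padding=15) - mask)

    # Weighted binary cross entropy.
    bce = F.binary_cross_entropy_with_logits(pred, mask, reduction='none')
    wbce = (weight * bce).sum(dim=(2, 3)) / weight.sum(dim=(2, 3))

    # Weighted IoU term on the same per-pixel weights.
    prob = torch.sigmoid(pred)
    inter = (prob * mask * weight).sum(dim=(2, 3))
    union = ((prob + mask) * weight).sum(dim=(2, 3))
    wiou = 1 - (inter + 1) / (union - inter + 1)

    return (wbce + wiou).mean()
```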

    The Euclidean Space is Evil: Hyperbolic Attribute Editing for Few-shot Image Generation

    Full text link
    Few-shot image generation is a challenging task, since it aims to generate diverse new images for an unseen category from only a few examples. Existing methods suffer from a trade-off between the quality and the diversity of the generated images. To tackle this problem, we propose Hyperbolic Attribute Editing (HAE), a simple yet effective method. Unlike other methods that work in Euclidean space, HAE captures the hierarchy among images using data from seen categories in hyperbolic space. Given a well-trained HAE, images of unseen categories can be generated by moving the latent code of a given image along any meaningful direction in the Poincaré disk at a fixed radius. Most importantly, the hyperbolic space allows us to control the semantic diversity of the generated images by setting different radii in the disk. Extensive experiments and visualizations demonstrate that HAE is capable not only of generating images with promising quality and diversity using limited data but also of achieving a highly controllable and interpretable editing process.
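    A toy sketch of the geometric idea: in the Poincaré ball with curvature -1, the hyperbolic distance from the origin is d(0, x) = 2·artanh(||x||), so an edited latent can be rescaled to sit at a chosen hyperbolic radius. The editing step and the function names below are illustrative assumptions; HAE's actual edits use learned directions and proper hyperbolic operations.

```python
import numpy as np

def hyperbolic_radius(x):
    """Hyperbolic distance of x from the origin of the Poincaré ball:
    d(0, x) = 2 * artanh(||x||)."""
    return 2.0 * np.arctanh(np.linalg.norm(x))

def edit_at_radius(z, direction, r):
    """Toy attribute edit: step the latent code z toward `direction`, then
    rescale the result so it lies at hyperbolic radius r. Smaller radii sit
    closer to the root of the hierarchy (more diverse, more generic edits);
    larger radii give more category-specific ones."""
    x = z + direction                  # illustrative step, not HAE's learned edit
    u = x / np.linalg.norm(x)          # direction within the disk
    return np.tanh(r / 2.0) * u        # ||x|| = tanh(r/2)  <=>  d(0, x) = r
```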

    Development of Metamaterial EBG Absorbers for Application of Wireless Inter/Intrachip Communication Systems

    Get PDF
    First, the chapter presents a novel design of an electromagnetic bandgap (EBG) absorber with broad bandwidth, a low profile, and polarization independence for a normally incident electromagnetic wave. The absorber is composed of three consecutive octagonal or decagonal loops and highly resistive frequency selective surface (FSS) layers. Second, based on the characteristics of the designed absorber unit, a broadband, metamaterial-absorber-bounded wireless inter/intrachip (WIIC) communication channel is constructed at a center frequency of 60 GHz. Third, in order to validate the methodology used in the WIIC analysis, a wired channel on a conventional PCB has been measured, simulated, and analyzed. Fourth, from the extracted S-parameters of the WIIC system and the wired PCB channel, the system impulse responses and transfer functions of the investigated channels have been derived, which are used for validation and BER analysis of the WIIC system. Finally, the derived BER results show that the performance of the designed WIIC channel is close to that of an additive white Gaussian noise (AWGN) channel when the WIIC transceivers are equipped with forward error control (FEC), channel estimation, and equalization.
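    A minimal sketch of the post-processing chain named in the abstract, under simplifying assumptions: the transmission transfer function S21(f) is taken on a uniform grid starting at DC (real measurements need windowing and low-frequency extrapolation), and the AWGN reference curve is the textbook BPSK result used as a benchmark. Function names are illustrative.

```python
import numpy as np
from scipy.special import erfc

def impulse_response_from_s21(freq_step_hz, s21):
    """Turn a transfer function S21(f) into a time-domain impulse response.
    `s21` is assumed sampled uniformly from DC to the band edge."""
    h = np.fft.irfft(s21)                      # real-valued impulse response
    fs = 2.0 * (len(s21) - 1) * freq_step_hz   # equivalent sampling rate
    t = np.arange(len(h)) / fs
    return t, h

def bpsk_awgn_ber(ebn0_db):
    """Theoretical BER of BPSK over an AWGN channel, the benchmark the
    designed WIIC channel is compared against."""
    ebn0 = 10.0 ** (np.asarray(ebn0_db) / 10.0)
    return 0.5 * erfc(np.sqrt(ebn0))
```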

    Atmospheric hydroxyl radical (OH) abundances from ground-based ultraviolet solar spectra: an improved retrieval method

    Get PDF
    The Fourier Transform Ultraviolet Spectrometer (FTUVS) instrument has recorded a long-term data record of the atmospheric column abundance of the hydroxyl radical (OH) using high-resolution solar absorption spectroscopy. We report new efforts to improve the precision of the OH measurements in order to better model the diurnal, seasonal, and interannual variability of odd-hydrogen (HOx) chemistry in the stratosphere, which in turn will improve our understanding of ozone chemistry and its long-term changes. Until now, the retrieval method has used a single strong OH absorption line, P1(1), in the near-ultraviolet at 32,341 cm⁻¹. We describe a new method that uses an average based on spectral fits to multiple lines, weighted by line strength and fitting precision. We have also made a number of improvements in the ability to fit a model to the spectral feature, which substantially reduces the scatter in the measured OH abundances.
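    A minimal sketch of the multi-line averaging idea, assuming weights proportional to line strength and inverse fit variance; the exact weighting used in the FTUVS retrieval may differ.

```python
import numpy as np

def weighted_column_abundance(fits, strengths, sigmas):
    """Combine per-line OH column retrievals into a single estimate.
    fits: fitted column per line; strengths: line strengths;
    sigmas: per-line fit uncertainties."""
    fits = np.asarray(fits, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    w = np.asarray(strengths, dtype=float) / sigmas**2   # assumed weighting
    mean = np.sum(w * fits) / np.sum(w)
    # Formal uncertainty of a weighted mean with independent per-line errors.
    err = np.sqrt(np.sum(w**2 * sigmas**2)) / np.sum(w)
    return mean, err
```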

    Dual-view Curricular Optimal Transport for Cross-lingual Cross-modal Retrieval

    Full text link
    Current research on cross-modal retrieval is mostly English-oriented, owing to the availability of large English-oriented human-labeled vision-language corpora. To overcome the scarcity of non-English labeled data, cross-lingual cross-modal retrieval (CCR) has attracted increasing attention. Most CCR methods construct pseudo-parallel vision-language corpora via machine translation (MT) to achieve cross-lingual transfer. However, the sentences produced by MT are generally imperfect descriptions of the corresponding visual content, and improperly assuming that the pseudo-parallel data are correctly correlated makes the networks overfit to the noisy correspondence. Therefore, we propose Dual-view Curricular Optimal Transport (DCOT) to learn with noisy correspondence in CCR. In particular, we quantify the confidence of each sample-pair correlation with optimal transport theory from both the cross-lingual and the cross-modal views, and design dual-view curriculum learning to dynamically model the transportation costs according to the learning stage of the two views. Extensive experiments are conducted on two multilingual image-text datasets and one video-text dataset, and the results demonstrate the effectiveness and robustness of the proposed method. Moreover, the proposed method also extends well to cross-lingual image-text baselines and generalizes decently to out-of-domain data.
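    A minimal sketch of how entropic optimal transport (Sinkhorn iterations) can turn a batch-level cost matrix into per-pair correspondence confidences, the basic mechanism the abstract alludes to; the curriculum scheduling of the costs and the dual-view combination are not shown, and all names are illustrative.

```python
import numpy as np

def sinkhorn_confidence(cost, eps=0.05, n_iters=100):
    """Entropic OT on a cost matrix between a batch of images and (translated)
    captions. The diagonal of the transport plan is high for pairs judged well
    matched and low for noisy correspondences."""
    n, m = cost.shape
    K = np.exp(-cost / eps)                  # Gibbs kernel
    a, b = np.ones(n) / n, np.ones(m) / m    # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):                 # Sinkhorn scaling iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    plan = np.diag(u) @ K @ np.diag(v)       # transport plan
    confidence = np.diag(plan) * n           # per-pair correspondence weight (~1 if clean)
    return plan, confidence
```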

    General Greedy De-bias Learning

    Full text link
    Neural networks often make predictions by relying on spurious correlations in the training data rather than on the intrinsic properties of the task of interest, and therefore suffer sharp degradation on out-of-distribution (OOD) test data. Existing de-bias learning frameworks try to capture specific dataset biases via annotations, but they fail to handle complicated OOD scenarios. Others implicitly identify dataset bias through specially designed low-capability biased models or losses, but they degrade when the training and test data come from the same distribution. In this paper, we propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model. The base model is encouraged to focus on examples that are hard to solve with the biased models, and thus remains robust against spurious correlations at test time. GGD largely improves models' OOD generalization on various tasks, but it sometimes over-estimates the bias level and degrades on in-distribution tests. We therefore re-analyze the ensemble process of GGD and introduce Curriculum Regularization, inspired by curriculum learning, which achieves a good trade-off between in-distribution and out-of-distribution performance. Extensive experiments on image classification, adversarial question answering, and visual question answering demonstrate the effectiveness of our method. GGD can learn a more robust base model both with task-specific biased models built from prior knowledge and with a self-ensemble biased model that requires no prior knowledge. (Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.)
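    A simplified PyTorch sketch of a greedy de-bias training step in the spirit described above: a biased model is fitted on a bias-prone input (e.g. the question alone in VQA), and the base model's loss is re-weighted so that examples the biased model already answers confidently contribute less. This is an illustrative re-weighting scheme under stated assumptions, not the authors' exact ensemble formulation.

```python
import torch
import torch.nn.functional as F

def greedy_debias_step(base_model, biased_model, x, bias_feature, y,
                       opt_base, opt_bias):
    """One training step: fit the biased model, then focus the base model on
    examples the biased model cannot solve."""
    # 1) Train the biased model greedily on its own cross-entropy.
    bias_logits = biased_model(bias_feature)
    bias_loss = F.cross_entropy(bias_logits, y)
    opt_bias.zero_grad()
    bias_loss.backward()
    opt_bias.step()

    # 2) Down-weight examples the biased model solves confidently, so the base
    #    model concentrates on the remaining "hard" (less spurious) examples.
    with torch.no_grad():
        p_bias = F.softmax(bias_logits, dim=1).gather(1, y[:, None]).squeeze(1)
    weights = 1.0 - p_bias

    base_logits = base_model(x)
    base_loss = (weights * F.cross_entropy(base_logits, y, reduction='none')).mean()
    opt_base.zero_grad()
    base_loss.backward()
    opt_base.step()
    return bias_loss.item(), base_loss.item()
```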