149 research outputs found

    Learning to Segment Every Thing

    Most methods for object instance segmentation require all training examples to be labeled with segmentation masks. This requirement makes it expensive to annotate new categories and has restricted instance segmentation models to ~100 well-annotated classes. This paper proposes a new partially supervised training paradigm, together with a novel weight transfer function, that enables training instance segmentation models on a large set of categories, all of which have box annotations but only a small fraction of which have mask annotations. These contributions allow us to train Mask R-CNN to detect and segment 3000 visual concepts using box annotations from the Visual Genome dataset and mask annotations from the 80 classes in the COCO dataset. We evaluate our approach in a controlled study on the COCO dataset. This work is a first step towards instance segmentation models that have broad comprehension of the visual world.

    Fully Point-wise Convolutional Neural Network for Modeling Statistical Regularities in Natural Images

    Modeling statistical regularity plays an essential role in ill-posed image processing problems. Recently, deep learning based methods have been presented to implicitly learn statistical representations of pixel distributions in natural images and leverage them as a constraint to facilitate subsequent tasks, such as color constancy and image dehazing. However, the existing CNN architecture is sensitive to the variability and diversity of pixel intensity within and between local regions, which may result in inaccurate statistical representation. To address this problem, this paper presents a novel fully point-wise CNN architecture for modeling statistical regularities in natural images. Specifically, we propose to randomly shuffle the pixels in the original images and use the shuffled image as input so that the CNN focuses on the statistical properties. Moreover, since the pixels in the shuffled image are independent and identically distributed, we can replace all the large convolution kernels in the CNN with point-wise (1×1) convolution kernels while maintaining the representation ability. Experimental results on two applications, color constancy and image dehazing, demonstrate the superiority of our proposed network over the existing architectures: it uses 1/10 to 1/100 of the network parameters and computational cost while achieving comparable performance. Comment: 9 pages, 7 figures. To appear in ACM MM 201
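
    The link between pixel shuffling and point-wise kernels in the abstract above can be illustrated with a small sketch. It is not the authors' network: the image size, the weight matrix, and the helper names `pointwise_conv` and `shuffle_pixels` are assumptions made for illustration. The sketch only checks one property, namely that a 1×1 convolution treats every pixel independently, so applying it before or after a pixel shuffle gives the same result.

```python
import random


def pointwise_conv(pixels, weights):
    """Apply a 1x1 convolution: every pixel's channel vector is mapped
    through the same weight matrix (out_channels x in_channels)."""
    return [
        [sum(w * c for w, c in zip(row, pixel)) for row in weights]
        for pixel in pixels
    ]


def shuffle_pixels(pixels, perm):
    """Reorder a flattened image according to a fixed permutation."""
    return [pixels[i] for i in perm]


rng = random.Random(0)

# Hypothetical 4x4 RGB image, flattened to a list of 16 pixels (integer values).
height, width, in_channels = 4, 4, 3
pixels = [
    [rng.randint(0, 255) for _ in range(in_channels)]
    for _ in range(height * width)
]

# Hypothetical 1x1 kernel mapping 3 input channels to 2 output channels.
weights = [[1, -2, 0], [0, 3, 1]]

# Random pixel permutation, standing in for the shuffling step.
perm = list(range(len(pixels)))
rng.shuffle(perm)

shuffled_then_conv = pointwise_conv(shuffle_pixels(pixels, perm), weights)
conv_then_shuffled = shuffle_pixels(pointwise_conv(pixels, weights), perm)

# A point-wise kernel has no spatial extent, so the two orders agree exactly.
commutes = shuffled_then_conv == conv_then_shuffled
print(commutes)
```

    A kernel larger than 1×1 would not pass this check in general, because its output at each position depends on neighbouring pixels, and shuffling changes which pixels are neighbours.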

    Refinement of retained austenite in super-bainitic steel by a deep cryogenic treatment

    The effect of a deep cryogenic treatment on the microstructure of a super-bainitic steel was investigated. It was shown that quenching the super-bainitic steel in –196°C liquid nitrogen resulted in the transformation of retained austenite to two phases: ~20 nm thick martensite films and some nano carbides with a ~25 nm diameter. Some refinement of the retained austenite occurred, due to the formation of fine martensite laths within the retained austenite. The evolution of these new phases resulted in an increase in the average hardness of the super-bainitic steel from 641 to ~670 HV1.

    Temporal Cross-Media Retrieval with Soft-Smoothing

    Multimedia information has strong temporal correlations that shape the way modalities co-occur over time. In this paper we study the dynamic nature of multimedia and social-media information, where the temporal dimension emerges as a strong source of evidence for learning the temporal correlations across visual and textual modalities. So far, cross-media retrieval models have explored the correlations between different modalities (e.g. text and image) to learn a common subspace in which semantically similar instances lie in the same neighbourhood. Building on such knowledge, we propose a novel temporal cross-media neural architecture that departs from standard cross-media methods by explicitly accounting for the temporal dimension through temporal subspace learning. The model is softly constrained with temporal and inter-modality constraints that guide the new subspace learning task by favouring temporal correlations between semantically similar and temporally close instances. Experiments on three distinct datasets show that accounting for time is important for cross-media retrieval. Namely, the proposed method outperforms a set of baselines on the task of temporal cross-media retrieval, demonstrating its effectiveness for performing temporal subspace learning. Comment: To appear in ACM MM 201

    Application of Convolutional Recurrent Neural Network for Individual Recognition Based on Resting State fMRI Data

    In most task and resting state fMRI studies, a group consensus is often sought, and individual variability is considered a nuisance. Nonetheless, biological variability is an important factor that cannot be ignored and is gaining more attention in the field. One recent development is individual identification based on the functional connectome. While the original work was based on the static connectome, subsequent efforts using recurrent neural networks (RNN) demonstrated that the inclusion of temporal features greatly improved identification accuracy. Given that the convolutional RNN (ConvRNN) seamlessly integrates spatial and temporal features, the present work applied ConvRNN for individual identification with resting state fMRI data. Our results demonstrate that ConvRNN achieves a higher identification accuracy than a conventional RNN, likely due to better extraction of local features between neighboring ROIs. Furthermore, given that each convolutional output assembles in-place features, they provide a natural way for us to visualize the informative spatial pattern and temporal information, opening up a promising new avenue for analyzing fMRI data.

    Understanding the Eastward Shift and Intensification of the ENSO Teleconnection Over South Pacific and Antarctica Under Greenhouse Warming

    The Pacific–South America (PSA) teleconnection pattern triggered by the El Niño/Southern Oscillation (ENSO) is suggested to be moving eastward and intensifying under global warming. However, the underlying mechanism is not completely understood. Previous studies have proposed that the movement of the PSA teleconnection pattern is attributable to the eastward shift of the tropical Pacific ENSO-driven rainfall anomalies in response to the projected El Niño-like sea surface temperature (SST) warming pattern. In this study, we found that models also simulate an eastward movement of the PSA teleconnection pattern under uniform warming, that is, without the impact of the uneven SST warming pattern. Further investigation reveals that future changes in the climatology of the atmospheric circulation, particularly the movement of the exit region of the subtropical jet stream, can also contribute to the eastward shift of the PSA teleconnection pattern by modifying the conversion of mean kinetic energy to eddy kinetic energy.

    Strong enhancement of photoresponsivity with shrinking the electrodes spacing in few layer GaSe photodetectors

    A critical challenge for the integration of optoelectronics is that photodetectors have relatively poor sensitivities at the nanometer scale. It is generally believed that a large electrode spacing in photodetectors is required to absorb sufficient light to maintain high photoresponsivity and reduce the dark current. However, this limits the optoelectronic integration density. Through spatially resolved photocurrent investigation, we find that the photocurrent in metal-semiconductor-metal (MSM) photodetectors based on layered GaSe is mainly generated from the photoexcited carriers close to the metal-GaSe interface, and that the photocurrent active region is always close to the Schottky barrier with higher electrical potential. The photoresponsivity increases monotonically as the spacing distance shrinks, until direct tunneling occurs, and was enhanced up to 5,000 A/W for the bottom contacted device at a bias voltage of 8 V and a wavelength of 410 nm. This is a more than 1,700-fold improvement over previously reported results. Besides a systematic experimental investigation of the dependence of the photoresponsivity on the spacing distance for both the bottom and top contacted MSM photodetectors, a theoretical model has also been developed that explains the photoresponsivity for these two types of device configurations. Our findings show that the spacing distance can be reduced and the performance of 2D semiconductor based MSM photodetectors improved simultaneously, which could pave the way for future high density integration of high-performance 2D semiconductor optoelectronics. Comment: 25 pages, 4 figures

    Excessively tilted fiber grating based Fe3O4 saturable absorber for passively mode-locked fiber laser

    A novel approach to saturable absorber (SA) formation is presented by taking advantage of the mode coupling property of excessively tilted fiber grating (Ex-TFG). Stable mode-locked operation can be conveniently achieved based on the interaction between Ex-TFG coupled light and deposited ferroferric-oxide (Fe3O4) nanoparticles. The central wavelength, bandwidth and single pulse duration of the output are 1595 nm, 4.05 nm, and 912 fs, respectively. The fiber laser exhibits good long-term stability with a signal-to-noise ratio (SNR) of 67 dB. To the best of our knowledge, this is the first demonstration of an Ex-TFG based Fe3O4 SA for a mode-locked fiber laser.

    Multi-Label Image Classification via Knowledge Distillation from Weakly-Supervised Detection

    Multi-label image classification is a fundamental but challenging task towards general visual understanding. Existing methods have found that region-level cues (e.g., features from RoIs) can facilitate multi-label classification. Nevertheless, such methods usually require laborious object-level annotations (i.e., object labels and bounding boxes) for effective learning of the object-level visual features. In this paper, we propose a novel and efficient deep framework to boost multi-label classification by distilling knowledge from a weakly-supervised detection task without bounding box annotations. Specifically, given the image-level annotations, (1) we first develop a weakly-supervised detection (WSD) model, and then (2) construct an end-to-end multi-label image classification framework augmented by a knowledge distillation module that guides the classification model by the WSD model according to the class-level predictions for the whole image and the object-level visual features for object RoIs. The WSD model is the teacher model and the classification model is the student model. After this cross-task knowledge distillation, the performance of the classification model is significantly improved and the efficiency is maintained, since the WSD model can be safely discarded in the test phase. Extensive experiments on two large-scale datasets (MS-COCO and NUS-WIDE) show that our framework is superior to the state-of-the-art methods in both performance and efficiency. Comment: accepted by ACM Multimedia 2018, 9 pages, 4 figures, 5 tables

    Case report: A novel case of COVID-19 triggered tumefactive demyelinating lesions in one multiple sclerosis patient

    COVID-19 is mainly manifested by respiratory symptoms caused by SARS-CoV-2 infection. Recently, reports of central nervous system diseases caused or aggravated by SARS-CoV-2 infection have also been increasing. Thus, the COVID-19 pandemic poses an unprecedented challenge to the diagnosis and management of neurological disorders, especially those diseases that have overlapping clinical and radiologic features. In this study, a 31-year-old female patient had initially been diagnosed with relapsing–remitting multiple sclerosis (RRMS) and subsequently developed tumefactive demyelinating lesions (TDLs) following an infection with SARS-CoV-2. After immunotherapy (glucocorticoid pulses), a significant improvement was observed in both her clinical and radiological characteristics. The patient was started on disease-modifying therapy (DMT) with teriflunomide after cessation of oral glucocorticoids. Following two months of DMT treatment, the imaging follow-up revealed that the patient’s condition continued to deteriorate. This case was characterized by the development of TDLs in a multiple sclerosis (MS) patient infected with SARS-CoV-2 and by the ineffectiveness of DMT treatment, which added complexity to its diagnosis and treatment. The case also suggests that SARS-CoV-2 has a potential contributory role in inducing or exacerbating demyelinating diseases of the central nervous system, which warrants further investigation.