
    Deep Virtual Networks for Memory Efficient Inference of Multiple Tasks

    Deep networks consume a large amount of memory by their nature. A natural question arises: can we reduce that memory requirement while maintaining performance? In this work we address the problem of memory-efficient learning for multiple tasks. To this end, we propose a novel network architecture that produces multiple networks of different configurations, termed deep virtual networks (DVNs), for different tasks. Each DVN is specialized for a single task and structured hierarchically. The hierarchical structure, whose levels correspond to different numbers of parameters, enables inference under different memory budgets. The building block of a DVN is a disjoint collection of network parameters, which we call a unit. The lowest level of the hierarchy in a DVN is a single unit, and each higher level contains the units of the lower levels plus additional units. Given a budget on the number of parameters, a suitable level of a DVN can be chosen to perform the task. A unit can be shared by different DVNs, allowing multiple DVNs in a single network. In addition, shared units assist the target task with knowledge learned from other tasks. This cooperative configuration of DVNs makes it possible to handle different tasks in a memory-aware manner. Our experiments show that the proposed method outperforms existing approaches for multiple tasks. Notably, ours is more efficient than others as it allows memory-aware inference for all tasks. Comment: CVPR 201
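The nested levels described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the `Unit` class, the helper names, and the parameter counts are hypothetical, and it only shows how a level might be picked under a parameter budget when each level contains all units of the levels below it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical stand-in for a disjoint collection of parameters."""
    name: str
    num_params: int


def build_levels(units):
    """Level k holds the first k + 1 units, so each level nests the one below."""
    return [units[: k + 1] for k in range(len(units))]


def level_params(level):
    """Total parameter count of one level."""
    return sum(u.num_params for u in level)


def choose_level(levels, budget):
    """Return the largest level that fits the budget, or None if none fits."""
    best = None
    for level in levels:
        if level_params(level) <= budget:
            best = level
    return best


# Illustrative parameter counts only (assumed, not from the paper).
units = [Unit("u0", 100), Unit("u1", 250), Unit("u2", 400)]
levels = build_levels(units)
chosen = choose_level(levels, budget=500)
```

With a budget of 500 parameters the sketch selects the two-unit level, since adding the third unit would exceed the budget.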

    Deep Elastic Networks with Model Selection for Multi-Task Learning

    In this work, we consider the problem of instance-wise dynamic network model selection for multi-task learning. To this end, we propose an efficient approach that exploits a compact but accurate model in a backbone architecture for each instance of all tasks. The proposed method consists of an estimator and a selector. The estimator is based on a backbone architecture and structured hierarchically, so it can produce multiple network models of different configurations. The selector chooses a model dynamically from a pool of candidate models given an input instance. It is a relatively small network consisting of a few layers, which estimates a probability distribution over the candidate models when an input instance of a task is given. The estimator and selector are trained jointly in a unified learning framework with a sampling-based learning strategy, without additional computation steps. We demonstrate the proposed approach on several image classification tasks and compare it with existing approaches that perform model selection or learn multiple tasks. Experimental results show that our approach offers not only outstanding performance compared to competitors but also the versatility to perform instance-wise model selection for multiple tasks. Comment: ICCV 201

    Learning to Discriminate Information for Online Action Detection

    Online action detection aims to identify the action currently taking place in a streaming video. For this task, previous methods use recurrent networks to model the temporal sequence of current action frames. However, these methods overlook the fact that an input image sequence includes background and irrelevant actions as well as the action of interest. In this paper, we propose a novel recurrent unit for online action detection that explicitly discriminates the information relevant to an ongoing action from other information. Our unit, named Information Discrimination Unit (IDU), decides whether to accumulate input information based on its relevance to the current action. This enables our recurrent network with IDU to learn a more discriminative representation for identifying ongoing actions. In experiments on two benchmark datasets, TVSeries and THUMOS-14, the proposed method outperforms state-of-the-art methods by a significant margin. Moreover, we demonstrate the effectiveness of our recurrent unit by conducting comprehensive ablation studies. Comment: To appear in CVPR 202

    Data Diversification Analysis on Data Preprocessing

    A statistical analysis examining the diversity distribution produced by two augmentation approaches. The first, the standard approach, is a baseline in which a random augmentation is applied to each sample independently in each epoch. The second, the random batch approach, is a new approach in which a random augmentation is applied to each tiny-batch independently in each epoch, and the assignment of samples to tiny-batches is random and independent across epochs.
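The difference between the two approaches can be made concrete with a short sketch. This is an illustration under assumptions, not the study's code: the augmentation names, the function names, and the sample and tiny-batch sizes are hypothetical, and only the assignment of augmentations to samples is modeled.

```python
import random

# Hypothetical augmentation names, used only as labels.
AUGMENTATIONS = ["flip", "rotate", "crop", "jitter"]


def per_sample_assignment(num_samples, rng):
    """Standard approach: draw an augmentation independently for each sample."""
    return [rng.choice(AUGMENTATIONS) for _ in range(num_samples)]


def per_batch_assignment(num_samples, batch_size, rng):
    """Random batch approach: group samples into random tiny-batches and
    draw one augmentation per tiny-batch."""
    order = list(range(num_samples))
    rng.shuffle(order)  # batch membership is re-drawn on every call (epoch)
    assignment = [None] * num_samples
    for start in range(0, num_samples, batch_size):
        aug = rng.choice(AUGMENTATIONS)
        for idx in order[start:start + batch_size]:
            assignment[idx] = aug
    return assignment


rng = random.Random(0)  # fixed seed so the sketch is repeatable
for epoch in range(2):
    standard = per_sample_assignment(8, rng)
    batched = per_batch_assignment(8, 4, rng)
```

In the batched case, all samples that share a tiny-batch receive the same augmentation within an epoch, so the number of distinct augmentations seen per epoch is bounded by the number of tiny-batches.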

    A broadband X-ray study of the Rabbit pulsar wind nebula powered by PSR J1418-6058

    We report on broadband X-ray properties of the Rabbit pulsar wind nebula (PWN) associated with the pulsar PSR J1418-6058 using archival Chandra and XMM-Newton data, and a new NuSTAR observation. NuSTAR data above 10 keV allowed us to detect the 110-ms spin period of the pulsar, characterize its hard X-ray pulse profile, and resolve hard X-ray emission from the PWN after removing contamination from the pulsar and other overlapping point sources. The extended PWN was detected up to $\sim 20$ keV and is well described by a power-law model with a photon index $\Gamma \approx 2$. The PWN shape does not vary significantly with energy, and its X-ray spectrum shows no clear evidence of softening away from the pulsar. We modeled the spatial profile of X-ray spectra and broadband spectral energy distribution in the radio to TeV band to infer the physical properties of the PWN. We found that a model with low magnetic field strength ($B \sim 10\ \mu$G) and efficient diffusion ($D \sim 10^{27}$ cm$^2$ s$^{-1}$) fits the PWN data well. The extended hard X-ray and TeV emission, associated respectively with synchrotron radiation and inverse Compton scattering by relativistic electrons, suggests that particles are accelerated to very high energies ($\gtrsim 500$ TeV), indicating that the Rabbit PWN is a Galactic PeVatron candidate. Comment: 21 pages, 10 figures. ApJ accepte