
    ODN: Opening the Deep Network for Open-set Action Recognition

    In recent years, the performance of action recognition has improved significantly with the help of deep neural networks. Most existing action recognition work holds the closed-set assumption: all action categories are known beforehand, and deep networks can be well trained for these categories. However, action recognition in the real world is essentially an open-set problem: it is impossible to know all action categories beforehand, and consequently infeasible to prepare sufficient training samples for emerging categories. In this case, applying closed-set recognition methods inevitably leads to unseen-category errors. To address this challenge, we propose the Open Deep Network (ODN) for the open-set action recognition task. Technically, ODN detects new categories by applying a multi-class triplet thresholding method, then dynamically reconstructs the classification layer and "opens" the deep network by continually adding predictors for new categories. To transfer learned knowledge to the new category, two novel methods, Emphasis Initialization and Allometry Training, are adopted to initialize and incrementally train the new predictor so that only a few samples are needed to fine-tune the model. Extensive experiments show that ODN can effectively detect and recognize new categories with little human intervention, and is thus applicable to open-set action recognition tasks in the real world. Moreover, ODN can even achieve performance comparable to some closed-set methods. Comment: 6 pages, 3 figures, ICME 201
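The idea of "opening" a classification layer can be illustrated with a minimal sketch. It is not the ODN implementation: the single score threshold stands in for the paper's multi-class triplet thresholding, and the mean-of-rows initialization is an assumed placeholder rather than Emphasis Initialization. The function names `is_unknown`, `open_layer`, and `predict` are hypothetical.

```python
def is_unknown(scores, threshold):
    """Flag a sample as a possible new category when no known class
    scores at or above the threshold (simplified stand-in for the
    paper's multi-class triplet thresholding)."""
    return max(scores) < threshold


def open_layer(weights):
    """Return a copy of the linear layer's weights with one extra row
    for a new category. The new row is the column-wise mean of the
    existing rows (an assumed initialization, not the paper's)."""
    n = len(weights)
    new_row = [sum(col) / n for col in zip(*weights)]
    return [row[:] for row in weights] + [new_row]


def predict(weights, x):
    """Compute one linear score per category (no bias term)."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


# Two known categories over a 2-dimensional feature.
layer = [[1.0, 0.0], [0.0, 1.0]]
sample = [0.2, 0.3]
if is_unknown(predict(layer, sample), threshold=0.5):
    layer = open_layer(layer)  # the layer now has three predictors
```

In a real system the new row would then be fine-tuned on the few labeled samples of the new category; that training step is omitted here.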

    Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

    Recently, frequency domain all-neural beamforming methods have achieved remarkable progress in multichannel speech separation. In parallel, the integration of time domain network structures with beamforming has also gained significant attention. This study proposes a novel all-neural beamforming method in the time domain and attempts to unify the all-neural beamforming pipelines for time domain and frequency domain multichannel speech separation. The proposed model consists of two modules: separation and beamforming. Both modules perform temporal-spectral-spatial modeling and are trained end-to-end using a joint loss function. The novelty of this study is twofold. First, a time domain directional feature conditioned on the direction of the target speaker is proposed, which can be jointly optimized within the time domain architecture to enhance target signal estimation. Second, an all-neural beamforming network in the time domain is designed to refine the pre-separated results. This module features parametric time-variant beamforming coefficient estimation, without explicitly following the derivation of optimal filters, which may impose an upper bound on performance. The proposed method is evaluated on simulated reverberant overlapped speech data derived from the AISHELL-1 corpus. Experimental results demonstrate significant performance improvements over frequency domain state-of-the-art methods, ideal magnitude masks, and existing time domain neural beamforming methods.
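To make "time-variant beamforming coefficients" concrete, the sketch below applies a different set of channel weights at every time step and sums across channels. This is the simplest weight-and-sum case and only an illustration: in the paper the coefficients are estimated by a neural network, whereas here they are given by hand, and the function name `apply_time_variant_weights` is hypothetical.

```python
def apply_time_variant_weights(channels, weights):
    """Weight-and-sum beamforming with per-sample coefficients.

    channels: list of C channel signals, each a list of T samples.
    weights:  list of T coefficient vectors, each of length C.
    Returns the single-channel output of length T.
    """
    num_samples = len(channels[0])
    output = []
    for t in range(num_samples):
        # Combine all microphones at time t using that step's weights.
        output.append(sum(w * ch[t] for w, ch in zip(weights[t], channels)))
    return output


# Two microphones, three samples; the weights change at every step.
mixture = [[1.0, 2.0, 3.0], [1.0, 0.0, -1.0]]
coeffs = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
enhanced = apply_time_variant_weights(mixture, coeffs)
```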

    Integer-Valued Moving Average Models with Structural Changes


    Direct reduction and extraction of iron from nickel smelting slag coupling of preparation of cementing materials using gangue composition

    Based on the Fe and SiO2 content of nickel slag, a process is proposed for preparing direct reduced iron (DRI) by coal-based direct reduction of nickel slag, with the gangue components used as raw material to prepare C2S (belite) and C3S (alite), achieving comprehensive utilization of the slag. The coupling of the iron reduction reaction with the reactions that form cementitious materials was demonstrated through thermodynamic calculation and experiment. Reduction roasting products of nickel slag with iron, C3S, and C2S as the main phases were obtained through appropriate batching and temperature control of the reduction roasting reaction.

    Adaptive iterative working state prediction based on the double unscented transformation and dynamic functioning for unmanned aerial vehicle lithium-ion batteries.

    In lithium-ion batteries, the accuracy of state-of-charge estimation is a core factor that determines the power control accuracy and management reliability of energy storage systems. When unscented Kalman filtering is used to estimate the state of charge of lithium-ion batteries, a pulse current change rate that is too high degrades the algorithm's tracking and leads to high estimation errors. In this study, the unscented Kalman filtering algorithm is improved to solve these problems, and the Kalman gain is boosted with dynamic function modules to improve system stability. The closed-circuit voltage of the system is predicted with two non-linear transformations to improve accuracy. Meanwhile, an adaptive algorithm is developed to predict and correct the system noise and observation noise, enhancing the robustness of the system. Experiments show that the maximum estimation error of the second-order circuit model is kept below 0.20 V. Under various simulation conditions and interference factors, the estimation error of the unscented Kalman filter is as high as 2%, whereas that of the improved Kalman filtering algorithm is kept well under 1.00%, with the errors reduced by 0.80%, laying a sound foundation for follow-up research on the battery management system.
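The unscented transformation underlying the filter above can be sketched for a scalar state. This is the standard textbook form, not the paper's double transformation or its adaptive noise correction; the parameter defaults (`alpha`, `beta`, `kappa`) and the measurement function `voltage_model` are assumptions chosen for illustration.

```python
import math


def unscented_transform(mean, var, func, alpha=1.0, beta=0.0, kappa=2.0):
    """Propagate a scalar Gaussian (mean, var) through func using
    three sigma points and return the transformed mean and variance."""
    n = 1  # state dimension
    lam = alpha ** 2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * var)
    sigma_points = [mean, mean + spread, mean - spread]

    # Weights for the mean and the covariance estimates.
    w_mean = [lam / (n + lam)] + [1.0 / (2 * (n + lam))] * 2
    w_cov = [w_mean[0] + (1 - alpha ** 2 + beta)] + w_mean[1:]

    ys = [func(x) for x in sigma_points]
    y_mean = sum(w * y for w, y in zip(w_mean, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(w_cov, ys))
    return y_mean, y_var


def voltage_model(soc):
    """Hypothetical linear stand-in for a state-of-charge to voltage map."""
    return 2.0 * soc + 1.0


y_mean, y_var = unscented_transform(1.0, 4.0, voltage_model)
```

For a linear function the transform reproduces the exact mean and variance, which makes it a convenient sanity check before substituting a non-linear battery model.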