ODN: Opening the Deep Network for Open-set Action Recognition
In recent years, the performance of action recognition has been significantly
improved with the help of deep neural networks. Most of the existing action
recognition works hold the closed-set assumption that all action
categories are known beforehand and that deep networks can be trained well for
these categories. However, action recognition in the real world is essentially
an open-set problem: it is impossible to know all action
categories beforehand and consequently infeasible to prepare sufficient
training samples for those emerging categories. In this case, applying
closed-set recognition methods inevitably leads to unseen-category errors.
To address this challenge, we propose the Open Deep Network (ODN) for the
open-set action recognition task. Technically, ODN detects new categories
by applying a multi-class triplet thresholding method, and then dynamically
reconstructs the classification layer and "opens" the deep network by adding
predictors for new categories continually. In order to transfer the learned
knowledge to the new category, two novel methods, Emphasis Initialization and
Allometry Training, are adopted to initialize and incrementally train the new
predictor so that only a few samples are needed to fine-tune the model. Extensive
experiments show that ODN can effectively detect and recognize new categories
with little human intervention and is thus applicable to open-set action
recognition tasks in the real world. Moreover, ODN can even achieve performance
comparable to some closed-set methods.
Comment: 6 pages, 3 figures, ICME 201
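The detect-then-extend idea in the ODN abstract can be illustrated with a minimal sketch. It makes several assumptions that are not from the paper: a single acceptance threshold stands in for the multi-class triplet thresholding the abstract names, the new predictor row is initialized as the mean of the existing rows as a placeholder for Emphasis Initialization, and the function names `predict_open_set` and `open_layer` are hypothetical.

```python
from typing import List, Optional


def predict_open_set(scores: List[float], threshold: float) -> Optional[int]:
    """Return the index of the top-scoring known class, or None for 'unknown'.

    Assumption: one acceptance threshold on the top score. The paper's
    triplet thresholding is more elaborate and is not reproduced here.
    """
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None


def open_layer(weights: List[List[float]]) -> List[List[float]]:
    """Append one predictor row for a newly detected category.

    Assumption: the new row is the mean of the existing rows. This is a
    placeholder, not the paper's Emphasis Initialization.
    """
    n = len(weights)
    dim = len(weights[0])
    new_row = [sum(row[j] for row in weights) / n for j in range(dim)]
    return weights + [new_row]


if __name__ == "__main__":
    # A low top score is treated as an unseen category.
    print(predict_open_set([0.4, 0.3, 0.3], threshold=0.5))
    # The classification layer grows from 2 to 3 predictors.
    print(len(open_layer([[1.0, 3.0], [3.0, 5.0]])))
```

In this sketch, a sample rejected by `predict_open_set` would trigger a call to `open_layer`, after which the enlarged layer would be fine-tuned on the few labeled samples available for the new category.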
Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation
Recently, frequency domain all-neural beamforming methods have achieved
remarkable progress for multichannel speech separation. In parallel, the
integration of time domain network structures with beamforming has also gained
significant attention. This study proposes a novel all-neural beamforming
method in time domain and makes an attempt to unify the all-neural beamforming
pipelines for time domain and frequency domain multichannel speech separation.
The proposed model consists of two modules: separation and beamforming. Both
modules perform temporal-spectral-spatial modeling and are trained
end-to-end using a joint loss function. The novelty of this study is
twofold. Firstly, a time domain directional feature conditioned on the direction
of the target speaker is proposed, which can be jointly optimized within the
time domain architecture to enhance target signal estimation. Secondly, an
all-neural beamforming network in time domain is designed to refine the
pre-separated results. This module features parametric time-variant
beamforming coefficient estimation and does not explicitly follow the derivation
of optimal filters, which may impose a performance upper bound. The proposed method is
evaluated on simulated reverberant overlapped speech data derived from the
AISHELL-1 corpus. Experimental results demonstrate significant performance
improvements over frequency domain state-of-the-art methods, ideal magnitude
masks, and existing time domain neural beamforming methods.
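As a rough illustration of what time-variant beamforming coefficients mean in the time domain, the sketch below applies a filter-and-sum operation in which each frame has its own per-channel filter taps. This is an assumed, simplified formulation: in the paper the coefficients are estimated by a neural network, whereas here they are passed in directly, and the function name `filter_and_sum` and its argument layout are hypothetical.

```python
from typing import List


def filter_and_sum(
    channels: List[List[float]],
    filters: List[List[List[float]]],
    frame_len: int,
) -> List[float]:
    """Time-variant filter-and-sum over multichannel time domain samples.

    channels[c][t]   : sample t of microphone channel c
    filters[f][c][k] : tap k of the filter for frame f and channel c
                       (assumed to come from some estimator)
    frame_len        : number of samples that share one set of filters
    """
    num_samples = len(channels[0])
    out = []
    for t in range(num_samples):
        frame = t // frame_len
        acc = 0.0
        for c, signal in enumerate(channels):
            for k, tap in enumerate(filters[frame][c]):
                if t - k >= 0:  # skip taps that reach before the signal start
                    acc += tap * signal[t - k]
        out.append(acc)
    return out


if __name__ == "__main__":
    mics = [[1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0]]
    # Frame 0 passes channel 0; frame 1 passes channel 1 scaled by 2.
    taps = [[[1.0], [0.0]], [[0.0], [2.0]]]
    print(filter_and_sum(mics, taps, frame_len=2))
```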
Direct reduction and extraction of iron from nickel smelting slag coupling of preparation of cementing materials using gangue composition
Based on the properties of Fe and SiO2 in nickel slag, a process is proposed for preparing direct reduced iron (DRI) by coal-based direct reduction of nickel slag, with the gangue components used as raw material to prepare C2S (belite) and C3S (alite), achieving comprehensive utilization of the slag. The coupling of the iron reduction reaction with the formation reactions of the cementitious materials was realized through thermodynamic calculation and experiment. Reduction roasting products with iron, C3S, and C2S as the main phases were obtained through appropriate batching and temperature control of the reduction roasting reaction.
Adaptive iterative working state prediction based on the double unscented transformation and dynamic functioning for unmanned aerial vehicle lithium-ion batteries.
In lithium-ion batteries, the accuracy of state-of-charge estimation determines the power control accuracy and management reliability of the energy storage system. When unscented Kalman filtering is used to estimate the state of charge of lithium-ion batteries, a high rate of change in the pulse current degrades the algorithm's tracking and produces large estimation errors. In this study, the unscented Kalman filtering algorithm is improved to address this problem: the Kalman gain is boosted with dynamic function modules to improve system stability. The closed-circuit voltage of the system is predicted with two non-linear transformations to improve accuracy. Meanwhile, an adaptive algorithm is developed to predict and correct the system noise and observation noise, enhancing the robustness of the system. Experiments show that the maximum estimation error of the second-order circuit model is kept below 0.20 V. Under various simulation conditions and interference factors, the estimation error of the unscented Kalman filtering is as high as 2%, whereas that of the improved Kalman filtering algorithm is kept well under 1.00%, with the errors reduced by 0.80%, laying a sound foundation for follow-up research on battery management systems.
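For readers unfamiliar with the unscented transformation that underlies this filter, the following is a minimal scalar sketch. It assumes the basic formulation with a single scaling parameter `kappa` and a one-dimensional state. It does not reproduce the paper's double transformation, dynamic function modules, or adaptive noise correction, and the function name `unscented_transform` is chosen here for illustration.

```python
import math
from typing import Callable, Tuple


def unscented_transform(
    mean: float,
    var: float,
    f: Callable[[float], float],
    kappa: float = 2.0,
) -> Tuple[float, float]:
    """Propagate a scalar Gaussian (mean, var) through f using sigma points.

    Assumption: basic unscented transform with state dimension n = 1 and
    the same weights for the mean and the variance.
    """
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]

    ys = [f(p) for p in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var


if __name__ == "__main__":
    # For a linear map y = 2x + 1 the transform is exact up to rounding:
    # the mean maps to 2 * 0.5 + 1 and the variance scales by 2 ** 2.
    print(unscented_transform(0.5, 0.01, lambda x: 2.0 * x + 1.0))
```

In a state-of-charge estimator, `f` would be replaced by the battery model's non-linear state or measurement function, and the propagated mean and variance would feed the Kalman gain computation.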