A Double Joint Bayesian Approach for J-Vector Based Text-dependent Speaker Verification
The j-vector has proven very effective for text-dependent speaker
verification with short-duration speech. However, current state-of-the-art
back-end classifiers, e.g. the joint Bayesian model, cannot make full use of such
deep features. In this paper, we generalize the standard joint Bayesian
approach to model the multi-faceted information in the j-vector explicitly and
jointly. In our generalization, the j-vector is modeled as the output of
a generative Double Joint Bayesian (DoJoBa) model, which contains several kinds
of latent variables. With DoJoBa, we can explicitly build a model that
combines the multiple heterogeneous kinds of information in the j-vectors. In the
verification step, we calculate a likelihood that describes whether two
j-vectors have consistent labels or not. On the public RSR2015 data corpus,
the experimental results show that our approach achieves 0.02% EER and
0.02% EER for the impostor wrong and impostor correct cases, respectively.
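For orientation, the verification score of the standard joint Bayesian model, which the abstract above generalizes, can be sketched as follows. This is the usual single-label formulation, not the DoJoBa model itself, whose equations the abstract does not give. The symbols $S_\mu$, $S_\varepsilon$, $H_I$ and $H_E$ are assumed notation.

```latex
% Standard joint Bayesian model (assumed notation): a feature x is the sum
% of a label-dependent part \mu and a within-class residual \varepsilon.
\[
  x = \mu + \varepsilon, \qquad
  \mu \sim \mathcal{N}(0, S_\mu), \qquad
  \varepsilon \sim \mathcal{N}(0, S_\varepsilon).
\]
% Joint covariance of a pair (x_1, x_2) when the labels match (H_I)
% and when they differ (H_E).
\[
  \Sigma_I =
  \begin{bmatrix}
    S_\mu + S_\varepsilon & S_\mu \\
    S_\mu & S_\mu + S_\varepsilon
  \end{bmatrix},
  \qquad
  \Sigma_E =
  \begin{bmatrix}
    S_\mu + S_\varepsilon & 0 \\
    0 & S_\mu + S_\varepsilon
  \end{bmatrix}.
\]
% Verification score: log-likelihood ratio of the two hypotheses.
\[
  r(x_1, x_2) =
  \log \frac{\mathcal{N}\!\left([x_1; x_2];\, 0, \Sigma_I\right)}
            {\mathcal{N}\!\left([x_1; x_2];\, 0, \Sigma_E\right)}.
\]
```

A pair is accepted as having consistent labels when $r(x_1, x_2)$ exceeds a threshold chosen on development data.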
Substructure and Boundary Modeling for Continuous Action Recognition
This paper introduces a probabilistic graphical model for continuous action
recognition with two novel components: a substructure transition model and a
discriminative boundary model. The first component encodes the sparse and
global temporal transition prior between action primitives in a state-space model
to handle the large spatial-temporal variations within an action class. The
second component enforces the action duration constraint in a discriminative
way to locate the transition boundaries between actions more accurately. The
two components are integrated into a unified graphical structure to enable
effective training and inference. Our comprehensive experimental results on
both public and in-house datasets show that, with the capability to incorporate
additional information that had not been explicitly or efficiently modeled by
previous methods, our proposed algorithm achieved significantly improved
performance for continuous action recognition.
Comment: Detailed version of the CVPR 2012 paper. 15 pages, 6 figures
The NLMS algorithm with time-variant optimum stepsize derived from a Bayesian network perspective
In this article, we derive a new stepsize adaptation for the normalized least
mean square (NLMS) algorithm by describing the task of linear acoustic echo
cancellation from a Bayesian network perspective. Similar to the well-known
Kalman filter equations, we model the acoustic wave propagation from the
loudspeaker to the microphone by a latent state vector and define a linear
observation equation (to model the relation between the state vector and the
observation) as well as a linear process equation (to model the temporal
progress of the state vector). Based on additional assumptions on the
statistics of the random variables in the observation and process equations, we
apply the expectation-maximization (EM) algorithm to derive an NLMS-like filter
adaptation. By exploiting the conditional independence rules for Bayesian
networks, we reveal that the resulting EM-NLMS algorithm has a stepsize update
equivalent to the optimal-stepsize calculation proposed by Yamamoto and
Kitayama in 1982, which has been adopted in many textbooks. The main difference
is that the instantaneous stepsize value is estimated in the M step of the EM
algorithm (instead of being approximated by artificially extending the acoustic
echo path). The EM-NLMS algorithm is experimentally verified for synthesized
scenarios with both white noise and male speech as input signals.
Comment: 4 pages, 1 page of references
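As a point of reference for the abstract above, here is a minimal sketch of the conventional NLMS update for echo-path identification. It uses a fixed stepsize `mu`, not the time-variant stepsize estimated in the M step of the EM-NLMS algorithm. The function names, the toy echo path `h`, and the noise levels are all illustrative assumptions, not taken from the paper.

```python
import random


def nlms(x, d, num_taps, mu=0.5, eps=1e-8):
    """Conventional NLMS with a FIXED stepsize mu (assumption: the paper's
    EM-derived, time-variant stepsize is not reproduced here)."""
    w = [0.0] * num_taps      # current estimate of the echo path
    buf = [0.0] * num_taps    # most recent input samples, newest first
    errors = []
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]                        # shift in the new sample
        y = sum(wi * bi for wi, bi in zip(w, buf))   # estimated echo
        e = dn - y                                   # a priori error
        norm = sum(b * b for b in buf) + eps         # input energy (regularized)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors


def simulate(num_samples=2000, seed=0):
    """Toy scenario: white-noise loudspeaker signal, hypothetical 4-tap echo path."""
    rng = random.Random(seed)
    h = [0.8, -0.4, 0.2, 0.1]  # hypothetical echo path, for illustration only
    x = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]
    d = []
    for n in range(num_samples):
        echo = sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        d.append(echo + rng.gauss(0.0, 0.01))  # microphone = echo + small noise
    w, errors = nlms(x, d, num_taps=len(h))
    return h, w, errors


if __name__ == "__main__":
    h, w, _ = simulate()
    print("true path:     ", h)
    print("NLMS estimate: ", [round(wi, 3) for wi in w])
```

In this sketch the stepsize stays constant throughout adaptation. The approach described in the abstract would instead re-estimate it at every time step.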