262 research outputs found
Direction of Arrival with One Microphone, a few LEGOs, and Non-Negative Matrix Factorization
Conventional approaches to sound source localization require at least two
microphones. It is known, however, that people with unilateral hearing loss can
also localize sounds. Monaural localization is possible thanks to the
scattering by the head, though it hinges on learning the spectra of the various
sources. We take inspiration from this human ability to propose algorithms for
accurate sound source localization using a single microphone embedded in an
arbitrary scattering structure. The structure modifies the frequency response
of the microphone in a direction-dependent way giving each direction a
signature. While knowing those signatures is sufficient to localize sources of
white noise, localizing speech is much more challenging: it is an ill-posed
inverse problem which we regularize by prior knowledge in the form of learned
non-negative dictionaries. We demonstrate a monaural speech localization
algorithm based on non-negative matrix factorization that does not depend on
sophisticated, designed scatterers. In fact, we show experimental results with
ad hoc scatterers made of LEGO bricks. Even with these rudimentary structures
we can accurately localize arbitrary speakers; that is, we do not need to learn
the dictionary for the particular speaker to be localized. Finally, we discuss
multi-source localization and the related limitations of our approach.Comment: This article has been accepted for publication in IEEE/ACM
Transactions on Audio, Speech, and Language processing (TASLP
Effective Discriminative Feature Selection with Non-trivial Solutions
Feature selection and feature transformation, the two main ways to reduce
dimensionality, are often presented separately. In this paper, a feature
selection method is proposed by combining the popular transformation based
dimensionality reduction method Linear Discriminant Analysis (LDA) and sparsity
regularization. We impose row sparsity on the transformation matrix of LDA
through -norm regularization to achieve feature selection, and
the resultant formulation optimizes for selecting the most discriminative
features and removing the redundant ones simultaneously. The formulation is
extended to the -norm regularized case: which is more likely to
offer better sparsity when . Thus the formulation is a better
approximation to the feature selection problem. An efficient algorithm is
developed to solve the -norm based optimization problem and it is
proved that the algorithm converges when . Systematical experiments
are conducted to understand the work of the proposed method. Promising
experimental results on various types of real-world data sets demonstrate the
effectiveness of our algorithm
Penalized wavelet monotone regression
In this paper we focus on nonparametric estimation of a constrained regression function using penalized wavelet regression techniques. This results into a convex op- timization problem under linear constraints. Necessary and sufficient conditions for existence of a unique solution are discussed. The estimator is easily obtained via the dual formulation of the optimization problem. In particular we investigate a penalized wavelet monotone regression estimator. We establish the rate of convergence of this estimator, and illustrate its finite sample performance via a simulation study. We also compare its performance with that of a recently proposed constrained estimator. An illustration to some real data is given
A New Sparse Representation Algorithm for 3D Human Pose Estimation
This paper addresses the problem of recovering 3D human pose from single 2D images using Sparse Representation. While recent Sparse Representation (SR) based 3D human pose estimation methods have attained promising results estimating human poses from single images, their performance depends on the availability of large labeled datasets. However, in many real world applications, accessing to sufficient labeled data may be expensive and/or time consuming, but it is relatively easy to acquire a large amount of unlabeled data. Moreover, all SR based 3D pose estimation methods only consider the information of the input feature space and they cannot utilize the information of the pose space. In this paper, we propose a new framework based on sparse representation for 3D human pose estimation which uses both the labeled and unlabeled data. Furthermore, the proposed method can exploit the information of the pose space to improve the pose estimation accuracy. Experimental results show that the performance of the proposed method is significantly better than the state of the art 3D human pose estimation methods
- …