5 research outputs found

    Joint DOA and Multi-Pitch Estimation Using Block Sparsity

    In this paper, we propose a novel method to estimate the fundamental frequencies and directions-of-arrival (DOAs) of multi-pitch signals impinging on a sensor array. Formulating the estimation as a group sparse convex optimization problem, we use the alternating direction method of multipliers (ADMM) to estimate both the temporal and spatial correlation of the array signal. After first jointly estimating the fundamental frequencies and times-of-arrival (TOAs) for each sensor and sound source, we form a non-linear least squares estimate to obtain the DOAs. Numerical simulations indicate the preferable performance of the proposed estimator as compared to current state-of-the-art methods.

    Multi-Pitch Estimation Exploiting Block Sparsity

    We study the problem of estimating the fundamental frequencies of a signal containing multiple harmonically related sinusoidal components using a novel block sparse signal representation. An efficient algorithm for solving the resulting optimization problem is devised, exploiting a novel variable step-size alternating direction method of multipliers (ADMM). The resulting algorithm has guaranteed convergence and shows notable robustness to the f0 vs. f0/2 ambiguity problem. The superiority of the proposed method, as compared to earlier presented estimation techniques, is demonstrated using both simulated and measured audio signals.
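
The f0 vs. f0/2 ambiguity mentioned in this abstract can be illustrated with a minimal sketch. It assumes an ideal harmonic model with exactly integer multiples of the fundamental; the helper `harmonics` and the example values (200 Hz, 5 harmonics) are hypothetical and are not taken from the paper. Every harmonic of f0 is also a harmonic of f0/2, so a candidate at half the true pitch can explain the signal equally well unless the unused odd harmonics are penalized.

```python
def harmonics(f0, n_harmonics):
    """Return the first n_harmonics integer multiples of f0 (in Hz)."""
    return [k * f0 for k in range(1, n_harmonics + 1)]


# Hypothetical example: a true pitch of 200 Hz with 5 harmonics,
# and a competing candidate at half that pitch with 10 harmonics.
true_pitch = harmonics(200, 5)
half_pitch = harmonics(100, 10)

# Every harmonic of f0 is also a harmonic of f0/2 ...
explained = set(true_pitch) <= set(half_pitch)

# ... but the f0/2 candidate also carries harmonics that the signal lacks.
spurious = sorted(set(half_pitch) - set(true_pitch))

print(explained)
print(spurious)
```

The extra components in `spurious` are what a sparsity-promoting estimator can use to prefer the true pitch over its sub-octave.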

    A Parametric Method for Multi-Pitch Estimation

    This thesis proposes a novel method for multi-pitch estimation. The method operates by posing pitch estimation as a sparse recovery problem, which is solved using convex optimization techniques. In that respect, it is an extension of an earlier presented estimation method based on the group-LASSO. However, by introducing an adaptive total variation penalty, the proposed method requires fewer user-supplied parameters, thereby simplifying the estimation procedure. The method is shown to have comparable or superior performance in low-noise environments when compared to three standard multi-pitch estimation methods as well as the predecessor method. Also presented is a scheme for automatic selection of the regularization parameters, thereby making the method more user-friendly. Used together with this scheme, the proposed method is shown to yield accurate, although not statistically efficient, pitch estimates when evaluated on synthetic speech data.

    Sparse Modeling of Grouped Line Spectra

    This licentiate thesis focuses on clustered parametric models for estimation of line spectra, when the spectral content of a signal source is assumed to exhibit some form of grouping. Unlike previous parametric approaches, which generally require explicit knowledge of the model orders, this thesis exploits sparse modeling, where the orders are implicitly chosen. For line spectra, the non-linear parametric model is approximated by a linear system, containing an overcomplete basis of candidate frequencies, called a dictionary, and a large set of linear response variables that select and weight the components in the dictionary. Frequency estimates are obtained by solving a convex optimization program, where the sum of squared residuals is minimized. To discourage overfitting and to induce certain structure in the solution, different convex penalty functions are introduced into the optimization. The cost trade-off between fit and penalty is set by user parameters so as to approximate the true number of spectral lines in the signal, which implies that the response variable will be sparse, i.e., have few non-zero elements. Thus, instead of explicit model orders, the orders are implicitly set by this trade-off. For grouped variables, the dictionary is customized, and appropriate convex penalties selected, so that the solution becomes group sparse, i.e., has few groups with non-zero variables. In an array of sensors, the specific time-delays and attenuations will depend on the source and sensor positions. By modeling this, one may estimate the location of a source. In this thesis, a novel joint location and grouped frequency estimator is proposed, which exploits sparse modeling for both spectral and spatial estimates, showing robustness against sources with overlapping frequency content. For audio signals, this thesis uses two different features for clustering. Pitch is a perceptual property of sound that may be described by the harmonic model, i.e., by a group of spectral lines at integer multiples of a fundamental frequency, which we estimate by exploiting a novel adaptive total variation penalty. The other feature, chroma, is a concept in music theory, collecting pitches whose frequencies differ by powers of 2 into groups. Using a chroma dictionary, together with appropriate group sparse penalties, we propose an automatic transcription of the chroma content of a signal.
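
To make the notion of a group sparse solution concrete, here is a minimal sketch of group soft-thresholding, the standard proximal operator of a sum-of-group-norms (group-LASSO type) penalty. The abstract does not state which penalties or solver the thesis uses, so this is an illustrative assumption; the function name `group_soft_threshold`, the toy amplitudes, and the value of `lam` are all hypothetical.

```python
import math


def group_soft_threshold(groups, lam):
    """Apply the proximal operator of lam * sum_g ||x_g||_2, group by group.

    Each group is shrunk toward zero by lam in Euclidean norm; groups whose
    norm is at most lam are set to zero entirely.
    """
    out = []
    for g in groups:
        norm = math.sqrt(sum(abs(v) ** 2 for v in g))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * v for v in g])
    return out


# Hypothetical amplitudes for three candidate pitches (one group each).
amplitudes = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]
shrunk = group_soft_threshold(amplitudes, lam=1.0)

# Indices of groups that survive the thresholding.
active = [i for i, g in enumerate(shrunk) if any(v != 0 for v in g)]
print(active)
```

A larger `lam` zeroes out more groups, which is how the trade-off between fit and penalty implicitly sets the number of active groups rather than requiring explicit model orders.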

    Enhancement of speech signals - with a focus on voiced speech models
