21 research outputs found
Sparse Modeling of Grouped Line Spectra
This licentiate thesis focuses on clustered parametric models for estimation of line spectra, when the spectral content of a signal source is assumed to exhibit some form of grouping. Unlike previous parametric approaches, which generally require explicit knowledge of the model orders, this thesis exploits sparse modeling, where the orders are implicitly chosen. For line spectra, the non-linear parametric model is approximated by a linear system, containing an overcomplete basis of candidate frequencies, called a dictionary, and a large set of linear response variables that select and weight the components in the dictionary. Frequency estimates are obtained by solving a convex optimization program, where the sum of squared residuals is minimized. To discourage overfitting and to infer certain structure in the solution, different convex penalty functions are introduced into the optimization. The cost trade-off between fit and penalty is set by user parameters so as to approximate the true number of spectral lines in the signal, which implies that the response variable will be sparse, i.e., have few non-zero elements. Thus, instead of explicit model orders, the orders are implicitly set by this trade-off. For grouped variables, the dictionary is customized, and appropriate convex penalties selected, so that the solution becomes group sparse, i.e., has few groups with non-zero variables. In an array of sensors, the specific time-delays and attenuations will depend on the source and sensor positions. By modeling this, one may estimate the location of a source. In this thesis, a novel joint location and grouped frequency estimator is proposed, which exploits sparse modeling for both spectral and spatial estimates, showing robustness against sources with overlapping frequency content. For audio signals, this thesis uses two different features for clustering.
Pitch is a perceptual property of sound that may be described by the harmonic model, i.e., by a group of spectral lines at integer multiples of a fundamental frequency, which we estimate by exploiting a novel adaptive total variation penalty. The other feature, chroma, is a concept in music theory, collecting pitches whose frequencies differ by a power of 2 into groups. Using a chroma dictionary, together with appropriate group sparse penalties, we propose an automatic transcription of the chroma content of a signal.
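As a rough illustration of the dictionary idea in this abstract, the sketch below builds an overcomplete grid of candidate frequencies and solves an l1-penalized least-squares problem with a plain proximal-gradient (ISTA) loop. The solver, the grid size, the test signal, and the value of `lam` are all assumptions chosen for the example, not the thesis's estimator or its penalties.

```python
import numpy as np

def soft_threshold(z, t):
    """Complex soft-thresholding: shrink magnitudes by t, keep phases."""
    mag = np.abs(z)
    return z * np.maximum(1.0 - t / np.maximum(mag, 1e-12), 0.0)

def lasso_ista(A, y, lam, n_iter=500):
    """Minimize 0.5*||y - A x||_2^2 + lam*||x||_1 by proximal gradient (ISTA)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = A.conj().T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Hypothetical test signal: two on-grid complex sinusoids in white noise.
rng = np.random.default_rng(0)
N = 64
t = np.arange(N)
grid = np.arange(256) / 256                  # candidate frequencies (cycles/sample)
A = np.exp(2j * np.pi * np.outer(t, grid))   # N x 256 overcomplete dictionary
noise = 0.05 * (rng.standard_normal(N) + 1j * rng.standard_normal(N))
y = A[:, [40, 100]] @ np.array([1.0, 0.7]) + noise

x_hat = lasso_ista(A, y, lam=8.0)
# Dictionary entries that carry a substantial share of the estimated amplitude.
support = np.flatnonzero(np.abs(x_hat) > 0.3 * np.abs(x_hat).max())
print(grid[support])
```

The penalty weight `lam` plays the role of the user parameter mentioned above: raising it removes more dictionary entries, which is how the model order is set implicitly.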
Joint DOA and Multi-Pitch Estimation Using Block Sparsity
In this paper, we propose a novel method to estimate the fundamental frequencies and directions-of-arrival (DOA) of multi-pitch signals impinging on a sensor array. Formulating the estimation as a group sparse convex optimization problem, we use the alternating direction method of multipliers (ADMM) to estimate both temporal and spatial correlation of the array signal. By first jointly estimating both fundamental frequencies and times-of-arrival (TOAs) for each sensor and sound source, we then form a non-linear least squares estimate to obtain the DOAs. Numerical simulations indicate the preferable performance of the proposed estimator as compared to current state-of-the-art methods.
Sparse Chroma Estimation for Harmonic Audio
This work treats the estimation of the chromagram for harmonic audio signals using a block sparse reconstruction framework. Chroma has been used for decades as a key tool in audio analysis, and is typically formed using a Fourier-based framework that maps the fundamental frequency of a musical tone to its corresponding chroma. Such an approach often leads to problems with tone ambiguity, which we avoid by taking into account the harmonic structure and perceptual attributes in music. The performance of the proposed method is evaluated using real audio files, clearly showing preferable performance as compared to other commonly used methods.
An Adaptive Penalty Approach to Multi-Pitch Estimation
This work treats multi-pitch estimation, and in particular the common misclassification issue wherein the pitch at half of the true fundamental frequency, here referred to as a sub-octave, is chosen instead of the true pitch. Extending current methods, which use an extension of the Group LASSO for pitch estimation, this work introduces an adaptive total variation penalty, which both enforces group and block sparsity and deals with errors due to sub-octaves. The method is shown to outperform current state-of-the-art sparse methods, where the model orders are unknown, while also requiring fewer tuning parameters than these. The method is also shown to outperform several conventional pitch estimation methods, even when these are given oracle model orders.
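The sub-octave ambiguity can be seen in a small numerical sketch: every harmonic of a pitch at f0 is also an even harmonic of f0/2, so both candidates fit the data equally well, and only the shape of the amplitude envelope tells them apart. The sampling rate, pitch, and amplitudes below are assumed values, and the total variation of the amplitude magnitudes is computed purely as an illustration; it is not the adaptive penalty proposed in the paper.

```python
import numpy as np

def harmonic_block(f0, n_harm, t):
    """Columns are complex sinusoids at f0, 2*f0, ..., n_harm*f0 (in Hz)."""
    k = np.arange(1, n_harm + 1)
    return np.exp(2j * np.pi * np.outer(t, k * f0))

fs = 8000.0                      # assumed sampling rate in Hz
t = np.arange(400) / fs
f0 = 220.0                       # assumed true pitch in Hz
y = harmonic_block(f0, 4, t) @ np.array([1.0, 0.8, 0.6, 0.5])

def fit(f, n_harm):
    """Least-squares fit of one pitch candidate; returns residual norm and envelope TV."""
    B = harmonic_block(f, n_harm, t)
    a, *_ = np.linalg.lstsq(B, y, rcond=None)
    resid = np.linalg.norm(y - B @ a)
    tv = np.sum(np.abs(np.diff(np.abs(a))))  # roughness of the amplitude envelope
    return resid, tv

resid_true, tv_true = fit(f0, 4)       # true pitch, 4 harmonics
resid_sub, tv_sub = fit(f0 / 2, 8)     # sub-octave, 8 harmonics
print(resid_true, resid_sub)           # both fits explain the signal
print(tv_true, tv_sub)                 # the sub-octave envelope alternates between zero and non-zero
```

Because the sub-octave candidate uses the same non-zero amplitudes, the sum of magnitudes is identical for the two fits, whereas the envelope roughness differs markedly. This is one way to see why a smoothness-promoting penalty can help separate the two.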
Sparse Multi-Pitch and Panning Estimation of Stereophonic Signals
In this paper, we propose a novel multi-pitch estimator for stereophonic mixtures, allowing for pitch estimation on multi-channel audio even if the amplitude and delay panning parameters are unknown. The presented method does not require prior knowledge of the number of sources present in the mixture, nor of the number of harmonics in each source. The estimator is formulated using a sparse signal framework, and an efficient implementation using the ADMM is introduced. Numerical simulations indicate the preferable performance of the proposed method as compared to several commonly used multi-channel single pitch estimators, and a commonly used multi-pitch estimator.
Group-Sparse Regression : With Applications in Spectral Analysis and Audio Signal Processing
This doctoral thesis focuses on sparse regression, a statistical modeling tool for selecting valuable predictors in underdetermined linear models. By imposing different constraints on the structure of the variable vector in the regression problem, one obtains estimates which have sparse supports, i.e., where only a few of the elements in the response variable have non-zero values. The thesis collects six papers which, to varying extents, deal with the applications, implementations, modifications, translations, and other analysis of such problems. Sparse regression is often used to approximate additive models with intricate, non-linear, non-smooth or otherwise problematic functions, by creating an underdetermined model consisting of candidate values for these functions, and linear response variables which select among the candidates. Sparse regression is therefore a widely used tool in applications such as, e.g., image processing, audio processing, seismological and biomedical modeling, but is also frequently used for data mining applications such as, e.g., social network analytics, recommender systems, and other behavioral applications. Sparse regression is a subgroup of regularized regression problems, where a fitting term, often the sum of squared model residuals, is accompanied by a regularization term, which grows as the fit term shrinks, thereby trading off model fit for a sought sparsity pattern. Typically, the regression problems are formulated as convex optimization programs, a discipline in optimization where first-order conditions are sufficient for optimality, a local optimum is also a global optimum, and where numerical methods are abundant, approachable, and often very efficient. The main focus of this thesis is structured sparsity, where the linear predictors are clustered into groups, and sparsity is assumed to be correspondingly group-wise in the response variable.
The first three papers in the thesis, A-C, concern group-sparse regression for temporal identification and spatial localization of different features in audio signal processing. In Paper A, we derive a model for audio signals recorded on an array of microphones, arbitrarily placed in a three-dimensional space. In a two-step group-sparse modeling procedure, we first identify and separate the recorded audio sources, and then localize their origins in space. In Paper B, we examine the multi-pitch model for tonal audio signals, such as, e.g., musical tones, tonal speech, or mechanical sounds from combustion engines. It typically models the signal-of-interest using a group of spectral lines, located at integer multiples of a fundamental frequency. In this paper, we replace the regularizers used in previous works by a group-wise total variation function, promoting a smooth spectral envelope. The proposed combination of regularizers thereby avoids the common suboctave error, where the fundamental frequency is incorrectly estimated as half of its true value. In Paper C, we analyze the performance of group-sparse regression for classification by chroma, also known as pitch class, e.g., the musical note C, independent of the octave. The last three papers, D-F, are less application-specific than the first three, attempting to develop the methodology of sparse regression more independently of the application. Specifically, these papers look at model order selection in group-sparse regression, which is implicitly controlled by choosing a hyperparameter, prioritizing between the regularizer and the fitting term in the optimization problem. In Papers D and E, we examine a metric from array processing, termed the covariance fitting criterion, which is seemingly hyperparameter-free, and has been shown to yield sparse estimates for underdetermined linear systems.
In these papers, we propose a generalization of the covariance fitting criterion for group-sparsity, and show how it relates to the group-sparse regression problem. In Paper F, we derive a novel method for hyperparameter-selection in sparse and group-sparse regression problems. By analyzing how the noise propagates into the parameter estimates, and the corresponding decision rules for sparsity, we propose selecting the hyperparameter as a quantile from the distribution of the maximum noise component, which we sample from using the Monte Carlo method.
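Group-wise sparsity of the kind described in this abstract is commonly obtained with a group-LASSO penalty, whose proximal operator shrinks each group of coefficients jointly and sets whole groups to zero. The sketch below shows that operator on a toy vector. The function name, the grouping, and the threshold are hypothetical and only illustrate the mechanism, not any specific algorithm from the thesis.

```python
import numpy as np

def group_soft_threshold(x, groups, t):
    """Proximal operator of t * sum_g ||x_g||_2: shrink each group's norm by t."""
    out = np.zeros_like(x)
    for idx in groups:
        norm = np.linalg.norm(x[idx])
        if norm > t:                       # groups with norm <= t are zeroed entirely
            out[idx] = (1.0 - t / norm) * x[idx]
    return out

# Toy example with three groups of two coefficients each.
x = np.array([3.0, 4.0, 0.1, -0.2, 0.0, 1.0])
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
z = group_soft_threshold(x, groups, t=1.0)
print(z)
```

Only the first group has a Euclidean norm above the threshold, so it is the only one to survive, with every element scaled by the same factor.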
Hyperparameter-selection for sparse regression : A probabilistic approach
The choice of hyperparameter(s) notably affects the support recovery in LASSO-like sparse regression problems, acting as an implicit model order selection. Parameters are typically selected using cross-validation or various ad hoc approaches. These often overestimate the resulting model order, aiming to minimize the prediction error rather than maximizing the support recovery. In this work, we propose a probabilistic approach to selecting hyperparameters in order to maximize the support recovery, quantifying the type I error (false positive rate) using extreme value analysis, such that the regularization level is selected as an appropriate quantile. By instead solving the scaled LASSO problem, the proposed choice of hyperparameter becomes almost independent of the noise variance. Simulation examples illustrate how the proposed method outperforms both cross-validation and the Bayesian Information Criterion in terms of computational complexity and support recovery.
Hyperparameter Selection for Group-Sparse Regression: A Probabilistic Approach
This work analyzes the effects on support recovery for different choices of the hyper- or regularization parameter in LASSO-like sparse and group-sparse regression problems. The hyperparameter implicitly selects the model order of the solution, and is typically set using cross-validation (CV). This may be computationally prohibitive for large-scale problems, and also often overestimates the model order, as CV optimizes for prediction error rather than support recovery. In this work, we propose a probabilistic approach to select the hyperparameter, by quantifying the type I error (false positive rate) using extreme value analysis. From Monte Carlo simulations, one may draw inference on the upper tail of the distribution of the spurious parameter estimates, and the regularization level may be selected for a specified false positive rate. By solving the scaled group-LASSO problem, the choice of hyperparameter becomes independent of the noise variance. Furthermore, the effects on the false positive rate caused by collinearity in the dictionary are discussed, including ways of circumventing them. The proposed method is compared to other hyperparameter-selection methods in terms of support recovery, false positive rate, false negative rate, and computational complexity. Simulated data illustrate how the proposed method outperforms CV and comparable methods in both computational complexity and support recovery.
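The quantile idea can be sketched for the ordinary (non-group) LASSO under the simplifying assumption that the noise standard deviation `sigma` is known. For noise-only data, the LASSO solution is all-zero exactly when the regularization level is at least the largest absolute correlation between the dictionary columns and the noise, so a Monte Carlo quantile of that maximum bounds the probability of any false positive. The function name, the real-valued Gaussian dictionary, and the sample sizes are assumptions for illustration. The paper itself works with the scaled formulation, which avoids needing the noise variance.

```python
import numpy as np

def lambda_quantile(A, sigma, alpha=0.05, n_mc=2000, seed=0):
    """Monte Carlo (1 - alpha) quantile of max_i |a_i^T w| for w ~ N(0, sigma^2 I)."""
    rng = np.random.default_rng(seed)
    W = sigma * rng.standard_normal((A.shape[0], n_mc))  # noise-only realizations
    max_corr = np.max(np.abs(A.T @ W), axis=0)           # largest spurious correlation per draw
    return np.quantile(max_corr, 1.0 - alpha)

# Hypothetical dictionary with unit-norm columns.
rng = np.random.default_rng(1)
A = rng.standard_normal((50, 200))
A /= np.linalg.norm(A, axis=0)

lam = lambda_quantile(A, sigma=1.0, alpha=0.05)

# Empirical check on fresh noise: how often would noise alone exceed lam?
W_new = rng.standard_normal((50, 2000))
false_alarm = np.mean(np.max(np.abs(A.T @ W_new), axis=0) > lam)
print(lam, false_alarm)
```

Unlike cross-validation, this requires no repeated solves of the regression problem, only matrix products with simulated noise, which is consistent with the computational advantage described in the abstract.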
Computationally Efficient Robust Widely Linear Beamforming for Improper Non-Stationary Signals
In this work, we introduce a computationally efficient Kalman-filter based implementation of the robust widely linear (WL) minimum variance distortionless response (MVDR) beamformer. The beamformer is able to achieve the same performance as the recently derived robust WL MVDR beamformer, but avoids the computationally burdensome solution based on second-order cone programming (SOCP). Exploiting the recent Kalman-based regular robust MVDR beamformer, it extends this to also allow for non-circular sources and interferences. Numerical simulations illustrate the achieved performance.
Non-Parametric Data-Dependent Estimation of Spectroscopic Echo-Train Signals
This paper proposes a novel non-parametric estimator for spectroscopic echo-train signals, termed ETCAPA, to be used as a robust and reliable first-approach technique for new, unknown, or partly disturbed substances. Exploiting the complete echo structure for the signal of interest, the method reliably estimates all parameters of interest, enabling initial estimates for the identification procedure to follow. Extending the recent dCapon and dAPES algorithms, ETCAPA exploits a data-dependent filter-bank formulation together with a non-linear minimization to give a hitherto unobtained non-parametric estimate of the echo train decay. The proposed estimator is evaluated on both simulated and measured NQR signals, clearly showing the excellent performance of the method, even in the case of strong interferences.