Group Iterative Spectrum Thresholding for Super-Resolution Sparse Spectral Selection
Recently, sparsity-based algorithms have been proposed for super-resolution
spectrum estimation. However, to achieve adequately high resolution in
real-world signal analysis, the dictionary atoms have to be close to each other
in frequency, thereby resulting in a coherent design. The popular convex
compressed sensing methods break down in the presence of high coherence and large
noise. We propose a new regularization approach to handle model collinearity
and obtain parsimonious frequency selection simultaneously. It takes advantage
of the pairing structure of sine and cosine atoms in the frequency dictionary.
A probabilistic spectrum screening is also developed for fast computation in
high dimensions. A data-resampling version of high-dimensional Bayesian
Information Criterion is used to determine the regularization parameters.
Experiments show the efficacy and efficiency of the proposed algorithms in
challenging situations with small sample size, high frequency resolution, and
low signal-to-noise ratio.
Prediction Weighted Maximum Frequency Selection
Shrinkage estimators that possess the ability to produce sparse solutions
have become increasingly important to the analysis of today's complex datasets.
Examples include the LASSO, the Elastic-Net and their adaptive counterparts.
Estimation of penalty parameters, however, still presents difficulties. While
variable-selection-consistent procedures have been developed, their finite
sample performance can often be less than satisfactory. We develop a new
strategy for variable selection using the adaptive LASSO and adaptive
Elastic-Net estimators with a diverging number of parameters. The basic idea first involves
using the trace paths of their LARS solutions to bootstrap estimates of maximum
frequency (MF) models conditioned on dimension. Conditioning on dimension
effectively mitigates overfitting; to deal with underfitting, however, these MFs
are then prediction-weighted, and it is shown that not only can consistent
model selection be achieved, but that attractive convergence rates can as well,
leading to excellent finite sample performance. Detailed numerical studies are
carried out on both simulated and real datasets. Extensions to the class of
generalized linear models are also detailed.
Comment: This manuscript contains 41 pages and 14 figures.
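The paper's LARS-path machinery is not reproduced in the abstract; as a simplified illustration of bootstrapping maximum-frequency (MF) models conditioned on dimension, the sketch below substitutes a plain coordinate-descent LASSO for the LARS solutions (all function names and the tolerance are assumptions for illustration only):

```python
import numpy as np
from collections import Counter

def lasso_cd(X, y, lam, n_iter=100):
    """Plain coordinate-descent LASSO; a stand-in for the LARS paths in the paper."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            r = y - X @ beta + X[:, j] * beta[j]       # partial residual excluding j
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return beta

def bootstrap_mf_models(X, y, lam, n_boot=50, seed=0):
    """Bootstrap the selected support; record the most frequent model per dimension."""
    rng = np.random.default_rng(seed)
    n = len(y)
    counts = {}
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                    # resample rows with replacement
        beta = lasso_cd(X[idx], y[idx], lam)
        model = tuple(int(j) for j in np.flatnonzero(np.abs(beta) > 1e-8))
        counts.setdefault(len(model), Counter())[model] += 1
    # Maximum-frequency model within each dimension.
    return {d: c.most_common(1)[0][0] for d, c in counts.items()}
```

Tabulating the most frequent model separately for each dimension is the "conditioning on dimension" step; the prediction-weighting of these MF models is not shown here.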
Uncertainty Quantification in Lasso-Type Regularization Problems
Regularization techniques, which sit at the interface of statistical modeling and machine learning, are often used in engineering and other applied sciences to tackle high-dimensional regression-type problems. While a number of regularization methods are in common use, the 'Least Absolute Shrinkage and Selection Operator', or simply LASSO, is popular because of its efficient variable selection property. This property helps to deal with problems where the number of predictors is larger than the total number of observations, as the LASSO shrinks the coefficients of unimportant parameters to zero. In this chapter, both frequentist and Bayesian approaches to the LASSO are discussed, with particular attention to the problem of uncertainty quantification of regression parameters. For the frequentist approach, we discuss a refit technique as well as the classical bootstrap method; for the Bayesian approach, we make use of the equivalent LASSO formulation based on a Laplace prior on the model parameters.
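To illustrate the classical bootstrap mentioned in the chapter, consider the simplest orthonormal/scalar case, where the LASSO solution reduces to soft thresholding of the sample mean; a percentile bootstrap interval for that estimator might be sketched as follows (function names are illustrative assumptions):

```python
import numpy as np

def soft_threshold(z, lam):
    """Scalar LASSO solution in the orthonormal case: soft thresholding."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def bootstrap_ci(x, lam, level=0.95, n_boot=1000, seed=0):
    """Percentile bootstrap interval for a soft-thresholded sample mean."""
    rng = np.random.default_rng(seed)
    n = len(x)
    stats = [soft_threshold(x[rng.integers(0, n, n)].mean(), lam)  # resample, re-estimate
             for _ in range(n_boot)]
    lo, hi = np.quantile(stats, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi
```

Even in this toy setting one can see the known caveat: the shrinkage biases the interval downward by roughly the threshold, which is one motivation for the refit and Bayesian alternatives the chapter discusses.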
Incorporating Prior Knowledge into Nonparametric Conditional Density Estimation
In this paper, the problem of sparse nonparametric conditional density estimation based on samples and prior knowledge is addressed. The prior knowledge may be restricted to parts of the state space and given as generative models in the form of mean-function constraints, or as probabilistic models in the form of Gaussian mixtures. The key idea is the introduction of additional constraints and a modified kernel function into the conditional density estimation problem. This use of prior knowledge is a generic solution applicable to all nonparametric conditional density estimation approaches phrased as constrained optimization problems. The quality of the estimates, their sparseness, and the improvements achievable by using prior knowledge are demonstrated in experiments for both Support-Vector-Machine-based and integral-distance-based conditional density estimation.
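The constrained-optimization formulation of the paper is not reproduced here; as a minimal unconstrained baseline for what is being estimated, a kernel (Nadaraya-Watson-type) conditional density estimate can be sketched as below (the function name and bandwidths are illustrative assumptions):

```python
import numpy as np

def cond_density(x_query, y_grid, X, Y, hx=0.2, hy=0.2):
    """Kernel (Nadaraya-Watson-type) estimate of the conditional density f(y | x)."""
    wx = np.exp(-0.5 * ((X - x_query) / hx) ** 2)          # kernel weights in x
    K = np.exp(-0.5 * ((y_grid[:, None] - Y[None, :]) / hy) ** 2) \
        / (hy * np.sqrt(2.0 * np.pi))                      # normalized Gaussian kernel in y
    return (K @ wx) / wx.sum()                             # weighted mixture over samples
```

The paper's contribution would enter this picture as extra constraints on the optimization (e.g. mean-function constraints on parts of the state space) and a modified kernel, rather than the plain weighted mixture shown.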
Distributed Detection and Estimation in Wireless Sensor Networks
In this article we consider the problems of distributed detection and
estimation in wireless sensor networks. In the first part, we provide a general
framework aimed to show how an efficient design of a sensor network requires a
joint organization of in-network processing and communication. Then, we recall
the basic features of consensus algorithms, a fundamental tool for reaching
globally optimal decisions through a distributed approach. The main part of the
paper starts by addressing the distributed estimation problem. We first show an
entirely decentralized approach, where observations and estimates are
performed without the intervention of a fusion center. Then, we consider the
case where the estimation is performed at a fusion center, showing how to
allocate quantization bits and transmit powers in the links between the nodes
and the fusion center, in order to accommodate the requirement on the maximum
estimation variance, under a constraint on the global transmit power. We extend
the approach to the detection problem. Also in this case, we consider the
distributed approach, where every node can achieve a globally optimal decision,
and the case where the decision is taken at a central node. In the latter case,
we show how to allocate coding bits and transmit power in order to maximize the
detection probability, under constraints on the false alarm rate and the global
transmit power. Then, we generalize consensus algorithms, illustrating a
distributed procedure that converges to the projection of the observation
vector onto a signal subspace. We then address the issue of energy consumption
in sensor networks, thus showing how to optimize the network topology in order
to minimize the energy necessary to achieve a global consensus. Finally, we
address the problem of matching the topology of the network to the graph
describing the statistical dependencies among the observed variables.
Comment: 92 pages, 24 figures. To appear in E-Reference Signal Processing, R.
Chellapa and S. Theodoridis, Eds., Elsevier, 201
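The consensus step recalled in the article can be made concrete with a standard linear-iteration sketch; the weight choice below (Metropolis-Hastings weights) is one common option for undirected networks, not necessarily the one used in the article:

```python
import numpy as np

def metropolis_weights(adj):
    """Metropolis-Hastings weights: symmetric, doubly stochastic on an undirected graph."""
    n = len(adj)
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()          # self-weight keeps rows summing to one
    return W

def consensus(x0, W, n_iter=300):
    """Linear consensus iterations x(k+1) = W x(k); converges to the network average."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = W @ x                           # each node mixes its neighbors' values
    return x
```

Because W is doubly stochastic and the graph is connected, every node's value converges to the global average of the initial observations, i.e. a globally optimal decision is reached without any fusion center.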
Contributions to Penalized Estimation
Penalized estimation is a useful statistical technique for preventing overfitting. In penalized methods, the objective function typically takes the form of a loss function for goodness of fit plus a penalty function for complexity control. In this dissertation, we develop several new penalization approaches for various statistical models. These methods aim for effective model selection and accurate parameter estimation. The first part introduces the notion of partially overlapping models across multiple regression models on the same dataset. Such underlying models have at least one overlapping structure sharing the same parameter value. To recover the sparse and overlapping structure, we develop adaptive composite M-estimation (ACME) by doubly penalizing a composite loss function, formed as a weighted linear combination of the individual loss functions. ACME automatically circumvents the model misspecification issues inherent in other composite-loss-based estimators. The second part proposes a new refit method and its applications in the regression setting through model combination: ensemble variable selection (EVS) and ensemble variable selection and estimation (EVE). The refit method estimates the regression parameters restricted to the covariates selected by a penalization method. EVS combines model selection decisions from multiple penalization methods and selects the optimal model via the refit and a model selection criterion. EVE considers a factorizable likelihood-based model whose full likelihood is the product of likelihood factors. EVE is shown to be both asymptotically and computationally efficient. The third part studies a sparse undirected Gaussian graphical model (GGM) to explain conditional dependence patterns among variables. The edge set consists of conditionally dependent variable pairs and corresponds to the nonzero elements of the inverse covariance matrix under the Gaussian assumption.
We propose a consistent validation method for edge selection (CoVES) in the penalization framework. CoVES selects candidate edge sets along the solution path and finds the optimal set via repeated subsampling. CoVES requires simple computation and delivers excellent performance in our numerical studies.
Doctor of Philosophy
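The refit step described in the second part can be sketched in a few lines: given a support selected by any penalization method, the coefficients are re-estimated by ordinary least squares restricted to that support, removing the shrinkage bias of the penalized fit (the function name is an illustrative assumption):

```python
import numpy as np

def refit_ols(X, y, support):
    """Refit: re-estimate only the selected coefficients by ordinary least squares."""
    beta = np.zeros(X.shape[1])
    if len(support) > 0:
        cols = list(support)
        # Unpenalized least squares restricted to the selected covariates.
        beta[cols] = np.linalg.lstsq(X[:, cols], y, rcond=None)[0]
    return beta
```

EVS would then apply such a refit to the supports proposed by several penalization methods and pick the winner with a model selection criterion; that outer loop is not shown here.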
A penalized inference approach to stochastic block modelling of community structure in the Italian Parliament
We analyse bill cosponsorship networks in the Italian Chamber of Deputies. In comparison with other parliaments, a distinguishing feature of the Chamber is its large number of political groups. Our analysis aims to infer the pattern of collaboration between these groups from data on bill cosponsorships. We propose an extension of stochastic block models to edge-valued graphs and derive measures of group productivity and of collaboration between political parties. As the proposed model involves a large number of parameters, we pursue a penalized likelihood approach that enables us to infer a sparse reduced graph displaying the collaborations between political parties.
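The paper's specific extension and penalty are not reproduced in the abstract; as a minimal illustration of the edge-valued ingredient, the log-likelihood of a Poisson-weighted stochastic block model with known block labels can be written as follows (the function name and the toy rate matrix are assumptions for illustration):

```python
from math import lgamma, log

def poisson_sbm_loglik(A, z, rates):
    """Log-likelihood of an edge-valued (Poisson) stochastic block model, labels known."""
    ll = 0.0
    n = len(z)
    for i in range(n):
        for j in range(i + 1, n):            # undirected graph: visit each pair once
            lam = rates[z[i]][z[j]]          # rate depends only on the two blocks
            # Poisson log-pmf of the observed edge weight A[i][j].
            ll += A[i][j] * log(lam) - lam - lgamma(A[i][j] + 1)
    return ll
```

In the cosponsorship setting, edge weights would count cosponsored bills between deputies, and the penalization would shrink the block-to-block rate matrix toward a sparse reduced graph of collaborations.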