8,512 research outputs found
Robustness in sparse linear models: relative efficiency based on robust approximate message passing
Understanding efficiency in high dimensional linear models is a longstanding
problem of interest. Classical work with smaller dimensional problems dating
back to Huber and Bickel has illustrated the benefits of efficient loss
functions. When the number of parameters is of the same order as the sample
size , , an efficiency pattern different from the one of Huber
was recently established. In this work, we consider the effects of model
selection on the estimation efficiency of penalized methods. In particular, we
explore whether sparsity, results in new efficiency patterns when . In
the interest of deriving the asymptotic mean squared error for regularized
M-estimators, we use the powerful framework of approximate message passing. We
propose a novel, robust and sparse approximate message passing algorithm
(RAMP), that is adaptive to the error distribution. Our algorithm includes many
non-quadratic and non-differentiable loss functions. We derive its asymptotic
mean squared error and show its convergence, while allowing , with and . We identify new
patterns of relative efficiency regarding a number of penalized estimators,
when is much larger than . We show that the classical information bound
is no longer reachable, even for light--tailed error distributions. We show
that the penalized least absolute deviation estimator dominates the penalized
least square estimator, in cases of heavy--tailed distributions. We observe
this pattern for all choices of the number of non-zero parameters , both and . In non-penalized problems where ,
the opposite regime holds. Therefore, we discover that the presence of model
selection significantly changes the efficiency patterns.Comment: 49 pages, 10 figure
Tensor decompositions for learning latent variable models
This work considers a computationally and statistically efficient parameter
estimation method for a wide class of latent variable models---including
Gaussian mixture models, hidden Markov models, and latent Dirichlet
allocation---which exploits a certain tensor structure in their low-order
observable moments (typically, of second- and third-order). Specifically,
parameter estimation is reduced to the problem of extracting a certain
(orthogonal) decomposition of a symmetric tensor derived from the moments; this
decomposition can be viewed as a natural generalization of the singular value
decomposition for matrices. Although tensor decompositions are generally
intractable to compute, the decomposition of these specially structured tensors
can be efficiently obtained by a variety of approaches, including power
iterations and maximization approaches (similar to the case of matrices). A
detailed analysis of a robust tensor power method is provided, establishing an
analogue of Wedin's perturbation theorem for the singular vectors of matrices.
This implies a robust and computationally tractable estimation approach for
several popular latent variable models
Numerical Analysis
Acknowledgements: This article will appear in the forthcoming Princeton Companion to Mathematics, edited by Timothy Gowers with June Barrow-Green, to be published by Princeton University Press.\ud
\ud
In preparing this essay I have benefitted from the advice of many colleagues who corrected a number of errors of fact and emphasis. I have not always followed their advice, however, preferring as one friend put it, to "put my head above the parapet". So I must take full responsibility for errors and omissions here.\ud
\ud
With thanks to: Aurelio Arranz, Alexander Barnett, Carl de Boor, David Bindel, Jean-Marc Blanc, Mike Bochev, Folkmar Bornemann, Richard Brent, Martin Campbell-Kelly, Sam Clark, Tim Davis, Iain Duff, Stan Eisenstat, Don Estep, Janice Giudice, Gene Golub, Nick Gould, Tim Gowers, Anne Greenbaum, Leslie Greengard, Martin Gutknecht, Raphael Hauser, Des Higham, Nick Higham, Ilse Ipsen, Arieh Iserles, David Kincaid, Louis Komzsik, David Knezevic, Dirk Laurie, Randy LeVeque, Bill Morton, John C Nash, Michael Overton, Yoshio Oyanagi, Beresford Parlett, Linda Petzold, Bill Phillips, Mike Powell, Alex Prideaux, Siegfried Rump, Thomas Schmelzer, Thomas Sonar, Hans Stetter, Gil Strang, Endre SĂŒli, Defeng Sun, Mike Sussman, Daniel Szyld, Garry Tee, Dmitry Vasilyev, Andy Wathen, Margaret Wright and Steve Wright
False Discovery and Its Control in Low Rank Estimation
Models specified by low-rank matrices are ubiquitous in contemporary
applications. In many of these problem domains, the row/column space structure
of a low-rank matrix carries information about some underlying phenomenon, and
it is of interest in inferential settings to evaluate the extent to which the
row/column spaces of an estimated low-rank matrix signify discoveries about the
phenomenon. However, in contrast to variable selection, we lack a formal
framework to assess true/false discoveries in low-rank estimation; in
particular, the key source of difficulty is that the standard notion of a
discovery is a discrete one that is ill-suited to the smooth structure
underlying low-rank matrices. We address this challenge via a geometric
reformulation of the concept of a discovery, which then enables a natural
definition in the low-rank case. We describe and analyze a generalization of
the Stability Selection method of Meinshausen and B\"uhlmann to control for
false discoveries in low-rank estimation, and we demonstrate its utility
compared to previous approaches via numerical experiments
Parametric Regression on the Grassmannian
We address the problem of fitting parametric curves on the Grassmann manifold
for the purpose of intrinsic parametric regression. As customary in the
literature, we start from the energy minimization formulation of linear
least-squares in Euclidean spaces and generalize this concept to general
nonflat Riemannian manifolds, following an optimal-control point of view. We
then specialize this idea to the Grassmann manifold and demonstrate that it
yields a simple, extensible and easy-to-implement solution to the parametric
regression problem. In fact, it allows us to extend the basic geodesic model to
(1) a time-warped variant and (2) cubic splines. We demonstrate the utility of
the proposed solution on different vision problems, such as shape regression as
a function of age, traffic-speed estimation and crowd-counting from
surveillance video clips. Most notably, these problems can be conveniently
solved within the same framework without any specifically-tailored steps along
the processing pipeline.Comment: 14 pages, 11 figure
Improved Distributed Estimation Method for Environmental\ud time-variant Physical variables in Static Sensor Networks
In this paper, an improved distributed estimation scheme for static sensor networks is developed. The scheme is developed for environmental time-variant physical variables. The main contribution of this work is that the algorithm in [1]-[3] has been extended, and a filter has been designed with weights, such that the variance of the estimation errors is minimized, thereby improving the filter design considerably\ud
and characterizing the performance limit of the filter, and thereby tracking a time-varying signal. Moreover, certain parameter optimization is alleviated with the application of a particular finite impulse response (FIR) filter. Simulation results are showing the effectiveness of the developed estimation algorithm
A critical cluster analysis of 44 indicators of author-level performance
This paper explores the relationship between author-level bibliometric
indicators and the researchers the "measure", exemplified across five academic
seniorities and four disciplines. Using cluster methodology, the disciplinary
and seniority appropriateness of author-level indicators is examined.
Publication and citation data for 741 researchers across Astronomy,
Environmental Science, Philosophy and Public Health was collected in Web of
Science (WoS). Forty-four indicators of individual performance were computed
using the data. A two-step cluster analysis using IBM SPSS version 22 was
performed, followed by a risk analysis and ordinal logistic regression to
explore cluster membership. Indicator scores were contextualized using the
individual researcher's curriculum vitae. Four different clusters based on
indicator scores ranked researchers as low, middle, high and extremely high
performers. The results show that different indicators were appropriate in
demarcating ranked performance in different disciplines. In Astronomy the h2
indicator, sum pp top prop in Environmental Science, Q2 in Philosophy and
e-index in Public Health. The regression and odds analysis showed individual
level indicator scores were primarily dependent on the number of years since
the researcher's first publication registered in WoS, number of publications
and number of citations. Seniority classification was secondary therefore no
seniority appropriate indicators were confidently identified. Cluster
methodology proved useful in identifying disciplinary appropriate indicators
providing the preliminary data preparation was thorough but needed to be
supplemented by other analyses to validate the results. A general disconnection
between the performance of the researcher on their curriculum vitae and the
performance of the researcher based on bibliometric indicators was observed.Comment: 28 pages, 7 tables, 2 figures, 2 appendice
Creative clusters in Europe: a microdata approach
Creative industries are highly concentrated forming clusters. One of the main problems for the identification of clusters of creative industries in Europe is the lack of data, constrained in practice to regions (NUTS 2) and influenced by the heterogeneity in the definition of NUTS across countries. This research uses firm-level data geo-referenced at address level and geostatistical modeling to identify clusters of creative industries in fifteen European countries. The procedure is independent of administrative divisions and national boundaries and allows to produce a precise geography of the clusters of creative industries in Europe.
- âŠ