Rate Optimal Denoising of Simultaneously Sparse and Low Rank Matrices
We study minimax rates for denoising simultaneously sparse and low rank
matrices in high dimensions. We show that an iterative thresholding algorithm
achieves (near) optimal rates adaptively under mild conditions for a large
class of loss functions. Numerical experiments on synthetic datasets also
demonstrate the competitive performance of the proposed method.
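The flavor of such an iterative thresholding scheme can be sketched as a proximal-gradient loop that alternates a data-fidelity step, entrywise soft-thresholding (for sparsity), and SVD truncation (for low rank). This is a generic illustration, not the authors' algorithm; the function name and all tuning parameters (`rank`, `lam`, `step`, `n_iter`) are illustrative.

```python
import numpy as np

def soft_threshold(A, t):
    """Entrywise soft-thresholding operator (promotes sparsity)."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def denoise_sparse_low_rank(Y, rank=2, lam=0.1, step=0.5, n_iter=50):
    """Generic proximal-gradient sketch for denoising a matrix assumed to
    be both sparse and low rank.  Each pass takes a gradient step toward
    the data, soft-thresholds the entries, and truncates the SVD."""
    X = np.zeros_like(Y)
    for _ in range(n_iter):
        X = X - step * (X - Y)             # pull the iterate toward the data
        X = soft_threshold(X, step * lam)  # sparsity-inducing prox step
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # enforce low rank
    return X
```

The returned estimate has rank at most `rank` by construction, while the thresholding step zeroes out small entries.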
Comment: Boosting Algorithms: Regularization, Prediction and Model Fitting
The authors are doing the readers of Statistical Science a true service with
a well-written and up-to-date overview of boosting that originated with the
seminal algorithms of Freund and Schapire. Equally, we are grateful for
high-level software that will permit a larger readership to experiment with, or
simply apply, boosting-inspired model fitting. The authors show us a world of
methodology that illustrates how a fundamental innovation can penetrate every
nook and cranny of statistical thinking and practice. They introduce the reader
to one particular interpretation of boosting and then give a display of its
potential with extensions from classification (where it all started) to least
squares, exponential family models, survival analysis, to base-learners other
than trees such as smoothing splines, to degrees of freedom and regularization,
and to fascinating recent work in model selection. The uninitiated reader will
find that the authors did a nice job of presenting a certain coherent and
useful interpretation of boosting. The other reader, though, who has watched
the business of boosting for a while, may have quibbles with the authors over
details of the historic record and, more importantly, over their optimism about
the current state of theoretical knowledge. In fact, as much as "the
statistical view" has proven fruitful, it has also resulted in some ideas
about why boosting works that may be misconceived, and in some recommendations
that may be misguided. [arXiv:0804.2752]
Comment: Published at http://dx.doi.org/10.1214/07-STS242B in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
Functional principal components analysis via penalized rank one approximation
Two existing approaches to functional principal components analysis (FPCA)
are due to Rice and Silverman (1991) and Silverman (1996), both based on
maximizing variance but introducing penalization in different ways. In this
article we propose an alternative approach to FPCA using penalized rank one
approximation to the data matrix. Our contributions are four-fold: (1) by
considering invariance under scale transformation of the measurements, the new
formulation sheds light on how regularization should be performed for FPCA and
suggests an efficient power algorithm for computation; (2) it naturally
incorporates spline smoothing of discretized functional data; (3) the
connection with smoothing splines also facilitates construction of
cross-validation or generalized cross-validation criteria for smoothing
parameter selection that allows efficient computation; (4) different smoothing
parameters are permitted for different FPCs. The methodology is illustrated
with a real data example and a simulation.
Comment: Published at http://dx.doi.org/10.1214/08-EJS218 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org).
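A penalized rank-one fit of the kind described can be sketched as an alternating, power-style algorithm. The sketch below minimizes an illustrative criterion ||X - u v^T||^2 + lam * v^T Om v, with Om a second-difference roughness penalty on the loading vector; the paper's exact criterion and scaling conventions may differ, and the function name is hypothetical.

```python
import numpy as np

def penalized_rank_one(X, lam=1.0, n_iter=100):
    """Alternating least squares for the illustrative criterion
    ||X - u v^T||^2 + lam * v^T Om v, where Om penalizes roughness of v.
    Returns scores u and a unit-norm smoothed loading v."""
    n, p = X.shape
    D = np.diff(np.eye(p), n=2, axis=0)  # (p-2) x p second-difference matrix
    Om = D.T @ D                         # roughness penalty matrix
    v = np.ones(p) / np.sqrt(p)          # flat starting loading
    for _ in range(n_iter):
        u = X @ v / (v @ v)              # least-squares update of the scores
        # Penalized least-squares update of the loading (the smoothing step).
        v = np.linalg.solve((u @ u) * np.eye(p) + lam * Om, X.T @ u)
    s = np.linalg.norm(v)
    return u * s, v / s                  # rescale so u v^T is unchanged
```

Because the penalty enters only the v-update, each iteration is as cheap as an ordinary power step plus one linear solve.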
Local Multidimensional Scaling for Nonlinear Dimension Reduction, Graph Drawing and Proximity Analysis
In the past decade there has been a resurgence of interest in nonlinear dimension reduction. Among new proposals are “Local Linear Embedding” (LLE), “Isomap,” and Kernel Principal Components Analysis (KPCA), which all construct global low-dimensional embeddings from local affine or metric information. We introduce a competing method called “Local Multidimensional Scaling” (LMDS). Like LLE, Isomap, and KPCA, LMDS constructs its global embedding from local information, but it instead uses a combination of MDS and “force-directed” graph drawing. We apply the force paradigm to create localized versions of MDS stress functions with a tuning parameter to adjust the strength of nonlocal repulsive forces.
We solve the problem of tuning parameter selection with a meta-criterion that measures how well the sets of K-nearest neighbors agree between the data and the embedding. Tuned LMDS seems to be able to outperform MDS, PCA, LLE, Isomap, and KPCA, as illustrated with two well-known image datasets. The meta-criterion can also be used in a pointwise version as a diagnostic tool for measuring the local adequacy of embeddings and thereby detect local problems in dimension reductions.
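The neighborhood-agreement idea behind such a meta-criterion can be sketched directly: compute the K-nearest-neighbor set of each point in the data space and in the embedding space, and average the overlap. This is an illustration of the idea, not the paper's exact formula; `knn_agreement` is a hypothetical helper name.

```python
import numpy as np

def knn_agreement(X, Y, k=5):
    """Mean overlap (in [0, 1]) between the k-nearest-neighbor sets of each
    point computed in data space X and in embedding space Y.  A value of
    1.0 means every local neighborhood is perfectly preserved."""
    def knn_sets(Z):
        d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)           # exclude each point itself
        return np.argsort(d, axis=1)[:, :k]   # indices of k nearest points
    nx, ny = knn_sets(X), knn_sets(Y)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nx, ny)]
    return float(np.mean(overlaps))
```

A pointwise diagnostic, as mentioned in the abstract, would simply report the per-point overlaps instead of their mean.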
tourr: An R Package for Exploring Multivariate Data with Projections
This paper describes an R package which produces tours of multivariate data. The package includes functions for creating different types of tours, including grand, guided, and little tours, which project multivariate data (p-D) down to 1, 2, 3, or, more generally, d (≤ p) dimensions. The projected data can be rendered as densities or histograms, scatterplots, anaglyphs, glyphs, scatterplot matrices, parallel coordinate plots, time series or images, and viewed using an R graphics device, passed to GGobi, or saved to disk. A tour path can be stored for visualisation or replay. With this package it is possible to quickly experiment with different, and new, approaches to tours of data. This paper contains animations that can be viewed using the Adobe Acrobat PDF viewer.
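The core operation of any tour, projecting p-dimensional data onto a d-dimensional orthonormal frame, can be sketched in a few lines. This is an illustrative Python sketch, not the tourr package's R implementation; a real tour additionally interpolates smoothly between successive frames to produce an animation.

```python
import numpy as np

def random_projection_frame(p, d, rng):
    """Draw a random p x d orthonormal frame, the basic building block of
    a grand tour: orthonormalize a Gaussian matrix via QR."""
    A = rng.standard_normal((p, d))
    Q, _ = np.linalg.qr(A)   # columns of Q are orthonormal
    return Q

def project(X, frame):
    """Project n x p data down to d dimensions along the given frame."""
    return X @ frame
```

Repeatedly drawing frames and rendering the projected data yields the "grand tour"; a guided tour would instead choose each new frame to increase an index of interestingness.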
Visual Comparison of Datasets Using Mixture Decompositions
This article describes how a mixture of two densities, f0 and f1, may be decomposed into a different mixture consisting of three densities. These new densities, f+, f−, and f=, summarize differences between f0 and f1: f+ is high in areas of excess of f1 compared to f0; f− represents deficiency of f1 compared to f0 in the same way; f= represents commonality between f1 and f0. The supports of f+ and f− are disjoint. This decomposition of the mixture of f0 and f1 is similar to the set-theoretic decomposition of the union of two sets A and B into the disjoint sets A\B, B\A, and A ∩ B. Sample points from f0 and f1 can be assigned to one of these three densities, allowing the differences between f0 and f1 to be visualized in a single plot, a visual hypothesis test of whether f0 is equal to f1. We describe two similar such decompositions and contrast their behavior under the null hypothesis f0 = f1, giving some insight into how such plots may be interpreted. We present two examples of uses of these methods: visualization of departures from independence, and of a two-class classification problem. Other potential applications are discussed.
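A decomposition of this style can be computed directly from two density estimates evaluated on a common grid, using the pointwise identity f0 + f1 = 2·min(f0, f1) + (f1 − f0)_+ + (f0 − f1)_+. The sketch below assumes equal mixture weights and leaves the three components unnormalized; the article's exact decompositions may be weighted differently, and `mixture_decompose` is a hypothetical helper name.

```python
import numpy as np

def mixture_decompose(f0, f1):
    """Split two density estimates on a common grid into excess (f_plus),
    deficiency (f_minus), and common (f_eq) parts, so that
    f0 + f1 == 2 * f_eq + f_plus + f_minus pointwise."""
    f_eq = np.minimum(f0, f1)             # commonality between f0 and f1
    f_plus = np.maximum(f1 - f0, 0.0)     # where f1 exceeds f0
    f_minus = np.maximum(f0 - f1, 0.0)    # where f0 exceeds f1
    return f_plus, f_minus, f_eq
```

By construction f_plus and f_minus can never be positive at the same grid point, mirroring the disjoint supports of f+ and f− in the abstract.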
A Conversation with Peter Huber
Peter J. Huber was born on March 25, 1934, in Wohlen, a small town in the
Swiss countryside. He obtained a diploma in mathematics in 1958 and a Ph.D. in
mathematics in 1961, both from ETH Zurich. His thesis was in pure mathematics,
but he then decided to go into statistics. He spent 1961--1963 as a postdoc at
the statistics department in Berkeley where he wrote his first and most famous
paper on robust statistics, "Robust Estimation of a Location Parameter."
After a position as a visiting professor at Cornell University, he became a
full professor at ETH Zurich. He worked at ETH until 1978, interspersed by
visiting positions at Cornell, Yale, Princeton and Harvard. After leaving ETH,
he held professor positions at Harvard University 1978--1988, at MIT
1988--1992, and finally at the University of Bayreuth from 1992 until his
retirement in 1999. He now lives in Klosters, a village in the Grisons in the
Swiss Alps. Peter Huber has published four books and over 70 papers on
statistics and data analysis. In addition, he has written more than a dozen
papers and two books on Babylonian mathematics, astronomy and history. In 1972,
he delivered the Wald lectures. He is a fellow of the IMS, of the American
Association for the Advancement of Science, and of the American Academy of Arts
and Sciences. In 1988 he received a Humboldt Award and in 1994 an honorary
doctorate from the University of Neuchâtel. In addition to his fundamental
results in robust statistics, Peter Huber made important contributions to
computational statistics, strategies in data analysis, and applications of
statistics in fields such as crystallography, EEGs, and human growth curves.
Comment: Published at http://dx.doi.org/10.1214/07-STS251 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).