Rate Optimal Denoising of Simultaneously Sparse and Low Rank Matrices
We study minimax rates for denoising simultaneously sparse and low rank
matrices in high dimensions. We show that an iterative thresholding algorithm
achieves (near) optimal rates adaptively under mild conditions for a large
class of loss functions. Numerical experiments on synthetic datasets also
demonstrate the competitive performance of the proposed method.
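The flavor of such an iterative thresholding scheme can be sketched as a proximal-gradient loop that alternates a data-fidelity step, entrywise soft-thresholding (for sparsity), and SVD truncation (for low rank). This is a generic illustration, not the authors' algorithm; the function name and all tuning parameters (`rank`, `lam`, `step`, `n_iter`) are illustrative.

```python
import numpy as np

def soft_threshold(A, t):
    """Entrywise soft-thresholding operator (promotes sparsity)."""
    return np.sign(A) * np.maximum(np.abs(A) - t, 0.0)

def denoise_sparse_low_rank(Y, rank=2, lam=0.1, step=0.5, n_iter=50):
    """Generic proximal-gradient sketch for denoising a matrix assumed to
    be both sparse and low rank.  Each pass takes a gradient step toward
    the data, soft-thresholds the entries, and truncates the SVD."""
    X = np.zeros_like(Y)
    for _ in range(n_iter):
        X = X - step * (X - Y)             # pull the iterate toward the data
        X = soft_threshold(X, step * lam)  # sparsity-inducing prox step
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X = (U[:, :rank] * s[:rank]) @ Vt[:rank]  # enforce low rank
    return X
```

The returned estimate has rank at most `rank` by construction, while the thresholding step zeroes out small entries.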
Comment: Boosting Algorithms: Regularization, Prediction and Model Fitting
The authors are doing the readers of Statistical Science a true service with
a well-written and up-to-date overview of boosting that originated with the
seminal algorithms of Freund and Schapire. Equally, we are grateful for
high-level software that will permit a larger readership to experiment with, or
simply apply, boosting-inspired model fitting. The authors show us a world of
methodology that illustrates how a fundamental innovation can penetrate every
nook and cranny of statistical thinking and practice. They introduce the reader
to one particular interpretation of boosting and then give a display of its
potential with extensions from classification (where it all started) to least
squares, exponential family models, survival analysis, to base-learners other
than trees such as smoothing splines, to degrees of freedom and regularization,
and to fascinating recent work in model selection. The uninitiated reader will
find that the authors did a nice job of presenting a certain coherent and
useful interpretation of boosting. The other reader, though, who has watched
the business of boosting for a while, may have quibbles with the authors over
details of the historic record and, more importantly, over their optimism about
the current state of theoretical knowledge. In fact, as much as "the
statistical view" has proven fruitful, it has also resulted in some ideas
about why boosting works that may be misconceived, and in some recommendations
that may be misguided. [arXiv:0804.2752]
Comment: Published at http://dx.doi.org/10.1214/07-STS242B in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).
Functional principal components analysis via penalized rank one approximation
Two existing approaches to functional principal components analysis (FPCA)
are due to Rice and Silverman (1991) and Silverman (1996), both based on
maximizing variance but introducing penalization in different ways. In this
article we propose an alternative approach to FPCA using penalized rank one
approximation to the data matrix. Our contributions are four-fold: (1) by
considering invariance under scale transformation of the measurements, the new
formulation sheds light on how regularization should be performed for FPCA and
suggests an efficient power algorithm for computation; (2) it naturally
incorporates spline smoothing of discretized functional data; (3) the
connection with smoothing splines also facilitates construction of
cross-validation or generalized cross-validation criteria for smoothing
parameter selection that allows efficient computation; (4) different smoothing
parameters are permitted for different FPCs. The methodology is illustrated
with a real data example and a simulation.
Comment: Published at http://dx.doi.org/10.1214/08-EJS218 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org).
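A penalized rank-one fit of the kind described can be sketched as an alternating, power-style algorithm. The sketch below minimizes an illustrative criterion ||X - u v^T||^2 + lam * v^T Om v, with Om a second-difference roughness penalty on the loading vector; the paper's exact criterion and scaling conventions may differ, and the function name is hypothetical.

```python
import numpy as np

def penalized_rank_one(X, lam=1.0, n_iter=100):
    """Alternating least squares for the illustrative criterion
    ||X - u v^T||^2 + lam * v^T Om v, where Om penalizes roughness of v.
    Returns scores u and a unit-norm smoothed loading v."""
    n, p = X.shape
    D = np.diff(np.eye(p), n=2, axis=0)  # (p-2) x p second-difference matrix
    Om = D.T @ D                         # roughness penalty matrix
    v = np.ones(p) / np.sqrt(p)          # flat starting loading
    for _ in range(n_iter):
        u = X @ v / (v @ v)              # least-squares update of the scores
        # Penalized least-squares update of the loading (the smoothing step).
        v = np.linalg.solve((u @ u) * np.eye(p) + lam * Om, X.T @ u)
    s = np.linalg.norm(v)
    return u * s, v / s                  # rescale so u v^T is unchanged
```

Because the penalty enters only the v-update, each iteration is as cheap as an ordinary power step plus one linear solve.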
Local Multidimensional Scaling for Nonlinear Dimension Reduction, Graph Drawing and Proximity Analysis
In the past decade there has been a resurgence of interest in nonlinear dimension reduction. Among new proposals are “Local Linear Embedding” (LLE), “Isomap,” and Kernel Principal Components Analysis (KPCA), which all construct global low-dimensional embeddings from local affine or metric information. We introduce a competing method called “Local Multidimensional Scaling” (LMDS). Like LLE, Isomap, and KPCA, LMDS constructs its global embedding from local information, but it instead uses a combination of MDS and “force-directed” graph drawing. We apply the force paradigm to create localized versions of MDS stress functions with a tuning parameter to adjust the strength of nonlocal repulsive forces.
We solve the problem of tuning parameter selection with a meta-criterion that measures how well the sets of K-nearest neighbors agree between the data and the embedding. Tuned LMDS seems to be able to outperform MDS, PCA, LLE, Isomap, and KPCA, as illustrated with two well-known image datasets. The meta-criterion can also be used in a pointwise version as a diagnostic tool for measuring the local adequacy of embeddings and thereby detect local problems in dimension reductions.
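The neighborhood-agreement idea behind such a meta-criterion can be sketched directly: compute the K-nearest-neighbor set of each point in the data space and in the embedding space, and average the overlap. This is an illustration of the idea, not the paper's exact formula; `knn_agreement` is a hypothetical helper name.

```python
import numpy as np

def knn_agreement(X, Y, k=5):
    """Mean overlap (in [0, 1]) between the k-nearest-neighbor sets of each
    point computed in data space X and in embedding space Y.  A value of
    1.0 means every local neighborhood is perfectly preserved."""
    def knn_sets(Z):
        d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)           # exclude each point itself
        return np.argsort(d, axis=1)[:, :k]   # indices of k nearest points
    nx, ny = knn_sets(X), knn_sets(Y)
    overlaps = [len(set(a) & set(b)) / k for a, b in zip(nx, ny)]
    return float(np.mean(overlaps))
```

A pointwise diagnostic, as mentioned in the abstract, would simply report the per-point overlaps instead of their mean.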
tourr: An R Package for Exploring Multivariate Data with Projections
This paper describes an R package which produces tours of multivariate data. The package includes functions for creating different types of tours, including grand, guided, and little tours, which project multivariate data (p-D) down to 1, 2, 3, or, more generally, d (≤ p) dimensions. The projected data can be rendered as densities or histograms, scatterplots, anaglyphs, glyphs, scatterplot matrices, parallel coordinate plots, time series or images, and viewed using an R graphics device, passed to GGobi, or saved to disk. A tour path can be stored for visualisation or replay. With this package it is possible to quickly experiment with different, and new, approaches to tours of data. This paper contains animations that can be viewed using the Adobe Acrobat PDF viewer.
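The core operation of any tour, projecting p-dimensional data onto a d-dimensional orthonormal frame, can be sketched in a few lines. This is an illustrative Python sketch, not the tourr package's R implementation; a real tour additionally interpolates smoothly between successive frames to produce an animation.

```python
import numpy as np

def random_projection_frame(p, d, rng):
    """Draw a random p x d orthonormal frame, the basic building block of
    a grand tour: orthonormalize a Gaussian matrix via QR."""
    A = rng.standard_normal((p, d))
    Q, _ = np.linalg.qr(A)   # columns of Q are orthonormal
    return Q

def project(X, frame):
    """Project n x p data down to d dimensions along the given frame."""
    return X @ frame
```

Repeatedly drawing frames and rendering the projected data yields the "grand tour"; a guided tour would instead choose each new frame to increase an index of interestingness.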
Visual Comparison of Datasets Using Mixture Decompositions
This article describes how a mixture of two densities, f0 and f1, may be decomposed into a different mixture consisting of three densities. These new densities, f+, f−, and f=, summarize differences between f0 and f1: f+ is high in areas of excess of f1 compared to f0; f− represents deficiency of f1 compared to f0 in the same way; f= represents commonality between f1 and f0. The supports of f+ and f− are disjoint. This decomposition of the mixture of f0 and f1 is similar to the set-theoretic decomposition of the union of two sets A and B into the disjoint sets A\B, B\A, and A ∩ B. Sample points from f0 and f1 can be assigned to one of these three densities, allowing the differences between f0 and f1 to be visualized in a single plot, a visual hypothesis test of whether f0 is equal to f1. We describe two similar such decompositions and contrast their behavior under the null hypothesis f0 = f1, giving some insight into how such plots may be interpreted. We present two examples of uses of these methods: visualization of departures from independence, and of a two-class classification problem. Other potential applications are discussed.
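A decomposition of this style can be computed directly from two density estimates evaluated on a common grid, using the pointwise identity f0 + f1 = 2·min(f0, f1) + (f1 − f0)_+ + (f0 − f1)_+. The sketch below assumes equal mixture weights and leaves the three components unnormalized; the article's exact decompositions may be weighted differently, and `mixture_decompose` is a hypothetical helper name.

```python
import numpy as np

def mixture_decompose(f0, f1):
    """Split two density estimates on a common grid into excess (f_plus),
    deficiency (f_minus), and common (f_eq) parts, so that
    f0 + f1 == 2 * f_eq + f_plus + f_minus pointwise."""
    f_eq = np.minimum(f0, f1)             # commonality between f0 and f1
    f_plus = np.maximum(f1 - f0, 0.0)     # where f1 exceeds f0
    f_minus = np.maximum(f0 - f1, 0.0)    # where f0 exceeds f1
    return f_plus, f_minus, f_eq
```

By construction f_plus and f_minus can never be positive at the same grid point, mirroring the disjoint supports of f+ and f− in the abstract.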
A Conversation with Peter Huber
Peter J. Huber was born on March 25, 1934, in Wohlen, a small town in the
Swiss countryside. He obtained a diploma in mathematics in 1958 and a Ph.D. in
mathematics in 1961, both from ETH Zurich. His thesis was in pure mathematics,
but he then decided to go into statistics. He spent 1961--1963 as a postdoc at
the statistics department in Berkeley where he wrote his first and most famous
paper on robust statistics, "Robust Estimation of a Location Parameter."
After a position as a visiting professor at Cornell University, he became a
full professor at ETH Zurich. He worked at ETH until 1978, interspersed by
visiting positions at Cornell, Yale, Princeton and Harvard. After leaving ETH,
he held professor positions at Harvard University 1978--1988, at MIT
1988--1992, and finally at the University of Bayreuth from 1992 until his
retirement in 1999. He now lives in Klosters, a village in the Grisons in the
Swiss Alps. Peter Huber has published four books and over 70 papers on
statistics and data analysis. In addition, he has written more than a dozen
papers and two books on Babylonian mathematics, astronomy and history. In 1972,
he delivered the Wald lectures. He is a fellow of the IMS, of the American
Association for the Advancement of Science, and of the American Academy of Arts
and Sciences. In 1988 he received a Humboldt Award and in 1994 an honorary
doctorate from the University of Neuchâtel. In addition to his fundamental
results in robust statistics, Peter Huber made important contributions to
computational statistics, strategies in data analysis, and applications of
statistics in fields such as crystallography, EEGs, and human growth curves.
Comment: Published at http://dx.doi.org/10.1214/07-STS251 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org).