1,595 research outputs found
On statistics, computation and scalability
How should statistical procedures be designed so as to be scalable
computationally to the massive datasets that are increasingly the norm? When
coupled with the requirement that an answer to an inferential question be
delivered within a certain time budget, this question has significant
repercussions for the field of statistics. With the goal of identifying
"time-data tradeoffs," we investigate some of the statistical consequences of
computational perspectives on scalability, in particular divide-and-conquer
methodology and hierarchies of convex relaxations.

Published at http://dx.doi.org/10.3150/12-BEJSP17 in Bernoulli
(http://isi.cbs.nl/bernoulli/) by the International Statistical
Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm).
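A minimal sketch of the divide-and-conquer idea (hypothetical estimator and data; the paper treats far more general time-data tradeoffs): split the data into shards, estimate on each shard in parallel, and combine the per-shard answers by averaging.

    import numpy as np

    def divide_and_conquer_mean(x, num_shards):
        # Partition the data into disjoint shards, compute the estimate on
        # each shard independently (embarrassingly parallel), then average
        # the per-shard estimates into a single answer. With equal-size
        # shards, the average of shard means equals the full-data mean.
        shards = np.array_split(x, num_shards)
        return np.mean([shard.mean() for shard in shards])

    rng = np.random.default_rng(0)
    x = rng.normal(loc=1.0, scale=2.0, size=1_000_000)
    print(divide_and_conquer_mean(x, num_shards=16))  # close to 1.0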
Bayesian factorizations of big sparse tensors
It has become routine to collect data that are structured as multiway arrays
(tensors). There is an enormous literature on low rank and sparse matrix
factorizations, but limited consideration of extensions to the tensor case in
statistics. The most common low rank tensor factorization relies on parallel
factor analysis (PARAFAC), which expresses a rank tensor as a sum of rank
one tensors. When observations are only available for a tiny subset of the
cells of a big tensor, the low rank assumption is not sufficient and PARAFAC
has poor performance. We induce an additional layer of dimension reduction by
allowing the effective rank to vary across dimensions of the table. For
concreteness, we focus on a contingency table application. Taking a Bayesian
approach, we place priors on terms in the factorization and develop an
efficient Gibbs sampler for posterior computation. Theory is provided showing
posterior concentration rates in high-dimensional settings, and the methods are
shown to have excellent performance in simulations and several real data
applications.
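As a concrete illustration of the PARAFAC form used above, a three-way rank-k tensor is a sum of k outer products of vectors; the sketch below (hypothetical dimensions and factor matrices A, B, C, not the paper's notation) reconstructs such a tensor with numpy:

    import numpy as np

    def parafac_reconstruct(A, B, C):
        # X[i, j, l] = sum_r A[i, r] * B[j, r] * C[l, r]: each term r is a
        # rank one tensor, so X has rank at most k = A.shape[1].
        return np.einsum('ir,jr,lr->ijl', A, B, C)

    k = 3
    rng = np.random.default_rng(0)
    A, B, C = (rng.standard_normal((d, k)) for d in (4, 5, 6))
    X = parafac_reconstruct(A, B, C)  # a 4 x 5 x 6 tensor of rank at most 3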
Highly parallel sparse Cholesky factorization
Several fine-grained parallel algorithms were developed and compared to compute the Cholesky factorization of a sparse matrix. The experimental implementations are on the Connection Machine, a distributed-memory SIMD machine whose programming model conceptually supplies one processor per data element. In contrast to special-purpose algorithms in which the matrix structure conforms to the connection structure of the machine, the focus is on matrices with arbitrary sparsity structure. The most promising algorithm is one whose inner loop performs several dense factorizations simultaneously on a 2-D grid of processors. Virtually any massively parallel dense factorization algorithm can be used as the key subroutine. The sparse code attains execution rates comparable to those of the dense subroutine. Although at present architectural limitations prevent the dense factorization from realizing its potential efficiency, it is concluded that a regular data parallel architecture can be used efficiently to solve arbitrarily structured sparse problems. A performance model is also presented and used to analyze the algorithms.
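For context on the dense kernel that serves as the key subroutine, here is a minimal serial Cholesky factorization (illustrative only; the Connection Machine implementation distributes this computation over a 2-D grid of processors and applies it to many dense blocks at once):

    import numpy as np

    def cholesky(A):
        # Return lower-triangular L with A = L @ L.T, for A symmetric
        # positive definite. Column j is computed from columns 0..j-1.
        n = A.shape[0]
        L = np.zeros_like(A, dtype=float)
        for j in range(n):
            # Diagonal entry: remove contributions of earlier columns.
            L[j, j] = np.sqrt(A[j, j] - L[j, :j] @ L[j, :j])
            # Update the subdiagonal part of column j.
            L[j+1:, j] = (A[j+1:, j] - L[j+1:, :j] @ L[j, :j]) / L[j, j]
        return L

    A = np.array([[4., 2.], [2., 3.]])
    L = cholesky(A)
    assert np.allclose(L @ L.T, A)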
- …