Decomposition tables for experiments I. A chain of randomizations
One aspect of evaluating the design for an experiment is the discovery of the
relationships between subspaces of the data space. Initially we establish the
notation and methods for evaluating an experiment with a single randomization.
Starting with two structures, or orthogonal decompositions of the data space,
we describe how to combine them to form the overall decomposition for a
single-randomization experiment that is "structure balanced." The
relationships between the two structures are characterized using efficiency
factors. The decomposition is encapsulated in a decomposition table. Then, for
experiments that involve multiple randomizations forming a chain, we take
several structures that pairwise are structure balanced and combine them to
establish the form of the orthogonal decomposition for the experiment. In
particular, it is proven that the properties of the design for such an
experiment are derived in a straightforward manner from those of the individual
designs. We show how to formulate an extended decomposition table giving the
sources of variation, their relationships and their degrees of freedom, so that
competing designs can be evaluated.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/)
by the Institute of Mathematical Statistics (http://www.imstat.org);
DOI: http://dx.doi.org/10.1214/09-AOS717.
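As a concrete illustration of the bookkeeping such a table performs, the sketch below lists the sources of variation and their degrees of freedom for a randomized complete block design, the simplest single-randomization case; the function name and the choice of design are ours for illustration, not the paper's.

```python
# Hypothetical sketch: the degrees-of-freedom bookkeeping behind a
# decomposition table, for a randomized complete block design.

def decomposition_table(blocks: int, treatments: int):
    """Return (source, df) rows for a randomized complete block design.

    The data space has blocks * treatments dimensions; the table lists an
    orthogonal decomposition into sources of variation whose degrees of
    freedom sum to that dimension.
    """
    rows = [
        ("Mean", 1),
        ("Blocks", blocks - 1),
        ("Treatments", treatments - 1),
        ("Residual", (blocks - 1) * (treatments - 1)),
    ]
    # An orthogonal decomposition must span the whole data space.
    assert sum(df for _, df in rows) == blocks * treatments
    return rows

for source, df in decomposition_table(blocks=4, treatments=3):
    print(f"{source:<11}{df}")
```

Comparing such tables for competing designs (their sources and df) is what the paper's extended tables systematize for chains of randomizations.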
Decomposition tables for experiments. II. Two–one randomizations
We investigate structure for pairs of randomizations that do not follow each
other in a chain. These are unrandomized-inclusive, independent, coincident or
double randomizations. This involves taking several structures that satisfy
particular relations and combining them to form the appropriate orthogonal
decomposition of the data space for the experiment. We show how to establish
the decomposition table giving the sources of variation, their relationships
and their degrees of freedom, so that competing designs can be evaluated. This
leads to recommendations for when the different types of multiple randomization
should be used.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/)
by the Institute of Mathematical Statistics (http://www.imstat.org);
DOI: http://dx.doi.org/10.1214/09-AOS785.
Tensor Networks for Big Data Analytics and Large-Scale Optimization Problems
In this paper we review basic and emerging models and associated algorithms
for large-scale tensor networks, especially Tensor Train (TT) decompositions
using novel mathematical and graphical representations. We discuss the concept
of tensorization (i.e., creating very high-order tensors from lower-order
original data) and the super-compression of data achieved via quantized tensor
train (QTT) networks. The purpose of tensorization and quantization is to
achieve, via low-rank tensor approximations, "super" compression and a
meaningful, compact representation of structured data. The main objective of
this paper is to show how tensor networks can be used to solve a wide class of
big-data optimization problems (that are far from tractable by classical
numerical methods) by applying tensorization, performing all operations using
relatively small matrices and tensors, and applying iteratively optimized and
approximate tensor contractions.
Keywords: Tensor networks, tensor train (TT) decompositions, matrix product
states (MPS), matrix product operators (MPO), basic tensor operations,
tensorization, distributed representation of data, optimization problems for
very large-scale problems: generalized eigenvalue decomposition (GEVD),
PCA/SVD, canonical correlation analysis (CCA).
Comment: arXiv admin note: text overlap with arXiv:1403.204
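A minimal sketch of the tensorization-plus-TT idea, assuming NumPy and the standard sequential-SVD (TT-SVD) construction; the function name and truncation threshold are illustrative assumptions, not the paper's notation.

```python
# Hypothetical sketch: tensorize a long vector into a high-order tensor
# (QTT style) and factor it into small TT cores by sequential SVDs.
import numpy as np

def tt_svd(tensor, eps=1e-10):
    """Decompose `tensor` into TT cores G_k of shape (r_{k-1}, n_k, r_k)."""
    dims = tensor.shape
    cores = []
    rank = 1
    mat = tensor.reshape(rank * dims[0], -1)
    for k in range(len(dims) - 1):
        U, S, Vt = np.linalg.svd(mat, full_matrices=False)
        keep = max(1, int(np.sum(S > eps)))  # drop negligible singular values
        cores.append(U[:, :keep].reshape(rank, dims[k], keep))
        rank = keep
        mat = (S[:keep, None] * Vt[:keep]).reshape(rank * dims[k + 1], -1)
    cores.append(mat.reshape(rank, dims[-1], 1))
    return cores

# Tensorization: view a length-2**6 vector as a 6th-order 2x2x...x2 tensor,
# so all operations act on tiny 3-way cores instead of the full vector.
x = np.arange(2 ** 6, dtype=float)
cores = tt_svd(x.reshape([2] * 6))
```

For smooth or structured data (here a linear ramp) the TT ranks stay small, which is the source of the "super" compression the abstract describes.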
Harmonic analysis on a finite homogeneous space
In this paper, we study harmonic analysis on finite homogeneous spaces whose
associated permutation representation decomposes with multiplicity. After a
careful look at Frobenius reciprocity and transitivity of induction, and the
introduction of three types of spherical functions, we develop a theory of
Gelfand-Tsetlin bases for permutation representations. Then we study several
concrete examples on the symmetric groups, generalizing the Gelfand pair of the
Johnson scheme; we also consider statistical and probabilistic applications.
After that, we consider the composition of two permutation representations,
giving a noncommutative generalization of the Gelfand pair associated to the
ultrametric space; actually, we study the more general notion of crested
product. Finally, we consider the exponentiation action, generalizing the
decomposition of the Gelfand pair of the Hamming scheme; actually, we study a
more general construction that we call wreath product of permutation
representations, suggested by the study of finite lamplighter random walks. We
give several examples of concrete decompositions of permutation representations
and several explicit 'rules' of decomposition.
Comment: 69 pages.
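The character-theoretic computation behind such decompositions can be sketched in a few lines. As a toy case of our own choosing (not one of the paper's examples), the permutation representation of S_3 on 3 points decomposes multiplicity-free, which is exactly the Gelfand-pair condition for (S_3, S_2):

```python
# Hypothetical sketch: multiplicities of irreducibles in a permutation
# representation, via character inner products. Character data for S_3
# are standard.
from fractions import Fraction

# Conjugacy classes of S_3: (class size, fixed points of the action on {1,2,3}).
classes = [(1, 3), (3, 1), (2, 0)]   # identity, transpositions, 3-cycles
irreducibles = {
    "trivial":  [1, 1, 1],
    "sign":     [1, -1, 1],
    "standard": [2, 0, -1],
}
order = sum(size for size, _ in classes)  # |S_3| = 6

def multiplicity(chi):
    """<permutation character, chi> = (1/|G|) * sum over classes of size*fix*chi."""
    return Fraction(sum(size * fix * c for (size, fix), c in zip(classes, chi)),
                    order)

for name, chi in irreducibles.items():
    print(name, multiplicity(chi))  # trivial and standard appear once, sign not at all
```

When every multiplicity is 0 or 1, as here, the commutant algebra is commutative and spherical-function techniques apply; the paper's interest is precisely the harder case where multiplicities exceed one.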
Tensor decompositions for learning latent variable models
This work considers a computationally and statistically efficient parameter
estimation method for a wide class of latent variable models---including
Gaussian mixture models, hidden Markov models, and latent Dirichlet
allocation---which exploits a certain tensor structure in their low-order
observable moments (typically, of second- and third-order). Specifically,
parameter estimation is reduced to the problem of extracting a certain
(orthogonal) decomposition of a symmetric tensor derived from the moments; this
decomposition can be viewed as a natural generalization of the singular value
decomposition for matrices. Although tensor decompositions are generally
intractable to compute, the decomposition of these specially structured tensors
can be efficiently obtained by a variety of approaches, including power
iterations and maximization approaches (similar to the case of matrices). A
detailed analysis of a robust tensor power method is provided, establishing an
analogue of Wedin's perturbation theorem for the singular vectors of matrices.
This implies a robust and computationally tractable estimation approach for
several popular latent variable models.
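A minimal sketch of the tensor power iteration the abstract analyzes, assuming NumPy; the deflation step and the random-restart robustness machinery are omitted, and the test tensor is our own toy example rather than one built from model moments.

```python
# Hypothetical sketch: tensor power method for a symmetric 3rd-order tensor,
# the tensor analogue of the matrix power iteration.
import numpy as np

def tensor_power_iteration(T, n_iter=100, seed=0):
    """Extract one (eigenvector, eigenvalue) pair of a symmetric tensor T.

    Iterates u <- T(I, u, u) / ||T(I, u, u)||; for an orthogonally
    decomposable T this converges to one of the decomposition's vectors.
    """
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(T.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(n_iter):
        u = np.einsum("ijk,j,k->i", T, u, u)  # multilinear map T(I, u, u)
        u /= np.linalg.norm(u)
    lam = np.einsum("ijk,i,j,k->", T, u, u, u)  # eigenvalue T(u, u, u)
    return u, lam

# Toy orthogonal decomposition T = 3*e1^{(3)} + 1*e2^{(3)}; from a generic
# start the iteration should recover e1 with eigenvalue 3.
e1, e2 = np.eye(2)
T = 3 * np.einsum("i,j,k->ijk", e1, e1, e1) + np.einsum("i,j,k->ijk", e2, e2, e2)
u, lam = tensor_power_iteration(T)
```

In the latent-variable setting, T would be built from empirical second- and third-order moments, and each recovered (u, lam) pair yields one component's parameters after deflation.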