A Kernel Test for Three-Variable Interactions
We introduce kernel nonparametric tests for Lancaster three-variable
interaction and for total independence, using embeddings of signed measures
into a reproducing kernel Hilbert space. The resulting test statistics are
straightforward to compute, and are used in powerful interaction tests, which
are consistent against all alternatives for a large family of reproducing
kernels. We show the Lancaster test to be sensitive to cases where two
independent causes individually have weak influence on a third dependent
variable, but their combined effect has a strong influence. This makes the
Lancaster test especially suited to finding structure in directed graphical
models, where it outperforms competing nonparametric tests in detecting such
V-structures.
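A minimal sketch of how such a statistic can be computed in practice, assuming Gaussian RBF kernels and a simple permutation null; the names (gram_rbf, lancaster_stat) and the choice to permute only the third variable are our illustrative stand-ins, not the authors' exact procedure:

```python
import numpy as np

def gram_rbf(x, sigma):
    # Gaussian (RBF) kernel Gram matrix for an (n, d) sample;
    # in practice sigma is often set by the median heuristic.
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def center(K):
    # Double-centering HKH with H = I - (1/n) 11^T.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def lancaster_stat(X, Y, Z, sigma=1.0):
    # One version of the empirical Lancaster interaction statistic:
    # the mean of the elementwise product of the three doubly
    # centered Gram matrices.
    K = center(gram_rbf(X, sigma))
    L = center(gram_rbf(Y, sigma))
    M = center(gram_rbf(Z, sigma))
    return (K * L * M).mean()

def permutation_test(X, Y, Z, n_perm=500, seed=None):
    # Crude null distribution: permuting Z breaks its dependence
    # on (X, Y) while keeping all marginals intact.
    rng = np.random.default_rng(seed)
    stat = lancaster_stat(X, Y, Z)
    null = np.array([lancaster_stat(X, Y, Z[rng.permutation(len(Z))])
                     for _ in range(n_perm)])
    return stat, (null >= stat).mean()
```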
Spectra of large time-lagged correlation matrices from Random Matrix Theory
We analyze the spectral properties of large, time-lagged correlation matrices
using the tools of random matrix theory. We compare predictions of the
one-dimensional spectra, based on approaches already proposed in the
literature. Employing the methods of free random variables and diagrammatic
techniques, we solve a general random matrix problem, namely the spectrum of a
matrix $\frac{1}{T}XAX^{\dagger}$, where $X$ is an $N\times T$ Gaussian random
matrix and $A$ is \textit{any} $T\times T$, not necessarily symmetric
(Hermitian) matrix. As a particular application, we present the spectral
features of the large lagged correlation matrices as a function of the depth of
the time-lag. We also analyze the properties of left and right eigenvector
correlations for the time-lagged matrices. We verify our results by
numerical simulations.
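As a quick numerical illustration of the object under study, the sketch below (our own, not the paper's code) takes the simplest non-symmetric choice of $A$, a pure lag-$\tau$ shift matrix, samples $\frac{1}{T}XAX^{T}$ for real Gaussian data (so the adjoint is a transpose), and extracts its generically complex spectrum:

```python
import numpy as np

# Dimensions and lag are arbitrary illustrative choices.
N, T, tau = 200, 1000, 10
rng = np.random.default_rng(0)

X = rng.standard_normal((N, T))
A = np.eye(T, k=tau)          # shift matrix: A[i, i + tau] = 1

C_tau = X @ A @ X.T / T       # empirical time-lagged correlation matrix
eigvals = np.linalg.eigvals(C_tau)   # complex in general, since C_tau
                                     # is non-symmetric
print(eigvals[:5])
```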
Binary hidden Markov models and varieties
The technological applications of hidden Markov models have been extremely
diverse and successful, including natural language processing, gesture
recognition, gene sequencing, and Kalman filtering of physical measurements.
HMMs are highly non-linear statistical models, and just as linear models are
amenable to linear algebraic techniques, non-linear models are amenable to
commutative algebra and algebraic geometry.
This paper closely examines HMMs in which all the hidden random variables are
binary. Its main contributions are (1) a birational parametrization for every
such HMM, with an explicit inverse for recovering the hidden parameters in
terms of observables, (2) a semialgebraic model membership test for every such
HMM, and (3) minimal defining equations for the 4-node fully binary model,
comprising 21 quadrics and 29 cubics, which were computed using Gröbner bases
in the cumulant coordinates of Sturmfels and Zwiernik. The new model parameters
in (1) are rationally identifiable in the sense of Sullivant, Garcia-Puente,
and Spielvogel, and each model's Zariski closure is therefore a rational
projective variety of dimension 5. Gröbner basis computations for the model and
its graph are found to be considerably faster using these parameters. In the
case of two hidden states, item (2) supersedes a previous algorithm of
Schönhuth which is only generically defined, and the defining equations (3)
yield new invariants for HMMs of all lengths $\geq 4$. Such invariants have
been used successfully in model selection problems in phylogenetics, and one
can hope for similar applications in the case of HMMs.
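For concreteness, the observable object that such invariants constrain is the joint distribution of the visible nodes. A small sketch, assuming the standard forward recursion (the names pi, T, E, hmm_joint are ours), computes it for a fully binary HMM:

```python
import numpy as np
from itertools import product

def hmm_joint(pi, T, E, n):
    # Joint distribution of n binary observations from a binary HMM.
    # pi: initial distribution over {0, 1}; T: 2x2 transition matrix;
    # E: 2x2 emission matrix (rows of T and E sum to 1).
    p = {}
    for obs in product((0, 1), repeat=n):
        # forward recursion: alpha[h] = P(o_1..o_t, h_t = h)
        alpha = pi * E[:, obs[0]]
        for o in obs[1:]:
            alpha = (alpha @ T) * E[:, o]
        p[obs] = alpha.sum()
    return p

pi = np.array([0.6, 0.4])
T = np.array([[0.7, 0.3], [0.2, 0.8]])
E = np.array([[0.9, 0.1], [0.4, 0.6]])
joint = hmm_joint(pi, T, E, 4)   # the 16 observable probabilities for n = 4
assert abs(sum(joint.values()) - 1) < 1e-12
```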
Appell polynomials and their relatives II. Boolean theory
The Appell-type polynomial family corresponding to the simplest
non-commutative derivative operator turns out to be connected with the Boolean
probability theory, the simplest of the three universal non-commutative
probability theories (the other two being free and tensor/classical
probability). The basic properties of the Boolean Appell polynomials are
described. In particular, their generating function turns out to have a
resolvent-type form, just like the generating function for the free Sheffer
polynomials. It follows that the Meixner (that is, Sheffer plus orthogonal)
polynomial classes, in the Boolean and free theory, coincide. This is true even
in the multivariate case. A number of applications of this fact are described,
to the Belinschi-Nica and Bercovici-Pata maps, conditional freeness, and the
Laha-Lukacs type characterization.
A number of properties which hold for the Meixner class in the free and
classical cases turn out to hold in general in the Boolean theory. Examples
include the behavior of the Jacobi coefficients under convolution, the
relationship between the Jacobi coefficients and cumulants, and an operator
model for cumulants. Along the way, we obtain a multivariate version of the
Stieltjes continued fraction expansion for the moment generating function of an
arbitrary state with monic orthogonal polynomials.
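In the single-variable case, the resolvent-type form of the generating function amounts to the standard Boolean moment-cumulant relation $M = \eta + \eta M$, i.e. $m_n = \sum_{k=1}^{n} b_k m_{n-k}$ with $m_0 = 1$. A small sketch (names ours) inverts this recursion to recover Boolean cumulants from moments:

```python
from fractions import Fraction

def boolean_cumulants(m):
    # m[n] is the n-th moment, with m[0] = 1.  Boolean cumulants b_n
    # satisfy m_n = sum_{k=1}^{n} b_k m_{n-k}, so we solve for b_n
    # recursively.
    b = [Fraction(0)] * len(m)
    for n in range(1, len(m)):
        b[n] = m[n] - sum(b[k] * m[n - k] for k in range(1, n))
    return b

# Example: a Bernoulli(1/2) variable on {0, 1} has all moments 1/2;
# its Boolean cumulants come out as b_n = 2^{-n}.
m = [Fraction(1)] + [Fraction(1, 2)] * 5
print(boolean_cumulants(m)[1:])
```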
Factor Analysis of Data Matrices: New Theoretical and Computational Aspects With Applications
The classical fitting problem in exploratory factor analysis (EFA) is to find estimates for the factor loadings matrix and the matrix of unique factor variances which give the best fit to the sample covariance or correlation matrix with respect to some goodness-of-fit criterion. Predicted factor scores can be obtained as a function of these estimates and the data. In this thesis, the EFA model is considered as a specific data matrix decomposition with fixed unknown matrix parameters. Fitting the EFA model directly to the data yields simultaneous solutions for both loadings and factor scores. Several new algorithms are introduced for the least squares and weighted least squares estimation of all EFA model unknowns. The numerical procedures are based on the singular value decomposition, facilitate the estimation of both common and unique factor scores, and work equally well when the number of variables exceeds the number of available observations.
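As a simplified illustration of the SVD-based flavour of such procedures (a generic least-squares sketch, not the thesis' algorithms: it fits only the common-factor part and leaves the residual to the unique factors), one can obtain loadings and scores simultaneously from a truncated SVD of the centered data:

```python
import numpy as np

def ls_efa(X, k):
    # Rank-k truncated SVD of column-centered data X minimizes
    # ||X - F @ L.T||_F (Eckart-Young), giving common factor scores F
    # (unit-variance columns) and loadings L in one pass.  Works
    # equally well when variables outnumber observations.
    n = X.shape[0]
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    F = np.sqrt(n) * U[:, :k]              # common factor scores
    L = Vt[:k].T * (s[:k] / np.sqrt(n))    # factor loadings
    return F, L

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 8))
X -= X.mean(axis=0)
F, L = ls_efa(X, 3)
print(np.linalg.norm(X - F @ L.T))  # residual absorbed by unique factors
```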
Like EFA, noisy independent component analysis (ICA) is a technique for reduction of the data dimensionality in which the interrelationships among the observed variables are explained in terms of a much smaller number of latent factors. The key difference between EFA and noisy ICA is that in the latter model the common factors are assumed to be both independent and non-normal. In contrast to EFA, there is no rotational indeterminacy in noisy ICA. In this thesis, noisy ICA is viewed as a method of factor rotation in EFA. Starting from an initial EFA solution, an orthogonal rotation matrix is sought that minimizes the dependence between the common factors. The idea of rotating the scores towards independence is also employed in three-mode factor analysis to analyze data sets having a three-way structure.
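The rotation-towards-independence idea can be caricatured in two dimensions: scan planar rotations of the initial orthogonal scores and keep the one maximizing a non-normality proxy. The kurtosis criterion and grid search below are our illustrative stand-ins for the dependence measure actually minimized in the thesis:

```python
import numpy as np

def rotate_to_independence(F, n_grid=360):
    # Toy two-factor version: among planar rotations of the initial
    # EFA scores F (shape (n, 2)), pick the angle maximizing the sum
    # of absolute excess kurtoses of the rotated, standardized
    # factors, a crude proxy for independence of non-normal factors.
    best, best_theta = -np.inf, 0.0
    for theta in np.linspace(0, np.pi / 2, n_grid):
        c, s = np.cos(theta), np.sin(theta)
        G = F @ np.array([[c, -s], [s, c]])
        G = (G - G.mean(0)) / G.std(0)
        kurt = np.abs((G ** 4).mean(0) - 3).sum()
        if kurt > best:
            best, best_theta = kurt, theta
    c, s = np.cos(best_theta), np.sin(best_theta)
    return F @ np.array([[c, -s], [s, c]])
```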
The new theoretical and computational aspects contained in this thesis are illustrated by means of several examples with real and artificial data.
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2: Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and
their operations presented in Part 1. It focuses on tensor network models for
super-compressed higher-order representation of data/parameters and related
cost functions, while providing an outline of their applications in machine
learning and data analytics. A particular emphasis is on the tensor train (TT)
and Hierarchical Tucker (HT) decompositions, and their physically meaningful
interpretations which reflect the scalability of the tensor network approach.
Through a graphical approach, we also elucidate how, by virtue of the
underlying low-rank tensor approximations and sophisticated contractions of
core tensors, tensor networks have the ability to perform distributed
computations on otherwise prohibitively large volumes of data/parameters,
thereby alleviating or even eliminating the curse of dimensionality. The
usefulness of this concept is illustrated over a number of applied areas,
including generalized regression and classification (support tensor machines,
canonical correlation analysis, higher order partial least squares),
generalized eigenvalue decomposition, Riemannian optimization, and in the
optimization of deep neural networks. Part 1 and Part 2 of this work can be
used either as stand-alone separate texts, or indeed as a conjoint
comprehensive review of the exciting field of low-rank tensor networks and
tensor decompositions.
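A compact sketch of the tensor-train format the monograph builds on: the classical TT-SVD recursion (our simplified implementation, with an assumed relative truncation threshold eps) factorizes a d-way array into three-way cores, which can be contracted back to verify the fit:

```python
import numpy as np

def tt_svd(tensor, eps=1e-10):
    # Sequential truncated SVDs produce TT cores G_k of shape
    # (r_{k-1}, n_k, r_k), with boundary ranks r_0 = r_d = 1.
    shape = tensor.shape
    cores, r, M = [], 1, tensor
    for n in shape[:-1]:
        M = M.reshape(r * n, -1)
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        rank = max(1, int((s > eps * s[0]).sum()))
        cores.append(U[:, :rank].reshape(r, n, rank))
        M = s[:rank, None] * Vt[:rank]
        r = rank
    cores.append(M.reshape(r, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    # Contract the train back into a full tensor.
    out = cores[0]
    for G in cores[1:]:
        out = np.tensordot(out, G, axes=(-1, 0))
    return np.squeeze(out, axis=(0, -1))

X = np.random.default_rng(2).standard_normal((4, 5, 6))
cores = tt_svd(X)
print([G.shape for G in cores])               # e.g. (1,4,4), (4,5,6), (6,6,1)
print(np.allclose(tt_reconstruct(cores), X))  # True (no truncation here)
```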