Estimation of a Covariance Matrix with Zeros
We consider estimation of the covariance matrix of a multivariate random
vector under the constraint that certain covariances are zero. We first present
an algorithm, which we call Iterative Conditional Fitting, for computing the
maximum likelihood estimator of the constrained covariance matrix, under the
assumption of multivariate normality. In contrast to previous approaches, this
algorithm has guaranteed convergence properties. Dropping the assumption of
multivariate normality, we show how to estimate the covariance matrix in an
empirical likelihood approach. These approaches are then compared via
simulation and on a gene expression example.
Comment: 25 pages
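The constrained estimation problem above can be illustrated with a minimal sketch. This is not the paper's Iterative Conditional Fitting algorithm (which has guaranteed convergence); it is a naive alternating-projection heuristic that alternates between enforcing the prescribed zeros and projecting back onto the positive semidefinite cone, with all names and parameters chosen for the demo.

```python
import numpy as np

def constrained_cov_estimate(X, zero_mask, n_iter=200, eps=1e-8):
    """Naive estimate of a covariance matrix with prescribed zeros.

    zero_mask[i, j] == True marks entries constrained to zero.
    NOT the paper's Iterative Conditional Fitting: just an
    alternating-projection sketch between the zero-pattern set
    and the positive semidefinite cone.
    """
    Sigma = np.cov(X, rowvar=False)       # start from the sample covariance
    for _ in range(n_iter):
        Sigma[zero_mask] = 0.0            # enforce the zero pattern
        # project onto the PSD cone by clipping eigenvalues from below
        vals, vecs = np.linalg.eigh(Sigma)
        Sigma = (vecs * np.clip(vals, eps, None)) @ vecs.T
        Sigma = (Sigma + Sigma.T) / 2     # symmetrize against round-off
    Sigma[zero_mask] = 0.0
    return Sigma
```

Both constraint sets are convex and their intersection is nonempty (it contains diagonal matrices), so the alternating projections settle down in practice, but unlike ICF this sketch carries no likelihood interpretation.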
Herding as a Learning System with Edge-of-Chaos Dynamics
Herding defines a deterministic dynamical system at the edge of chaos. It
generates a sequence of model states and parameters by alternating parameter
perturbations with state maximizations, where the sequence of states can be
interpreted as "samples" from an associated MRF model. Herding differs from
maximum likelihood estimation in that the sequence of parameters does not
converge to a fixed point and differs from an MCMC posterior sampling approach
in that the sequence of states is generated deterministically. Herding may be
interpreted as a "perturb and map" method where the parameter perturbations are
generated using a deterministic nonlinear dynamical system rather than randomly
from a Gumbel distribution. This chapter studies the distinct statistical
characteristics of the herding algorithm and shows that the fast convergence
rate of the controlled moments may be attributed to edge of chaos dynamics. The
herding algorithm can also be generalized to models with latent variables and
to a discriminative learning setting. The perceptron cycling theorem ensures
that the fast moment matching property is preserved in the more general
framework.
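The alternation of parameter perturbations with state maximizations can be sketched in a few lines. This toy version uses the identity feature map on binary state vectors and an arbitrary target moment vector, both assumptions made for the demo; in the chapter the features come from an MRF model.

```python
import itertools
import numpy as np

def herding(mu, states, n_steps=1000):
    """Herding sketch: deterministically generate states whose running
    feature averages match the target moments `mu`.

    Toy setup: the feature map is the identity on binary state vectors.
    """
    w = mu.copy()                 # weights initialized at the target moments
    samples = []
    for _ in range(n_steps):
        # state maximization: pick the state most aligned with w
        s = max(states, key=lambda s: float(w @ s))
        # parameter perturbation: move w toward mu, away from the chosen state
        w = w + mu - s
        samples.append(s)
    return np.mean(samples, axis=0)

states = [np.array(s, dtype=float) for s in itertools.product([0, 1], repeat=3)]
mu = np.array([0.2, 0.5, 0.7])    # target moments (assumed, for the demo)
avg = herding(mu, states)         # running averages approach mu at rate O(1/T)
```

Telescoping the update gives avg = mu + (w_0 - w_T)/T, so as long as the weights stay bounded (the perceptron cycling theorem), the moment error shrinks at the fast O(1/T) rate rather than the O(1/sqrt(T)) rate of random sampling.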
High Dimensional Classification with Combined Adaptive Sparse PLS and Logistic Regression
Motivation: The high dimensionality of genomic data calls for the development
of specific classification methodologies, especially to prevent over-optimistic
predictions. This challenge can be tackled by compression and variable
selection, which combined constitute a powerful framework for classification,
as well as data visualization and interpretation. However, currently proposed
combinations lead to unstable and non-convergent methods due to inappropriate
computational frameworks. We hereby propose a stable and convergent approach
for classification in high dimension based on sparse Partial Least Squares
(sparse PLS). Results: We start by proposing a new solution for the sparse PLS
problem that is based on proximal operators for the case of univariate
responses. Then we develop an adaptive version of the sparse PLS for
classification, which combines iterative optimization of logistic regression
and sparse PLS to ensure convergence and stability. Our results are confirmed
on synthetic and experimental data. In particular we show how crucial
convergence and stability can be when cross-validation is involved for
calibration purposes. Using gene expression data we explore the prediction of
breast cancer relapse. We also propose a multicategorical version of our method
for the prediction of cell types based on single-cell expression data.
Availability: Our approach is implemented in the plsgenomics R-package.
Comment: 9 pages, 3 figures, 4 tables + Supplementary Materials: 8 pages, 3
figures, 10 tables
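The proximal-operator view of sparse PLS for a univariate response can be sketched as follows. This is a minimal first-component illustration only: the soft-thresholding step is the proximal operator of the l1 penalty, while the adaptive, logistic-regression-coupled iteration of the plsgenomics package is not shown, and the data and penalty level are assumptions for the demo.

```python
import numpy as np

def soft_threshold(v, lam):
    """Proximal operator of the l1 norm (componentwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def sparse_pls_component(X, y, lam):
    """First sparse PLS component for a univariate response:
    the usual PLS weight X^T y, shrunk by soft-thresholding and
    normalized, then used to form the latent score t = X w.
    """
    Xc = X - X.mean(axis=0)           # center predictors
    yc = y - y.mean()                 # center response
    w = soft_threshold(Xc.T @ yc, lam)
    norm = np.linalg.norm(w)
    if norm > 0:
        w = w / norm                  # unit-norm sparse weight vector
    t = Xc @ w                        # latent component (score vector)
    return w, t
```

With an appropriate penalty level, only the predictors genuinely correlated with the response keep nonzero weights, which is what makes the combination of compression and variable selection useful for visualization and interpretation.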
A new method for the estimation of variance matrix with prescribed zeros in nonlinear mixed effects models
We propose a new method for the Maximum Likelihood Estimator (MLE) of
nonlinear mixed effects models when the variance matrix of Gaussian random
effects has a prescribed pattern of zeros (PPZ). The method consists in
coupling the recently developed Iterative Conditional Fitting (ICF) algorithm
with the Expectation Maximization (EM) algorithm. It provides positive definite
estimates for any sample size, and does not rely on any structural assumption
on the PPZ. It can be easily adapted to many versions of the EM algorithm.
Comment: Accepted for publication in Statistics and Computing
Polytope of Correct (Linear Programming) Decoding and Low-Weight Pseudo-Codewords
We analyze Linear Programming (LP) decoding of graphical binary codes
operating over soft-output, symmetric and log-concave channels. We show that
the error-surface, separating domain of the correct decoding from domain of the
erroneous decoding, is a polytope. We formulate the problem of finding the
lowest-weight pseudo-codeword as a non-convex optimization (maximization of a
convex function) over a polytope, with the cost function defined by the channel
and the polytope defined by the structure of the code. This formulation
suggests new provably convergent heuristics for finding the lowest-weight
pseudo-codewords that improve in quality upon those previously discussed. The
algorithm's performance is tested on the example of the Tanner [155, 64, 20]
code over the Additive White Gaussian Noise (AWGN) channel.
Comment: 6 pages, 2 figures, accepted for IEEE ISIT 201
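The polytope structure of LP decoding can be made concrete with a small sketch in the Feldman formulation: minimize the channel cost over the relaxation cut out by one inequality per odd-size subset of each check's neighborhood. The (7,4) Hamming code and the cost vector below are illustrative assumptions standing in for the Tanner [155, 64, 20] code and AWGN costs of the paper.

```python
from itertools import combinations

import numpy as np
from scipy.optimize import linprog

def lp_decode(H, gamma):
    """Feldman-style LP decoding sketch: minimize gamma @ x over the
    fundamental polytope of the parity-check matrix H.  Each check with
    neighborhood N contributes, for every odd-size subset S of N, the
    cut inequality  sum_{i in S} x_i - sum_{i in N\\S} x_i <= |S| - 1.
    """
    n = H.shape[1]
    A, b = [], []
    for row in H:
        nbrs = np.flatnonzero(row)
        for size in range(1, len(nbrs) + 1, 2):      # odd subset sizes
            for S in combinations(nbrs, size):
                a = np.zeros(n)
                a[list(S)] = 1.0
                a[[i for i in nbrs if i not in S]] = -1.0
                A.append(a)
                b.append(len(S) - 1)
    res = linprog(gamma, A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(0, 1)] * n, method="highs")
    return res.x

# (7,4) Hamming parity-check matrix, a small stand-in for the Tanner code
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])
# Channel costs for a noisy all-zeros codeword: gamma_i > 0 favors x_i = 0
gamma = np.array([1.2, 0.8, 1.1, 0.4, 0.9, 1.3, 0.7])
x_hat = lp_decode(H, gamma)
```

When the LP optimum is integral, it coincides with the ML codeword; fractional optima are exactly the pseudo-codewords whose lowest-weight instances the paper's non-convex maximization over this polytope seeks out.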
Accuracy of MAP segmentation with hidden Potts and Markov mesh prior models via Path Constrained Viterbi Training, Iterated Conditional Modes and Graph Cut based algorithms
In this paper, we study statistical classification accuracy of two different
Markov field environments for pixelwise image segmentation, considering the
labels of the image as hidden states and solving the estimation of such labels
as a solution of the MAP equation. The emission distribution is assumed to be
the same in all models, and the difference lies in the Markovian prior
hypothesis made on the labeling random field. The a priori labeling knowledge
will be
modeled with a) a second order anisotropic Markov Mesh and b) a classical
isotropic Potts model. Under such models, we will consider three different
segmentation procedures, 2D Path Constrained Viterbi training for the Hidden
Markov Mesh, a Graph Cut based segmentation for the first order isotropic Potts
model, and ICM (Iterated Conditional Modes) for the second order isotropic
Potts model.
We provide a unified view of all three methods, and investigate goodness of
fit for classification, studying the influence of parameter estimation,
computational gain, and extent of automation in the statistical measures
Overall Accuracy, Relative Improvement, and Kappa coefficient, allowing robust
and accurate statistical analysis on synthetic and real-life experimental data
coming from the field of Dental Diagnostic Radiography. All algorithms, using
the learned parameters, generate good segmentations with little interaction
when the images have a clear multimodal histogram. Suboptimal learning proves
to be fragile when the modes are not distinctive, which limits the complexity
of the usable models, and hence the achievable error rate as well.
All the Matlab code written is provided in a toolbox available for download
from our website, following the Reproducible Research paradigm.
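Of the three procedures, ICM is the simplest to sketch. The version below assumes a first-order 4-neighborhood, unit-variance Gaussian emissions, and a synthetic scalar image; the paper additionally studies a second-order Potts model, a Markov mesh prior, 2D Path Constrained Viterbi training, and graph-cut optimization, none of which appear here.

```python
import numpy as np

def icm_potts(obs, means, beta=1.0, n_sweeps=5):
    """Iterated Conditional Modes for MAP segmentation under a
    first-order isotropic Potts prior with Gaussian emissions.

    Minimal sketch: 4-neighborhood, unit-variance Gaussians.
    """
    H, W = obs.shape
    K = len(means)
    # initialize each pixel with its maximum-likelihood label
    labels = np.argmin((obs[..., None] - np.array(means)) ** 2, axis=-1)
    for _ in range(n_sweeps):
        for i in range(H):
            for j in range(W):
                best_k, best_cost = labels[i, j], np.inf
                for k in range(K):
                    # unary term: Gaussian negative log-likelihood
                    cost = 0.5 * (obs[i, j] - means[k]) ** 2
                    # pairwise Potts term: penalize neighbor disagreement
                    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                        ni, nj = i + di, j + dj
                        if 0 <= ni < H and 0 <= nj < W:
                            cost += beta * (labels[ni, nj] != k)
                    if cost < best_cost:
                        best_k, best_cost = k, cost
                labels[i, j] = best_k
    return labels
```

Each sweep greedily re-labels pixels one at a time given their neighbors, so ICM converges to a local optimum of the MAP objective; this is exactly why its quality degrades when the emission modes are not distinctive.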