76,555 research outputs found
Demonstration of Enhanced Monte Carlo Computation of the Fisher Information for Complex Problems
The Fisher information matrix summarizes the amount of information in a set
of data relative to the quantities of interest. There are many applications of
the information matrix in statistical modeling, system identification and
parameter estimation. This short paper reviews a feedback-based method and an
independent perturbation approach for computing the information matrix for
complex problems, where a closed form of the information matrix is not
achievable. We show through numerical examples how these methods improve the
accuracy of the estimate of the information matrix compared to the basic
resampling-based approach. Some relevant theory is summarized.
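To make the baseline concrete, below is a minimal sketch of the basic
resampling-based Monte Carlo estimate that the reviewed methods improve upon:
average the outer product of the score vector over independently simulated
datasets. The Gaussian model, the score/mc_fisher helpers, and all parameters
are illustrative choices of ours, not the paper's setup.

```python
import numpy as np

# Basic resampling-based Monte Carlo estimate of the Fisher information:
# F(theta) = E[s s^T], with s the score vector, averaged over simulated data.
# Illustrative model (an assumption, not from the paper): i.i.d. N(mu, sigma^2).

def score(theta, x):
    """Gradient of the Gaussian log-likelihood w.r.t. (mu, sigma^2)."""
    mu, sigma2 = theta
    d_mu = np.sum(x - mu) / sigma2
    d_s2 = -len(x) / (2 * sigma2) + np.sum((x - mu) ** 2) / (2 * sigma2 ** 2)
    return np.array([d_mu, d_s2])

def mc_fisher(theta, n_obs, n_rep, rng):
    """Average the outer product of the score over n_rep simulated datasets."""
    mu, sigma2 = theta
    F = np.zeros((2, 2))
    for _ in range(n_rep):
        x = rng.normal(mu, np.sqrt(sigma2), size=n_obs)
        s = score(theta, x)
        F += np.outer(s, s)
    return F / n_rep

rng = np.random.default_rng(0)
theta, n = (1.0, 2.0), 50
print(mc_fisher(theta, n_obs=n, n_rep=20000, rng=rng))
# Closed form for this model, for checking: diag(n/sigma^2, n/(2 sigma^4)).
print(np.diag([n / theta[1], n / (2 * theta[1] ** 2)]))
```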
Fractional norms and quasinorms do not help to overcome the curse of dimensionality
The curse of dimensionality causes well-known and widely discussed problems
for machine learning methods. There is a hypothesis that using the Manhattan
distance and even fractional quasinorms lp (for p less than 1) can help to
overcome the curse of dimensionality in classification problems. In this
study, we systematically test this hypothesis. We confirm that fractional
quasinorms have a greater relative contrast (coefficient of variation) than
the Euclidean norm l2, but we also demonstrate that distance concentration
shows qualitatively the same behaviour for all tested norms and quasinorms,
and that the difference between them decays as the dimension tends to
infinity. Estimation of classification quality for kNN based on different
norms and quasinorms shows that a greater relative contrast does not imply
better classifier performance, and the worst performance on different
databases was produced by different norms (quasinorms). A systematic
comparison shows that the difference in performance of kNN based on lp for
p = 2, 1, and 0.5 is statistically insignificant.
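Both effects are easy to reproduce: the relative contrast
(D_max - D_min) / D_min is larger for smaller p at a fixed dimension, yet it
decays for every p as the dimension grows. The uniform data model and all
parameters below are illustrative, not the study's actual benchmark.

```python
import numpy as np

# Relative contrast of l_p (quasi)norm distances on uniform random data.
def lp_dist(X, q, p):
    """l_p distance (quasinorm for p < 1) from each row of X to query q."""
    return np.sum(np.abs(X - q) ** p, axis=1) ** (1.0 / p)

rng = np.random.default_rng(1)
n_points, n_queries = 1000, 20
for d in (2, 10, 100, 1000):
    X = rng.random((n_points, d))
    for p in (0.5, 1.0, 2.0):
        rc = []
        for q in rng.random((n_queries, d)):
            dist = lp_dist(X, q, p)
            rc.append((dist.max() - dist.min()) / dist.min())
        print(f"d={d:4d} p={p:3.1f} mean relative contrast={np.mean(rc):.3f}")
```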
Sample Complexity of Dictionary Learning and other Matrix Factorizations
Many modern tools in machine learning and signal processing, such as sparse
dictionary learning, principal component analysis (PCA), non-negative matrix
factorization (NMF), k-means clustering, etc., rely on the factorization of a
matrix obtained by concatenating high-dimensional vectors from a training
collection. While the idealized task would be to optimize the expected quality
of the factors over the underlying distribution of training vectors, it is
achieved in practice by minimizing an empirical average over the considered
collection. The focus of this paper is to provide sample complexity estimates
to uniformly control how much the empirical average deviates from the expected
cost function. Standard arguments imply that the performance of the empirical
predictor also exhibits such guarantees. The level of genericity of the approach
encompasses several possible constraints on the factors (tensor product
structure, shift-invariance, sparsity, ...), thus providing a unified
perspective on the sample complexity of several widely used matrix
factorization schemes. The derived generalization bounds behave proportionally
to sqrt(log(n)/n) w.r.t. the number of samples n for the considered matrix
factorization techniques.
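The deviation that these bounds control can be watched directly on a toy
factorization such as rank-k PCA: minimize the empirical reconstruction cost
on n training vectors, then compare it with the cost on a large held-out
sample standing in for the expectation. The data model and parameters are our
own illustration, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k = 20, 3
W_true = rng.normal(size=(d, k))  # ground-truth factor (toy model)

def sample(n):
    """Draw n vectors from a rank-k signal plus isotropic noise."""
    return rng.normal(size=(n, k)) @ W_true.T + 0.5 * rng.normal(size=(n, d))

def pca_cost(X, U):
    """Average squared residual after projecting rows of X onto span(U)."""
    R = X - X @ U @ U.T
    return np.mean(np.sum(R ** 2, axis=1))

X_test = sample(100_000)  # proxy for the expectation over the distribution
for n in (50, 200, 1000, 5000):
    X = sample(n)
    # Empirical minimizer: top-k right singular vectors of the training data.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    U = Vt[:k].T
    gap = pca_cost(X_test, U) - pca_cost(X, U)
    print(f"n={n:5d} empirical cost={pca_cost(X, U):.3f} deviation={gap:.3f}")
```

The deviation shrinks as n grows, which is the uniform-convergence behaviour
that the sample complexity estimates quantify.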
Iterative Row Sampling
There has been significant interest and progress recently in algorithms that
solve regression problems involving tall and thin matrices in input sparsity
time. These algorithms find a shorter equivalent of an n × d matrix where
n >> d, which allows one to solve a poly(d)-sized problem instead. In
practice, the
best performances are often obtained by invoking these routines in an iterative
fashion. We show that these iterative methods can be adapted to give
theoretical guarantees comparable to, and in some cases better than, the
current state of the art.
Our approaches are based on iteratively computing the importance of each
row, known as its leverage score. We show that alternating between
computing a short matrix estimate and finding more accurate approximate
leverage scores leads to a series of geometrically smaller instances. This
gives an algorithm whose running time is input-sparsity time plus an overhead
comparable to the cost of solving a regression problem on the small
approximation. Our results are built upon the close connection between
randomized matrix algorithms, iterative methods, and graph sparsification.
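The building block that the iteration refines can be sketched as one round of
leverage-score row sampling for least squares: compute the scores from an
orthonormal basis of the column space, sample rows with probability
proportional to them, and solve the reweighted small problem. Exact scores
via QR are used here for clarity, and the sketch size m is a heuristic
choice; this is not the paper's full iterative algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 20_000, 20
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Leverage scores: squared row norms of an orthonormal basis of range(A).
Q, _ = np.linalg.qr(A)
lev = np.sum(Q ** 2, axis=1)            # sums to d
probs = lev / lev.sum()

m = 40 * d                              # sketch size (heuristic)
idx = rng.choice(n, size=m, replace=True, p=probs)
scale = 1.0 / np.sqrt(m * probs[idx])   # reweight so E[SA^T SA] = A^T A
SA, Sb = A[idx] * scale[:, None], b[idx] * scale

x_full = np.linalg.lstsq(A, b, rcond=None)[0]
x_sketch = np.linalg.lstsq(SA, Sb, rcond=None)[0]
print(np.linalg.norm(x_full - x_sketch) / np.linalg.norm(x_full))
```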
A large covariance matrix estimator under intermediate spikiness regimes
The present paper concerns large covariance matrix estimation via composite
minimization under the assumption of low rank plus sparse structure. In this
approach, the low rank plus sparse decomposition of the covariance matrix is
recovered by least squares minimization under nuclear norm plus l1 norm
penalization. This paper proposes a new estimator of that family based on an
additional least-squares re-optimization step aimed at un-shrinking the
eigenvalues of the low rank component estimated at the first step. We prove
that such un-shrinkage causes the final estimate to approach the target as
closely as possible in Frobenius norm while recovering exactly the underlying
low rank and sparsity pattern. Consistency is guaranteed when the sample size
grows at a suitable rate with the dimension, provided that the maximum number
of non-zeros per row in the sparse component grows no faster than a
corresponding power of the dimension. Consistent recovery is ensured if the
latent eigenvalues scale to a power of the dimension between the bounded and
the fully spiked regimes, while rank consistency holds under a compatible
condition relating the sparsity and spikiness rates.
The resulting estimator is called UNALCE (UNshrunk ALgebraic Covariance
Estimator) and is shown to outperform state of the art estimators, especially
for what concerns fitting properties and sparsity pattern detection. The
effectiveness of UNALCE is highlighted on a real example regarding ECB banking
supervisory data.
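A minimal sketch of the un-shrinkage idea, not the UNALCE algorithm itself:
after a first pass that soft-thresholds the eigenvalues (low rank part) and
the off-diagonal entries (sparse part), keep the recovered eigenvectors and
sparsity pattern fixed and refit the low-rank eigenvalues by least squares,
which has a closed form. The thresholds and the data-generating model are
illustrative assumptions.

```python
import numpy as np

def soft(x, t):
    """Entrywise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def low_rank_plus_sparse(S, lam, rho):
    """One shrinkage pass: eigenvalue thresholding for the low rank part,
    entrywise thresholding of the off-diagonal residual for the sparse part."""
    w, V = np.linalg.eigh(S)
    L = (V * np.maximum(w - lam, 0.0)) @ V.T
    Sp = soft(S - L, rho)
    np.fill_diagonal(Sp, np.diag(S - L))  # leave the diagonal unpenalized
    return L, Sp

def unshrink(S, L, Sp):
    """Refit the low rank eigenvalues by least squares, keeping the estimated
    eigenvectors (and hence the rank) fixed: g_i = u_i^T (S - Sp) u_i."""
    w, V = np.linalg.eigh(L)
    U = V[:, w > 1e-10]
    g = np.array([u @ (S - Sp) @ u for u in U.T])
    return (U * g) @ U.T

rng = np.random.default_rng(4)
p, r, n = 40, 3, 500
B = 3.0 * rng.normal(size=(p, r))
Sigma = B @ B.T + np.eye(p)   # low rank plus (here diagonal) sparse target
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = np.cov(X, rowvar=False)

L, Sp = low_rank_plus_sparse(S, lam=2.0, rho=0.2)
L_un = unshrink(S, L, Sp)
print("fit before un-shrinkage:", np.linalg.norm(S - L - Sp))
print("fit after  un-shrinkage:", np.linalg.norm(S - L_un - Sp))
```

The refit can only improve the Frobenius fit, since the shrunk eigenvalues
are one feasible choice among those the least-squares step optimizes over.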