76,555 research outputs found
Demonstration of Enhanced Monte Carlo Computation of the Fisher Information for Complex Problems
The Fisher information matrix summarizes the amount of information in a set
of data relative to the quantities of interest. There are many applications of
the information matrix in statistical modeling, system identification and
parameter estimation. This short paper reviews a feedback-based method and an
independent perturbation approach for computing the information matrix for
complex problems, where a closed form of the information matrix is not
achievable. We show through numerical examples how these methods improve the
accuracy of the estimate of the information matrix compared to the basic
resampling-based approach. Some relevant theory is summarized.
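To make the baseline concrete, below is a minimal sketch of the basic
resampling-based Monte Carlo estimate that the reviewed methods improve upon:
average the outer product of the score vector over independently simulated
datasets. The Gaussian model, the score/mc_fisher helpers, and all parameters
are illustrative choices of ours, not the paper's setup.

```python
import numpy as np

# Basic resampling-based Monte Carlo estimate of the Fisher information:
# F(theta) = E[s s^T], with s the score vector, averaged over simulated data.
# Illustrative model (an assumption, not from the paper): i.i.d. N(mu, sigma^2).

def score(theta, x):
    """Gradient of the Gaussian log-likelihood w.r.t. (mu, sigma^2)."""
    mu, sigma2 = theta
    d_mu = np.sum(x - mu) / sigma2
    d_s2 = -len(x) / (2 * sigma2) + np.sum((x - mu) ** 2) / (2 * sigma2 ** 2)
    return np.array([d_mu, d_s2])

def mc_fisher(theta, n_obs, n_rep, rng):
    """Average the outer product of the score over n_rep simulated datasets."""
    mu, sigma2 = theta
    F = np.zeros((2, 2))
    for _ in range(n_rep):
        x = rng.normal(mu, np.sqrt(sigma2), size=n_obs)
        s = score(theta, x)
        F += np.outer(s, s)
    return F / n_rep

rng = np.random.default_rng(0)
theta, n = (1.0, 2.0), 50
print(mc_fisher(theta, n_obs=n, n_rep=20000, rng=rng))
# Closed form for this model, for checking: diag(n/sigma^2, n/(2 sigma^4)).
print(np.diag([n / theta[1], n / (2 * theta[1] ** 2)]))
```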
Fractional norms and quasinorms do not help to overcome the curse of dimensionality
The curse of dimensionality causes well-known and widely discussed problems
for machine learning methods. There is a hypothesis that using the Manhattan
distance and even fractional quasinorms lp (for p less than 1) can help to
overcome the curse of dimensionality in classification problems. In this
study, we systematically test this hypothesis. We confirm that fractional
quasinorms have a greater relative contrast (coefficient of variation) than
the Euclidean norm l2, but we also demonstrate that distance concentration
shows qualitatively the same behaviour for all tested norms and quasinorms,
and that the difference between them decays as the dimension tends to
infinity. Estimation of classification quality for kNN based on different
norms and quasinorms shows that a greater relative contrast does not imply
better classifier performance, and the worst performance on different
databases was produced by different norms (quasinorms). A systematic
comparison shows that the difference in performance of kNN based on lp for
p = 2, 1, and 0.5 is statistically insignificant.
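Both effects are easy to reproduce: the relative contrast
(D_max - D_min) / D_min is larger for smaller p at a fixed dimension, yet it
decays for every p as the dimension grows. The uniform data model and all
parameters below are illustrative, not the study's actual benchmark.

```python
import numpy as np

# Relative contrast of l_p (quasi)norm distances on uniform random data.
def lp_dist(X, q, p):
    """l_p distance (quasinorm for p < 1) from each row of X to query q."""
    return np.sum(np.abs(X - q) ** p, axis=1) ** (1.0 / p)

rng = np.random.default_rng(1)
n_points, n_queries = 1000, 20
for d in (2, 10, 100, 1000):
    X = rng.random((n_points, d))
    for p in (0.5, 1.0, 2.0):
        rc = []
        for q in rng.random((n_queries, d)):
            dist = lp_dist(X, q, p)
            rc.append((dist.max() - dist.min()) / dist.min())
        print(f"d={d:4d} p={p:3.1f} mean relative contrast={np.mean(rc):.3f}")
```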
Sample Complexity of Dictionary Learning and other Matrix Factorizations
Many modern tools in machine learning and signal processing, such as sparse
dictionary learning, principal component analysis (PCA), non-negative matrix
factorization (NMF), k-means clustering, etc., rely on the factorization of a
matrix obtained by concatenating high-dimensional vectors from a training
collection. While the idealized task would be to optimize the expected quality
of the factors over the underlying distribution of training vectors, it is
achieved in practice by minimizing an empirical average over the considered
collection. The focus of this paper is to provide sample complexity estimates
to uniformly control how much the empirical average deviates from the expected
cost function. Standard arguments imply that the performance of the empirical
predictor also exhibits such guarantees. The level of genericity of the approach
encompasses several possible constraints on the factors (tensor product
structure, shift-invariance, sparsity, ...), thus providing a unified
perspective on the sample complexity of several widely used matrix
factorization schemes. The derived generalization bounds behave proportionally
to sqrt(log(n)/n) w.r.t. the number of samples n for the considered matrix
factorization techniques.
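The deviation that these bounds control can be watched directly on a toy
factorization such as rank-k PCA: minimize the empirical reconstruction cost
on n training vectors, then compare it with the cost on a large held-out
sample standing in for the expectation. The data model and parameters are our
own illustration, not the paper's setting.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k = 20, 3
W_true = rng.normal(size=(d, k))  # ground-truth factor (toy model)

def sample(n):
    """Draw n vectors from a rank-k signal plus isotropic noise."""
    return rng.normal(size=(n, k)) @ W_true.T + 0.5 * rng.normal(size=(n, d))

def pca_cost(X, U):
    """Average squared residual after projecting rows of X onto span(U)."""
    R = X - X @ U @ U.T
    return np.mean(np.sum(R ** 2, axis=1))

X_test = sample(100_000)  # proxy for the expectation over the distribution
for n in (50, 200, 1000, 5000):
    X = sample(n)
    # Empirical minimizer: top-k right singular vectors of the training data.
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    U = Vt[:k].T
    gap = pca_cost(X_test, U) - pca_cost(X, U)
    print(f"n={n:5d} empirical cost={pca_cost(X, U):.3f} deviation={gap:.3f}")
```

The deviation shrinks as n grows, which is the uniform-convergence behaviour
that the sample complexity estimates quantify.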
Iterative Row Sampling
There has been significant interest and progress recently in algorithms that
solve regression problems involving tall and thin matrices in input sparsity
time. These algorithms find a shorter equivalent of an n × d matrix where
n >> d, which allows one to solve a poly(d)-sized problem instead. In
practice, the
best performances are often obtained by invoking these routines in an iterative
fashion. We show that these iterative methods can be adapted to give
theoretical guarantees comparable to, and in some cases better than, the
current state of the art.
Our approaches are based on iteratively computing the importance of each
row, known as its leverage score. We show that alternating between
computing a short matrix estimate and finding more accurate approximate
leverage scores leads to a series of geometrically smaller instances. This
gives an algorithm whose running time is input-sparsity time plus an overhead
comparable to the cost of solving a regression problem on the small
approximation. Our results are built upon the close connection between
randomized matrix algorithms, iterative methods, and graph sparsification.
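The building block that the iteration refines can be sketched as one round of
leverage-score row sampling for least squares: compute the scores from an
orthonormal basis of the column space, sample rows with probability
proportional to them, and solve the reweighted small problem. Exact scores
via QR are used here for clarity, and the sketch size m is a heuristic
choice; this is not the paper's full iterative algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 20_000, 20
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Leverage scores: squared row norms of an orthonormal basis of range(A).
Q, _ = np.linalg.qr(A)
lev = np.sum(Q ** 2, axis=1)            # sums to d
probs = lev / lev.sum()

m = 40 * d                              # sketch size (heuristic)
idx = rng.choice(n, size=m, replace=True, p=probs)
scale = 1.0 / np.sqrt(m * probs[idx])   # reweight so E[SA^T SA] = A^T A
SA, Sb = A[idx] * scale[:, None], b[idx] * scale

x_full = np.linalg.lstsq(A, b, rcond=None)[0]
x_sketch = np.linalg.lstsq(SA, Sb, rcond=None)[0]
print(np.linalg.norm(x_full - x_sketch) / np.linalg.norm(x_full))
```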
A large covariance matrix estimator under intermediate spikiness regimes
The present paper concerns large covariance matrix estimation via composite
minimization under the assumption of low rank plus sparse structure. In this
approach, the low rank plus sparse decomposition of the covariance matrix is
recovered by least squares minimization under nuclear norm plus l1 norm
penalization. This paper proposes a new estimator of that family based on an
additional least-squares re-optimization step aimed at un-shrinking the
eigenvalues of the low rank component estimated at the first step. We prove
that such un-shrinkage causes the final estimate to approach the target as
closely as possible in Frobenius norm while recovering exactly the underlying
low rank and sparsity pattern. Consistency is guaranteed when the sample size
grows at a suitable rate with the dimension, provided that the maximum number
of non-zeros per row in the sparse component grows no faster than a
corresponding power of the dimension. Consistent recovery is ensured if the
latent eigenvalues scale to a power of the dimension between the bounded and
the fully spiked regimes, while rank consistency holds under a compatible
condition relating the sparsity and spikiness rates.
The resulting estimator is called UNALCE (UNshrunk ALgebraic Covariance
Estimator) and is shown to outperform state of the art estimators, especially
for what concerns fitting properties and sparsity pattern detection. The
effectiveness of UNALCE is highlighted on a real example regarding ECB banking
supervisory data.
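A minimal sketch of the un-shrinkage idea, not the UNALCE algorithm itself:
after a first pass that soft-thresholds the eigenvalues (low rank part) and
the off-diagonal entries (sparse part), keep the recovered eigenvectors and
sparsity pattern fixed and refit the low-rank eigenvalues by least squares,
which has a closed form. The thresholds and the data-generating model are
illustrative assumptions.

```python
import numpy as np

def soft(x, t):
    """Entrywise soft-thresholding."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def low_rank_plus_sparse(S, lam, rho):
    """One shrinkage pass: eigenvalue thresholding for the low rank part,
    entrywise thresholding of the off-diagonal residual for the sparse part."""
    w, V = np.linalg.eigh(S)
    L = (V * np.maximum(w - lam, 0.0)) @ V.T
    Sp = soft(S - L, rho)
    np.fill_diagonal(Sp, np.diag(S - L))  # leave the diagonal unpenalized
    return L, Sp

def unshrink(S, L, Sp):
    """Refit the low rank eigenvalues by least squares, keeping the estimated
    eigenvectors (and hence the rank) fixed: g_i = u_i^T (S - Sp) u_i."""
    w, V = np.linalg.eigh(L)
    U = V[:, w > 1e-10]
    g = np.array([u @ (S - Sp) @ u for u in U.T])
    return (U * g) @ U.T

rng = np.random.default_rng(4)
p, r, n = 40, 3, 500
B = 3.0 * rng.normal(size=(p, r))
Sigma = B @ B.T + np.eye(p)   # low rank plus (here diagonal) sparse target
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = np.cov(X, rowvar=False)

L, Sp = low_rank_plus_sparse(S, lam=2.0, rho=0.2)
L_un = unshrink(S, L, Sp)
print("fit before un-shrinkage:", np.linalg.norm(S - L - Sp))
print("fit after  un-shrinkage:", np.linalg.norm(S - L_un - Sp))
```

The refit can only improve the Frobenius fit, since the shrunk eigenvalues
are one feasible choice among those the least-squares step optimizes over.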