Finite sample approximation results for principal component analysis: a matrix perturbation approach
Principal component analysis (PCA) is a standard tool for dimensionality
reduction of a set of n observations (samples), each with p variables. In
this paper, using a matrix perturbation approach, we study the nonasymptotic
relation between the eigenvalues and eigenvectors of PCA computed on a finite
sample of size n, and those of the limiting population PCA as n → ∞.
As in machine learning, we present a finite sample theorem which holds with
high probability for the closeness between the leading eigenvalue and
eigenvector of sample PCA and population PCA under a spiked covariance model.
In addition, we also consider the relation between finite sample PCA and the
asymptotic results in the joint limit p, n → ∞, with p/n = c. We present
a matrix perturbation view of the "phase transition phenomenon," and a simple
linear-algebra based derivation of the eigenvalue and eigenvector overlap in
this asymptotic limit. Moreover, our analysis also applies for finite p, n,
where we show that although there is no sharp phase transition as in the
infinite case, either as a function of noise level or as a function of sample
size n, the eigenvector of sample PCA may exhibit a sharp "loss of tracking,"
suddenly losing its relation to the (true) eigenvector of the population PCA
matrix. This occurs due to a crossover between the eigenvalue due to the signal
and the largest eigenvalue due to noise, whose eigenvector points in a random
direction.
Comment: Published in the Annals of Statistics (http://www.imstat.org/aos/) at
http://dx.doi.org/10.1214/08-AOS618 by the Institute of Mathematical
Statistics (http://www.imstat.org).
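The spiked covariance setting discussed above is easy to simulate. The following sketch (illustrative parameter values of my own choosing, not taken from the paper) draws samples x = √ℓ·g·v + noise, so the population covariance is ℓ·vvᵀ + σ²I, and measures the overlap |⟨v, v̂⟩| between the population eigenvector v and the leading sample eigenvector v̂, which improves as the sample size n grows relative to the dimension p:

```python
import numpy as np

rng = np.random.default_rng(0)

p, sigma2, ell = 200, 1.0, 5.0   # dimension, noise variance, signal strength (illustrative)
v = np.zeros(p)
v[0] = 1.0                       # population leading eigenvector (signal direction)

def leading_overlap(n):
    """Overlap |<v, v_hat>| between population and sample leading eigenvectors."""
    # Spiked model: x = sqrt(ell) * g * v + noise, so Cov(x) = ell * v v^T + sigma2 * I
    X = np.sqrt(ell) * rng.standard_normal((n, 1)) * v[None, :] \
        + np.sqrt(sigma2) * rng.standard_normal((n, p))
    S = X.T @ X / n                        # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)   # eigh returns eigenvalues in ascending order
    return abs(v @ eigvecs[:, -1])         # leading sample eigenvector

overlaps = {n: leading_overlap(n) for n in (20, 200, 2000)}
for n, o in overlaps.items():
    print(f"n={n}: overlap={o:.3f}")
```

At n = 20 the aspect ratio p/n = 10 is large and the overlap is far from 1, while at n = 2000 the sample eigenvector tracks the population one closely, consistent with the finite-sample picture the abstract describes.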
Mathematical Foundations of Machine Learning (hybrid meeting)
Machine learning has achieved
remarkable successes in various applications, but there is wide agreement that a mathematical theory for deep learning is missing. Recently, some first mathematical results have been derived in different areas such as mathematical statistics and statistical learning. Any mathematical theory of machine learning will have to combine tools from different fields such as nonparametric statistics, high-dimensional statistics, empirical process theory and approximation theory. The main objective of the workshop was to bring together leading researchers contributing to the mathematics of machine learning.
A focus of the workshop was on theory for deep neural networks. Mathematically speaking, neural networks define function classes with a rich mathematical structure that are extremely difficult to analyze because of non-linearity in the parameters. Until very recently, most existing theoretical results could not cope with many of the distinctive characteristics of deep networks such as multiple hidden layers or the ReLU activation function. Other topics of the workshop were procedures for quantifying the uncertainty of machine learning methods and the mathematics of data privacy.
Sparse machine learning methods with applications in multivariate signal processing
This thesis details theoretical and empirical work that draws from two main subject areas: Machine
Learning (ML) and Digital Signal Processing (DSP). A unified general framework is given for the application
of sparse machine learning methods to multivariate signal processing. In particular, methods that
enforce sparsity will be employed for reasons of computational efficiency, regularisation, and compressibility.
The methods presented can be seen as modular building blocks that can be applied to a variety
of applications. Application specific prior knowledge can be used in various ways, resulting in a flexible
and powerful set of tools. The motivation for the methods is to be able to learn and generalise from a set
of multivariate signals.
In addition to testing on benchmark datasets, a series of empirical evaluations on real world
datasets were carried out. These included: the classification of musical genre from polyphonic audio
files; a study of how the sampling rate in a digital radar can be reduced through the use of Compressed
Sensing (CS); analysis of human perception of different modulations of musical key from
Electroencephalography (EEG) recordings; classification of genre of musical pieces to which a listener
is attending from Magnetoencephalography (MEG) brain recordings. These applications demonstrate
the efficacy of the framework and highlight interesting directions for future research.
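The Compressed Sensing application mentioned above rests on recovering a sparse signal from far fewer random measurements than its length. A minimal sketch of this idea, using iterative soft-thresholding (ISTA) for the Lasso as a generic solver rather than any specific method from the thesis:

```python
import numpy as np

rng = np.random.default_rng(1)

n, m, k = 256, 80, 5          # signal length, number of measurements, sparsity
x_true = np.zeros(n)
support = rng.choice(n, k, replace=False)
x_true[support] = rng.standard_normal(k)   # k-sparse ground-truth signal

A = rng.standard_normal((m, n)) / np.sqrt(m)   # random Gaussian sensing matrix
y = A @ x_true                                 # m << n compressed measurements

def ista(A, y, lam=0.01, iters=3000):
    """Iterative soft-thresholding for the Lasso: min 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = x - (A.T @ (A @ x - y)) / L    # gradient step on the quadratic term
        x = np.sign(x) * np.maximum(np.abs(x) - lam / L, 0.0)  # soft threshold
    return x

x_hat = ista(A, y)
err = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
print(f"relative recovery error: {err:.3f}")
```

With m = 80 measurements of a length-256 signal with only 5 nonzeros, the sparse solution is recovered to small relative error, which is the effect that lets a radar's sampling rate be reduced.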
Scaling Multidimensional Inference for Big Structured Data
In information technology, big data is a collection of data sets so large and complex that it becomes difficult to process using traditional data processing applications [151]. In a world of increasing sensor modalities, cheaper storage, and more data-oriented questions, we are quickly passing the limits of tractable computation using traditional statistical analysis methods. Methods that often show great results on simple data have difficulty processing complicated multidimensional data. Accuracy alone can no longer justify unwarranted memory use and computational complexity. Improving the scaling properties of these methods for multidimensional data is the only way to keep them relevant.
In this work we explore methods for improving the scaling properties of parametric and nonparametric models. Namely, we focus on the structure of the data to lower the complexity of a specific family of problems. The two types of structure considered in this work are distributed optimization with separable constraints (Chapters 2-3) and scaling Gaussian processes for multidimensional lattice input (Chapters 4-5). By improving the scaling of these methods, we can expand their use to a wide range of applications that were previously intractable and open the door to new research questions.
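For Gaussian processes on a multidimensional lattice, the structure typically exploited is that a product kernel's covariance over a grid is the Kronecker product of small per-axis kernel matrices, so the naive O(N³) solve factorizes into per-axis eigendecompositions. A sketch of this generic technique (not necessarily the exact algorithm of Chapters 4-5), checking the fast solve against the naive one on a small 6×5 grid:

```python
import numpy as np

def rbf(x, lengthscale=1.0):
    """Squared-exponential kernel matrix for 1-D inputs x."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

# 2-D lattice: the full covariance is the Kronecker product of per-axis kernels
x1, x2 = np.linspace(0, 1, 6), np.linspace(0, 1, 5)
K1, K2 = rbf(x1), rbf(x2)

rng = np.random.default_rng(2)
y = rng.standard_normal(len(x1) * len(x2))
noise = 0.1

# Naive solve: build the full 30x30 matrix, O(N^3) in the total grid size N
K_full = np.kron(K1, K2)
alpha_naive = np.linalg.solve(K_full + noise * np.eye(len(y)), y)

# Kronecker solve: eigendecompose each small factor, O(n1^3 + n2^3)
w1, Q1 = np.linalg.eigh(K1)
w2, Q2 = np.linalg.eigh(K2)
w = np.kron(w1, w2)                   # eigenvalues of the Kronecker product
Y = y.reshape(len(x1), len(x2))
T = Q1.T @ Y @ Q2                     # project y onto the Kronecker eigenbasis
T = T / (w.reshape(T.shape) + noise)  # divide by the shifted eigenvalues
alpha_kron = (Q1 @ T @ Q2.T).ravel()  # map back to the original basis

match = np.allclose(alpha_naive, alpha_kron)
print("Kronecker solve matches naive solve:", match)
```

The saving is dramatic at scale: a 1000×1000 grid needs two 1000³ eigendecompositions instead of one 10⁶×10⁶ solve, which is exactly the kind of scaling improvement the abstract refers to.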
High-dimensional functional data/time series analysis: finite-sample theory, adaptive functional thresholding and prediction
Statistical analysis of high-dimensional functional data/time series arises in various applications. Examples include different types of brain imaging data in neuroscience (Zhu et al., 2016; Li and Solea, 2018), age-specific mortality rates for different prefectures (Gao et al., 2019a) and intraday energy consumption trajectories (Cho et al., 2013) for thousands of households, to name a few. Under this scenario, in addition to the intrinsic infinite-dimensionality of functional data, the number of functional variables can grow with the number of independent or serially dependent observations, posing new challenges to existing work. In this thesis, we consider three fundamental tasks in high-dimensional functional data/time series analysis: finite sample theory, covariance function estimation (with a new class of adaptive functional thresholding operators) and modelling/prediction. In the first chapter, we focus on the theoretical analysis of relevant estimated cross-(auto)covariance terms between two multivariate functional time series or a mixture of multivariate functional and scalar time series beyond the Gaussianity assumption. We introduce a new perspective on dependence by proposing a functional cross-spectral stability measure to characterize the effect of dependence on these estimated cross terms, which are essential in the estimates for additive functional linear regressions. With the proposed functional cross-spectral stability measure, we develop useful concentration inequalities for estimated cross-(auto)covariance matrix functions to accommodate more general sub-Gaussian functional linear processes and, furthermore, establish finite sample theory for relevant estimated terms under a commonly adopted functional principal component analysis framework.
Using our derived non-asymptotic results, we investigate the convergence properties of the regularized estimates for two additive functional linear regression applications under sparsity assumptions, including functional linear lagged regression and partially functional linear regression, in the context of high-dimensional functional/scalar time series.
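The adaptive thresholding idea mentioned above is easiest to see in its scalar analogue (a simplified stand-in for the thesis's functional operators, in the spirit of Cai-and-Liu-style adaptive thresholding): each entry of a sample covariance matrix is thresholded at a level proportional to an estimate of that entry's own standard error, rather than at one global level, which zeroes out noise entries while keeping genuinely nonzero ones:

```python
import numpy as np

rng = np.random.default_rng(3)

# Sparse (banded) population covariance: most off-diagonal entries are exactly zero
p, n = 30, 400
Sigma = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = np.cov(X, rowvar=False)               # sample covariance

# Entry-adaptive threshold: estimate the variance of each S_ij from the data,
# then threshold entry (i, j) at a level proportional to its own standard error
Xc = X - X.mean(axis=0)
theta = ((Xc[:, :, None] * Xc[:, None, :] - S) ** 2).mean(axis=0)
lam = 2.0 * np.sqrt(theta * np.log(p) / n)
S_thr = np.where(np.abs(S) > lam, S, 0.0)
np.fill_diagonal(S_thr, np.diag(S))       # never threshold the diagonal

err_raw = np.linalg.norm(S - Sigma, 2)    # spectral-norm error of raw estimate
err_thr = np.linalg.norm(S_thr - Sigma, 2)
print(f"raw error: {err_raw:.3f}, thresholded error: {err_thr:.3f}")
```

The thresholded estimate has smaller spectral-norm error because the many truly-zero entries are set exactly to zero; the thesis's adaptive functional thresholding extends this principle from scalar entries to entire covariance functions.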
Single-step dimensionality reduction for high-dimensional data analysis and its application to reinforcement learning
Degree type: Doctorate by coursework (課程博士). Examination committee: (chief examiner) Professor Takeo Igarashi, The University of Tokyo; Professor Seiya Imoto, The University of Tokyo; Professor Hiroshi Nakagawa, The University of Tokyo; Lecturer Hideki Nakayama, The University of Tokyo; Professor Kenji Doya, Okinawa Institute of Science and Technology Graduate University. University of Tokyo (東京大学)