6,229 research outputs found
Exact Dimensionality Selection for Bayesian PCA
We present a Bayesian model selection approach to estimate the intrinsic
dimensionality of a high-dimensional dataset. To this end, we introduce a novel
formulation of the probabilisitic principal component analysis model based on a
normal-gamma prior distribution. In this context, we exhibit a closed-form
expression of the marginal likelihood which allows to infer an optimal number
of components. We also propose a heuristic based on the expected shape of the
marginal likelihood curve in order to choose the hyperparameters. In
non-asymptotic frameworks, we show on simulated data that this exact
dimensionality selection approach is competitive with both Bayesian and
frequentist state-of-the-art methods
Representation Learning: A Review and New Perspectives
The success of machine learning algorithms generally depends on data
representation, and we hypothesize that this is because different
representations can entangle and hide more or less the different explanatory
factors of variation behind the data. Although specific domain knowledge can be
used to help design representations, learning with generic priors can also be
used, and the quest for AI is motivating the design of more powerful
representation-learning algorithms implementing such priors. This paper reviews
recent work in the area of unsupervised feature learning and deep learning,
covering advances in probabilistic models, auto-encoders, manifold learning,
and deep networks. This motivates longer-term unanswered questions about the
appropriate objectives for learning good representations, for computing
representations (i.e., inference), and the geometrical connections between
representation learning, density estimation and manifold learning
Exploring Algorithmic Limits of Matrix Rank Minimization under Affine Constraints
Many applications require recovering a matrix of minimal rank within an
affine constraint set, with matrix completion a notable special case. Because
the problem is NP-hard in general, it is common to replace the matrix rank with
the nuclear norm, which acts as a convenient convex surrogate. While elegant
theoretical conditions elucidate when this replacement is likely to be
successful, they are highly restrictive and convex algorithms fail when the
ambient rank is too high or when the constraint set is poorly structured.
Non-convex alternatives fare somewhat better when carefully tuned; however,
convergence to locally optimal solutions remains a continuing source of
failure. Against this backdrop we derive a deceptively simple and
parameter-free probabilistic PCA-like algorithm that is capable, over a wide
battery of empirical tests, of successful recovery even at the theoretical
limit where the number of measurements equal the degrees of freedom in the
unknown low-rank matrix. Somewhat surprisingly, this is possible even when the
affine constraint set is highly ill-conditioned. While proving general recovery
guarantees remains evasive for non-convex algorithms, Bayesian-inspired or
otherwise, we nonetheless show conditions whereby the underlying cost function
has a unique stationary point located at the global optimum; no existing cost
function we are aware of satisfies this same property. We conclude with a
simple computer vision application involving image rectification and a standard
collaborative filtering benchmark
Using Robust PCA to estimate regional characteristics of language use from geo-tagged Twitter messages
Principal component analysis (PCA) and related techniques have been
successfully employed in natural language processing. Text mining applications
in the age of the online social media (OSM) face new challenges due to
properties specific to these use cases (e.g. spelling issues specific to texts
posted by users, the presence of spammers and bots, service announcements,
etc.). In this paper, we employ a Robust PCA technique to separate typical
outliers and highly localized topics from the low-dimensional structure present
in language use in online social networks. Our focus is on identifying
geospatial features among the messages posted by the users of the Twitter
microblogging service. Using a dataset which consists of over 200 million
geolocated tweets collected over the course of a year, we investigate whether
the information present in word usage frequencies can be used to identify
regional features of language use and topics of interest. Using the PCA pursuit
method, we are able to identify important low-dimensional features, which
constitute smoothly varying functions of the geographic location
A Generative-Discriminative Basis Learning Framework to Predict Clinical Severity from Resting State Functional MRI Data
We propose a matrix factorization technique that decomposes the resting state
fMRI (rs-fMRI) correlation matrices for a patient population into a sparse set
of representative subnetworks, as modeled by rank one outer products. The
subnetworks are combined using patient specific non-negative coefficients;
these coefficients are also used to model, and subsequently predict the
clinical severity of a given patient via a linear regression. Our
generative-discriminative framework is able to exploit the structure of rs-fMRI
correlation matrices to capture group level effects, while simultaneously
accounting for patient variability. We employ ten fold cross validation to
demonstrate the predictive power of our model on a cohort of fifty eight
patients diagnosed with Autism Spectrum Disorder. Our method outperforms
classical semi-supervised frameworks, which perform dimensionality reduction on
the correlation features followed by non-linear regression to predict the
clinical scores
- …