16,613 research outputs found
Bayesian Inference on Matrix Manifolds for Linear Dimensionality Reduction
We reframe linear dimensionality reduction as a problem of Bayesian inference
on matrix manifolds. This natural paradigm extends the Bayesian framework to
dimensionality reduction tasks in higher dimensions with simpler models at
greater speeds. Here an orthogonal basis is treated as a single point on a
manifold and is associated with a linear subspace on which observations vary
maximally. Throughout this paper, we employ the Grassmann and Stiefel manifolds
for various dimensionality reduction problems, explore the connection between
the two manifolds, and use Hybrid Monte Carlo for posterior sampling on the
Grassmannian for the first time. We delineate in which situations either
manifold should be considered. Further, matrix manifold models are used to
yield scientific insight in the context of cognitive neuroscience, and we
conclude that our methods are suitable for basic inference as well as accurate
prediction.Comment: All datasets and computer programs are publicly available at
http://www.ics.uci.edu/~babaks/Site/Codes.htm
Machine Learning for Fluid Mechanics
The field of fluid mechanics is rapidly advancing, driven by unprecedented
volumes of data from field measurements, experiments and large-scale
simulations at multiple spatiotemporal scales. Machine learning offers a wealth
of techniques to extract information from data that could be translated into
knowledge about the underlying fluid mechanics. Moreover, machine learning
algorithms can augment domain knowledge and automate tasks related to flow
control and optimization. This article presents an overview of past history,
current developments, and emerging opportunities of machine learning for fluid
mechanics. It outlines fundamental machine learning methodologies and discusses
their uses for understanding, modeling, optimizing, and controlling fluid
flows. The strengths and limitations of these methods are addressed from the
perspective of scientific inquiry that considers data as an inherent part of
modeling, experimentation, and simulation. Machine learning provides a powerful
information processing framework that can enrich, and possibly even transform,
current lines of fluid mechanics research and industrial applications.Comment: To appear in the Annual Reviews of Fluid Mechanics, 202
A Survey on Metric Learning for Feature Vectors and Structured Data
The need for appropriate ways to measure the distance or similarity between
data is ubiquitous in machine learning, pattern recognition and data mining,
but handcrafting such good metrics for specific problems is generally
difficult. This has led to the emergence of metric learning, which aims at
automatically learning a metric from data and has attracted a lot of interest
in machine learning and related fields for the past ten years. This survey
paper proposes a systematic review of the metric learning literature,
highlighting the pros and cons of each approach. We pay particular attention to
Mahalanobis distance metric learning, a well-studied and successful framework,
but additionally present a wide range of methods that have recently emerged as
powerful alternatives, including nonlinear metric learning, similarity learning
and local metric learning. Recent trends and extensions, such as
semi-supervised metric learning, metric learning for histogram data and the
derivation of generalization guarantees, are also covered. Finally, this survey
addresses metric learning for structured data, in particular edit distance
learning, and attempts to give an overview of the remaining challenges in
metric learning for the years to come.Comment: Technical report, 59 pages. Changes in v2: fixed typos and improved
presentation. Changes in v3: fixed typos. Changes in v4: fixed typos and new
method
Transforming Graph Representations for Statistical Relational Learning
Relational data representations have become an increasingly important topic
due to the recent proliferation of network datasets (e.g., social, biological,
information networks) and a corresponding increase in the application of
statistical relational learning (SRL) algorithms to these domains. In this
article, we examine a range of representation issues for graph-based relational
data. Since the choice of relational data representation for the nodes, links,
and features can dramatically affect the capabilities of SRL algorithms, we
survey approaches and opportunities for relational representation
transformation designed to improve the performance of these algorithms. This
leads us to introduce an intuitive taxonomy for data representation
transformations in relational domains that incorporates link transformation and
node transformation as symmetric representation tasks. In particular, the
transformation tasks for both nodes and links include (i) predicting their
existence, (ii) predicting their label or type, (iii) estimating their weight
or importance, and (iv) systematically constructing their relevant features. We
motivate our taxonomy through detailed examples and use it to survey and
compare competing approaches for each of these tasks. We also discuss general
conditions for transforming links, nodes, and features. Finally, we highlight
challenges that remain to be addressed
Multi-Scale Link Prediction
The automated analysis of social networks has become an important problem due
to the proliferation of social networks, such as LiveJournal, Flickr and
Facebook. The scale of these social networks is massive and continues to grow
rapidly. An important problem in social network analysis is proximity
estimation that infers the closeness of different users. Link prediction, in
turn, is an important application of proximity estimation. However, many
methods for computing proximity measures have high computational complexity and
are thus prohibitive for large-scale link prediction problems. One way to
address this problem is to estimate proximity measures via low-rank
approximation. However, a single low-rank approximation may not be sufficient
to represent the behavior of the entire network. In this paper, we propose
Multi-Scale Link Prediction (MSLP), a framework for link prediction, which can
handle massive networks. The basis idea of MSLP is to construct low rank
approximations of the network at multiple scales in an efficient manner. Based
on this approach, MSLP combines predictions at multiple scales to make robust
and accurate predictions. Experimental results on real-life datasets with more
than a million nodes show the superior performance and scalability of our
method.Comment: 20 pages, 10 figure
- …