Some steps towards a general principle for dimensionality reduction mappings
In the past years, many dimensionality reduction methods have been
established which make it possible to visualize high-dimensional data sets.
Recently, formal evaluation schemes have also been proposed for data
visualization, which allow a quantitative evaluation along general principles.
Most techniques, however, provide a mapping only for an a priori given finite
set of points, requiring additional steps for out-of-sample extensions. We
propose a general view on dimensionality reduction based on the concept of
cost functions, and, based on this general principle, extend dimensionality
reduction to explicit mappings of the data manifold. This offers the
possibility of simple out-of-sample extensions. Further, it opens a way
towards a theory of data visualization taking the perspective of its
generalization ability to new data points. We demonstrate the approach on a
simple example.
Dimensionality Reduction Mappings
A wealth of powerful dimensionality reduction methods has been established which can be used for data visualization and preprocessing. These are accompanied by formal evaluation schemes, which allow a quantitative evaluation along general principles and which even lead to further visualization schemes based on these objectives. Most methods, however, provide a mapping only for an a priori given finite set of points, requiring additional steps for out-of-sample extensions. We propose a general view on dimensionality reduction based on the concept of cost functions, and, based on this general principle, extend dimensionality reduction to explicit mappings of the data manifold. This offers simple out-of-sample extensions. Further, it opens a way towards a theory of data visualization taking the perspective of its generalization ability to new data points. We demonstrate the approach based on a simple global linear mapping as well as prototype-based local linear mappings.
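The cost-function view with an explicit mapping can be illustrated in a few lines. The sketch below is an illustration, not the paper's exact formulation: it parameterizes a global linear mapping and minimizes a simple distance-preserving stress cost. Because the mapping itself is learned, the out-of-sample extension is a single matrix product.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 5))                             # high-dimensional training points
D_X = np.linalg.norm(X[:, None] - X[None, :], axis=-1)   # pairwise input distances

def stress(a_flat):
    """Distance-preservation cost of the explicit linear mapping y = A x."""
    A = a_flat.reshape(5, 2)
    Y = X @ A
    D_Y = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
    return np.sum((D_X - D_Y) ** 2)

# optimize the mapping parameters, not the point coordinates
res = minimize(stress, rng.normal(size=10), method="L-BFGS-B")
A = res.x.reshape(5, 2)

# out-of-sample extension: just apply the learned mapping
x_new = rng.normal(size=(1, 5))
y_new = x_new @ A
```

A nonlinear parameterization (e.g. prototype-based local linear maps, as in the abstract) would replace `X @ A` while keeping the same cost-function structure.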
Probing Ultrafast Dynamics with Time-resolved Multi-dimensional Coincidence Imaging: Butadiene
Time-resolved coincidence imaging of photoelectrons and photoions represents
the most complete experimental measurement of ultrafast excited state dynamics,
a multi-dimensional measurement for a multi-dimensional problem. Here we
present the experimental data from recent coincidence imaging experiments,
undertaken with the aim of gaining insight into the complex ultrafast
excited-state dynamics of 1,3-butadiene initiated by absorption of 200 nm
light. We discuss photoion and photoelectron mappings of increasing
dimensionality, and focus particularly on the time-resolved photoelectron
angular distributions (TRPADs), expected to be a sensitive probe of the
electronic evolution of the excited state and to provide significant
information beyond the time-resolved photoelectron spectrum (TRPES). Complex
temporal behaviour is observed in the TRPADs, revealing their sensitivity to
the dynamics while also emphasising the difficulty of interpretation of these
complex observables. From the experimental data some details of the wavepacket
dynamics are discerned relatively directly, and we make some tentative
comparisons with existing ab initio calculations in order to gain deeper
insight into the experimental measurements; finally, we sketch out some
considerations for taking this comparison further in order to bridge the gap
between experiment and theory. Comment: 18 pages, 10 figures. Pre-print of JMO submission.
The coupled-cluster approach to quantum many-body problem in a three-Hilbert-space reinterpretation
The quantum many-body bound-state problem in its computationally successful
coupled cluster method (CCM) representation is reconsidered. In conventional
practice one factorizes the ground-state wave functions $|\Psi\rangle = e^{S}|\Phi\rangle$,
which live in the "physical" Hilbert space $\mathcal{H}^{(P)}$, using
an elementary ansatz for $|\Phi\rangle$ plus a formal expansion of $S$ in an
operator basis of multi-configurational creation operators. In our paper a
reinterpretation of the method is proposed. Using parallels between the CCM and
the so-called quasi-Hermitian, alias three-Hilbert-space (THS), quantum
mechanics, the CCM transition from the known microscopic Hamiltonian (denoted
by the usual symbol $H$), which is self-adjoint in $\mathcal{H}^{(P)}$, to its
effective lower-case isospectral avatar $\hat{h} = e^{-S} H e^{S}$, is assigned a
THS interpretation. In the opposite direction, a THS-prescribed, non-CCM,
innovative reinstallation of Hermiticity is shown to be possible for the CCM
effective Hamiltonian $\hat{h}$, which only appears manifestly non-Hermitian in
its own ("friendly") Hilbert space $\mathcal{H}^{(F)}$. This goal is achieved via
an ad hoc amendment of the inner product in $\mathcal{H}^{(F)}$, thereby yielding
the third ("standard") Hilbert space $\mathcal{H}^{(S)}$. Due to the resulting
exact unitary equivalence between the first and third spaces,
$\mathcal{H}^{(P)} \sim \mathcal{H}^{(S)}$, the indistinguishability of predictions
calculated in these alternative physical frameworks is guaranteed. Comment: 15 pages.
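Assuming the standard CCM conventions (a hedged reconstruction of the ingredients named above, not a quotation from the paper), the chain of transformations can be summarized as:

```latex
|\Psi\rangle = e^{S}\,|\Phi\rangle, \qquad
\hat{h} = e^{-S} H\, e^{S}, \qquad
\Theta = e^{S^{\dagger}} e^{S}, \qquad
\hat{h}^{\dagger}\,\Theta = \Theta\,\hat{h},
```

where the last relation (quasi-Hermiticity) shows that amending the inner product of $\mathcal{H}^{(F)}$ by the metric $\Theta$ renders $\hat{h}$ Hermitian, which is what defines the third, "standard" Hilbert space $\mathcal{H}^{(S)}$.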
10302 Abstracts Collection -- Learning paradigms in dynamic environments
From 25.07. to 30.07.2010, the Dagstuhl Seminar 10302 ``Learning paradigms in dynamic environments'' was held in Schloss Dagstuhl - Leibniz Center for Informatics.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available.
Similarity Learning for High-Dimensional Sparse Data
A good measure of similarity between data points is crucial to many tasks in
machine learning. Similarity and metric learning methods learn such measures
automatically from data, but they do not scale well with respect to the
dimensionality of the data. In this paper, we propose a method that can
efficiently learn a similarity measure from high-dimensional sparse data. The core idea
is to parameterize the similarity measure as a convex combination of rank-one
matrices with specific sparsity structures. The parameters are then optimized
with an approximate Frank-Wolfe procedure to maximally satisfy relative
similarity constraints on the training data. Our algorithm greedily
incorporates one pair of features at a time into the similarity measure,
providing an efficient way to control the number of active features and thus
reduce overfitting. It enjoys very appealing convergence guarantees and its
time and memory complexity depends on the sparsity of the data instead of the
dimension of the feature space. Our experiments on real-world high-dimensional
datasets demonstrate its potential for classification, dimensionality reduction
and data exploration. Comment: 14 pages. Proceedings of the 18th International Conference on
Artificial Intelligence and Statistics (AISTATS 2015). Matlab code:
https://github.com/bellet/HDS
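As a rough sketch of the idea (an assumed re-implementation for illustration, not the authors' released code: the rank-one atom basis, hinge loss, scale `lam`, and step-size schedule are all illustrative choices), a Frank-Wolfe loop that greedily adds one feature pair per iteration can look like this:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 30, 40
# sparse high-dimensional data: ~10% nonzero entries
X = (rng.random((n, d)) < 0.1) * rng.normal(size=(n, d))
# relative constraints: similarity(x_a, x_p) should exceed similarity(x_a, x_q)
triplets = [(rng.integers(n), rng.integers(n), rng.integers(n)) for _ in range(200)]

def atom(i, j, sign):
    """Rank-one atom u u^T supported on features i and j (illustrative basis)."""
    u = np.zeros(d)
    u[i], u[j] = 1.0, sign
    return np.outer(u, u)

M = np.zeros((d, d))   # bilinear similarity s(x, x') = x^T M x'
lam = 10.0             # scale of the feasible set of scaled atoms
for t in range(30):
    # subgradient of the average hinge loss over active triplets
    G = np.zeros((d, d))
    for a, p, q in triplets:
        margin = X[a] @ M @ X[p] - X[a] @ M @ X[q]
        if margin < 1:
            G -= np.outer(X[a], X[p]) - np.outer(X[a], X[q])
    G /= len(triplets)
    # Frank-Wolfe linear subproblem: pick the single best feature pair
    best, best_val = None, np.inf
    for i in range(d):
        for j in range(i, d):
            for s in (+1.0, -1.0):
                B = lam * atom(i, j, s)
                val = np.sum(G * B)
                if val < best_val:
                    best, best_val = B, val
    gamma = 2.0 / (t + 2)            # standard Frank-Wolfe step size
    M = (1 - gamma) * M + gamma * best
```

Each iterate is a convex combination of sparse rank-one atoms, so the number of active features grows by at most two per iteration, which is the mechanism behind the complexity claims in the abstract.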
Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis
Factor analysis aims to determine latent factors, or traits, which summarize
a given data set. Inter-battery factor analysis extends this notion to multiple
views of the data. In this paper we show how a nonlinear, nonparametric version
of these models can be recovered through the Gaussian process latent variable
model. This gives us a flexible formalism for multi-view learning where the
latent variables can be used both for exploratory purposes and for learning
representations that enable efficient inference for ambiguous estimation tasks.
Learning is performed in a Bayesian manner through the formulation of a
variational compression scheme which gives a rigorous lower bound on the log
likelihood. Our Bayesian framework provides strong regularization during
training, allowing the structure of the latent space to be determined
efficiently and automatically. We demonstrate this by producing the first (to
our knowledge) published results of learning from dozens of views, even when
data is scarce. We further show experimental results on several different types
of multi-view data sets and for different kinds of tasks, including exploratory
data analysis, generation, ambiguity modelling through latent priors and
classification. Comment: 49 pages including appendix.
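In the classical linear setting that the paper generalizes, inter-battery factor analysis recovers the shared subspace from an SVD of the cross-covariance between the two views. A minimal sketch under that linear simplification (synthetic data; not the paper's nonparametric Gaussian process model):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 2
Z = rng.normal(size=(n, k))                                        # shared latent factors
X1 = Z @ rng.normal(size=(k, 6)) + 0.1 * rng.normal(size=(n, 6))   # view 1
X2 = Z @ rng.normal(size=(k, 8)) + 0.1 * rng.normal(size=(n, 8))   # view 2

# linear inter-battery factor analysis: the shared directions are the
# leading singular vectors of the cross-covariance between the views
X1c, X2c = X1 - X1.mean(0), X2 - X2.mean(0)
C12 = X1c.T @ X2c / (n - 1)
U, s, Vt = np.linalg.svd(C12)
Z1_hat = X1c @ U[:, :k]      # shared-factor scores seen from view 1
Z2_hat = X2c @ Vt[:k].T      # shared-factor scores seen from view 2
corr = np.corrcoef(Z1_hat[:, 0], Z2_hat[:, 0])[0, 1]
```

The nonparametric model in the abstract replaces these linear maps with Gaussian process mappings and learns the latent dimensionality automatically via the Bayesian variational bound.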