Learning to select data for transfer learning with Bayesian Optimization
Domain similarity measures can be used to gauge adaptability and select
suitable data for transfer learning, but existing approaches define ad hoc
measures that are deemed suitable only for their respective tasks. Inspired by work on
curriculum learning, we propose to \emph{learn} data selection measures using
Bayesian Optimization and evaluate them across models, domains and tasks. Our
learned measures outperform existing domain similarity measures significantly
on three tasks: sentiment analysis, part-of-speech tagging, and parsing. We
show the importance of complementing similarity with diversity, and that
learned measures are -- to some degree -- transferable across models, domains,
and even tasks. Comment: EMNLP 2017. Code available at:
https://github.com/sebastianruder/learn-to-select-dat
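The selection procedure the abstract describes (learn weights over domain similarity and diversity features by optimizing a downstream proxy score) can be sketched as below. This is an illustrative toy, not the paper's implementation: the feature names `sim`/`div`, the pool, and the proxy `task_score` are all invented, and a plain random search stands in for Bayesian Optimization (any black-box optimizer fits the same loop).

```python
import random

random.seed(0)

# Toy pool of candidate source examples. "sim" and "div" are stand-ins for
# similarity and diversity features of an example w.r.t. the target domain
# (illustrative names, not the paper's actual feature set).
pool = [{"sim": random.random(), "div": random.random()} for _ in range(200)]

def task_score(selected):
    # Hypothetical proxy for downstream task performance: we pretend the
    # truly useful examples balance similarity with diversity.
    return sum(0.4 * ex["sim"] + 0.6 * ex["div"] for ex in selected) / len(selected)

def select_top(weights, k=20):
    # A learned data selection measure: a weighted combination of features.
    w_sim, w_div = weights
    ranked = sorted(pool, key=lambda ex: w_sim * ex["sim"] + w_div * ex["div"],
                    reverse=True)
    return ranked[:k]

# Random search over feature weights stands in for Bayesian Optimization here;
# the paper uses BO, but any black-box optimizer can drive this loop.
best_w, best_s = (1.0, 0.0), task_score(select_top((1.0, 0.0)))
for _ in range(100):
    w = (random.uniform(-1, 1), random.uniform(-1, 1))
    s = task_score(select_top(w))
    if s > best_s:
        best_w, best_s = w, s

print("best weights:", best_w, "proxy score:", round(best_s, 3))
```

The similarity-only baseline `(1.0, 0.0)` is the search's starting point, so the learned weights can only match or beat it, mirroring the abstract's point that similarity alone is not enough.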
k-MLE: A fast algorithm for learning statistical mixture models
We describe k-MLE, a fast and efficient local search algorithm for learning
finite statistical mixtures of exponential families such as Gaussian mixture
models. Mixture models are traditionally learned using the
expectation-maximization (EM) soft clustering technique that monotonically
increases the incomplete (expected complete) likelihood. Given prescribed
mixture weights, the hard clustering k-MLE algorithm iteratively assigns data
to the most likely weighted component and updates the component models using
Maximum Likelihood Estimators (MLEs). Using the duality between exponential
families and Bregman divergences, we prove that the local convergence of the
complete likelihood of k-MLE follows directly from the convergence of a dual
additively weighted Bregman hard clustering. The inner loop of k-MLE can be
implemented using any k-means heuristic like the celebrated Lloyd's batched
or Hartigan's greedy swap updates. We then show how to update the mixture
weights by minimizing a cross-entropy criterion, which amounts to setting each
weight to the relative proportion of points in its cluster, and we alternate
the mixture parameter and mixture weight updates until convergence. Hard EM
is interpreted as a special case of k-MLE where the component update and
the weight update are performed successively in the inner loop. To initialize
k-MLE, we propose k-MLE++, a careful initialization of k-MLE that
probabilistically guarantees a global bound on the best possible complete
likelihood. Comment: 31 pages; extends a preliminary paper presented at IEEE ICASSP 201
Metric for attractor overlap
We present the first general metric for attractor overlap (MAO) facilitating
an unsupervised comparison of flow data sets. The starting point is two or more
attractors, i.e., ensembles of states representing different operating
conditions. The proposed metric generalizes the standard Hilbert-space distance
between two snapshots to snapshot ensembles of two attractors. A reduced-order
analysis for big data and many attractors is enabled by coarse-graining the
snapshots into representative clusters with corresponding centroids and
population probabilities. For a large number of attractors, MAO is augmented by
proximity maps for the snapshots, the centroids, and the attractors, giving
scientifically interpretable visual access to the closeness of the states. The
coherent structures belonging to the overlap and disjoint states between these
attractors are distilled by a few representative centroids. We employ MAO for two
quite different actuated flow configurations: (1) a two-dimensional wake of the
fluidic pinball with vortices in a narrow frequency range and (2)
three-dimensional wall turbulence with broadband frequency spectrum manipulated
by spanwise traveling transversal surface waves. MAO compares and classifies
these actuated flows in agreement with physical intuition. For instance, the
first feature coordinate of the attractor proximity map correlates with drag
for the fluidic pinball and for the turbulent boundary layer. MAO has a large
spectrum of potential applications ranging from a quantitative comparison
between numerical simulations and experimental particle-image velocimetry data
to the analysis of simulations representing a myriad of different operating
conditions. Comment: 33 pages, 20 figures
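The coarse-graining step (cluster each attractor's snapshots into centroids with population probabilities, then compare ensembles) can be illustrated as below. This is a rough sketch under stated assumptions, not the paper's MAO formula: the toy "attractors" are random Gaussian clouds, the clustering is plain Lloyd k-means, and the probability-weighted mean centroid distance is only one plausible ensemble generalization of the snapshot distance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy "attractors": ensembles of snapshot vectors (rows), standing in
# for two operating conditions (illustrative data, not flow snapshots).
A = rng.normal(0.0, 1.0, (300, 5))   # condition 1
B = rng.normal(3.0, 1.0, (300, 5))   # condition 2, shifted mean

def coarse_grain(snapshots, k=4, iters=25):
    """Plain Lloyd k-means: representative centroids plus their
    population probabilities."""
    c = snapshots[rng.choice(len(snapshots), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(snapshots[:, None] - c[None], axis=2)
        z = d.argmin(axis=1)
        for j in range(k):
            if np.any(z == j):
                c[j] = snapshots[z == j].mean(axis=0)
    p = np.bincount(z, minlength=k) / len(snapshots)
    return c, p

def ensemble_distance(c1, p1, c2, p2):
    # Probability-weighted mean distance between centroid sets: an
    # illustrative ensemble generalization of the snapshot distance,
    # not the paper's exact metric.
    d = np.linalg.norm(c1[:, None] - c2[None], axis=2)
    return float(p1 @ d @ p2)

cA, pA = coarse_grain(A)
cB, pB = coarse_grain(B)

d_between = ensemble_distance(cA, pA, cB, pB)
d_within = ensemble_distance(cA, pA, cA, pA)
print("between:", round(d_between, 2), "within:", round(d_within, 2))
```

A large between-ensemble distance relative to the within-ensemble distance indicates little overlap between the two operating conditions, which is the kind of comparison the metric is built for.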