Learning to select data for transfer learning with Bayesian Optimization
Domain similarity measures can be used to gauge adaptability and select
suitable data for transfer learning, but existing approaches define ad hoc
measures that are deemed suitable only for their respective tasks. Inspired by work on
curriculum learning, we propose to \emph{learn} data selection measures using
Bayesian Optimization and evaluate them across models, domains and tasks. Our
learned measures outperform existing domain similarity measures significantly
on three tasks: sentiment analysis, part-of-speech tagging, and parsing. We
show the importance of complementing similarity with diversity, and that
learned measures are -- to some degree -- transferable across models, domains,
and even tasks. Comment: EMNLP 2017. Code available at:
https://github.com/sebastianruder/learn-to-select-dat
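The selection procedure the abstract describes (learn weights over domain similarity and diversity features by optimizing a downstream proxy score) can be sketched as below. This is an illustrative toy, not the paper's implementation: the feature names `sim`/`div`, the pool, and the proxy `task_score` are all invented, and a plain random search stands in for Bayesian Optimization (any black-box optimizer fits the same loop).

```python
import random

random.seed(0)

# Toy pool of candidate source examples. "sim" and "div" are stand-ins for
# similarity and diversity features of an example w.r.t. the target domain
# (illustrative names, not the paper's actual feature set).
pool = [{"sim": random.random(), "div": random.random()} for _ in range(200)]

def task_score(selected):
    # Hypothetical proxy for downstream task performance: we pretend the
    # truly useful examples balance similarity with diversity.
    return sum(0.4 * ex["sim"] + 0.6 * ex["div"] for ex in selected) / len(selected)

def select_top(weights, k=20):
    # A learned data selection measure: a weighted combination of features.
    w_sim, w_div = weights
    ranked = sorted(pool, key=lambda ex: w_sim * ex["sim"] + w_div * ex["div"],
                    reverse=True)
    return ranked[:k]

# Random search over feature weights stands in for Bayesian Optimization here;
# the paper uses BO, but any black-box optimizer can drive this loop.
best_w, best_s = (1.0, 0.0), task_score(select_top((1.0, 0.0)))
for _ in range(100):
    w = (random.uniform(-1, 1), random.uniform(-1, 1))
    s = task_score(select_top(w))
    if s > best_s:
        best_w, best_s = w, s

print("best weights:", best_w, "proxy score:", round(best_s, 3))
```

The similarity-only baseline `(1.0, 0.0)` is the search's starting point, so the learned weights can only match or beat it, mirroring the abstract's point that similarity alone is not enough.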
k-MLE: A fast algorithm for learning statistical mixture models
We describe k-MLE, a fast and efficient local search algorithm for learning
finite statistical mixtures of exponential families such as Gaussian mixture
models. Mixture models are traditionally learned using the
expectation-maximization (EM) soft clustering technique that monotonically
increases the incomplete (expected complete) likelihood. Given prescribed
mixture weights, the hard clustering k-MLE algorithm iteratively assigns data
to the most likely weighted component and updates the component models using
Maximum Likelihood Estimators (MLEs). Using the duality between exponential
families and Bregman divergences, we prove that the local convergence of the
complete likelihood of k-MLE follows directly from the convergence of a dual
additively weighted Bregman hard clustering. The inner loop of k-MLE can be
implemented using any k-means heuristic like the celebrated Lloyd's batched
or Hartigan's greedy swap updates. We then show how to update the mixture
weights by minimizing a cross-entropy criterion, which amounts to setting each
weight to the relative proportion of points in its cluster, and we alternate
the mixture parameter and mixture weight updates until convergence. Hard EM
is interpreted as a special case of k-MLE where the component update and
the weight update are performed successively in the inner loop. To initialize
k-MLE, we propose k-MLE++, a careful initialization of k-MLE that
probabilistically guarantees a global bound on the best possible complete
likelihood. Comment: 31 pages; extends a preliminary paper presented at IEEE ICASSP 201
Metric for attractor overlap
We present the first general metric for attractor overlap (MAO) facilitating
an unsupervised comparison of flow data sets. The starting point is two or more
attractors, i.e., ensembles of states representing different operating
conditions. The proposed metric generalizes the standard Hilbert-space distance
between two snapshots to snapshot ensembles of two attractors. A reduced-order
analysis for big data and many attractors is enabled by coarse-graining the
snapshots into representative clusters with corresponding centroids and
population probabilities. For a large number of attractors, MAO is augmented by
proximity maps for the snapshots, the centroids, and the attractors, giving
scientifically interpretable visual access to the closeness of the states. The
coherent structures belonging to the overlap and disjoint states between these
attractors are distilled by a few representative centroids. We employ MAO for two
quite different actuated flow configurations: (1) a two-dimensional wake of the
fluidic pinball with vortices in a narrow frequency range and (2)
three-dimensional wall turbulence with broadband frequency spectrum manipulated
by spanwise traveling transversal surface waves. MAO compares and classifies
these actuated flows in agreement with physical intuition. For instance, the
first feature coordinate of the attractor proximity map correlates with drag
for the fluidic pinball and for the turbulent boundary layer. MAO has a large
spectrum of potential applications ranging from a quantitative comparison
between numerical simulations and experimental particle-image velocimetry data
to the analysis of simulations representing a myriad of different operating
conditions. Comment: 33 pages, 20 figures
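The coarse-graining step (cluster each attractor's snapshots into centroids with population probabilities, then compare ensembles) can be illustrated as below. This is a rough sketch under stated assumptions, not the paper's MAO formula: the toy "attractors" are random Gaussian clouds, the clustering is plain Lloyd k-means, and the probability-weighted mean centroid distance is only one plausible ensemble generalization of the snapshot distance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy "attractors": ensembles of snapshot vectors (rows), standing in
# for two operating conditions (illustrative data, not flow snapshots).
A = rng.normal(0.0, 1.0, (300, 5))   # condition 1
B = rng.normal(3.0, 1.0, (300, 5))   # condition 2, shifted mean

def coarse_grain(snapshots, k=4, iters=25):
    """Plain Lloyd k-means: representative centroids plus their
    population probabilities."""
    c = snapshots[rng.choice(len(snapshots), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(snapshots[:, None] - c[None], axis=2)
        z = d.argmin(axis=1)
        for j in range(k):
            if np.any(z == j):
                c[j] = snapshots[z == j].mean(axis=0)
    p = np.bincount(z, minlength=k) / len(snapshots)
    return c, p

def ensemble_distance(c1, p1, c2, p2):
    # Probability-weighted mean distance between centroid sets: an
    # illustrative ensemble generalization of the snapshot distance,
    # not the paper's exact metric.
    d = np.linalg.norm(c1[:, None] - c2[None], axis=2)
    return float(p1 @ d @ p2)

cA, pA = coarse_grain(A)
cB, pB = coarse_grain(B)

d_between = ensemble_distance(cA, pA, cB, pB)
d_within = ensemble_distance(cA, pA, cA, pA)
print("between:", round(d_between, 2), "within:", round(d_within, 2))
```

A large between-ensemble distance relative to the within-ensemble distance indicates little overlap between the two operating conditions, which is the kind of comparison the metric is built for.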