
    Learning to select data for transfer learning with Bayesian Optimization

    Domain similarity measures can be used to gauge adaptability and select suitable data for transfer learning, but existing approaches define ad hoc measures that are deemed suitable for their respective tasks. Inspired by work on curriculum learning, we propose to *learn* data selection measures using Bayesian Optimization and evaluate them across models, domains, and tasks. Our learned measures outperform existing domain similarity measures significantly on three tasks: sentiment analysis, part-of-speech tagging, and parsing. We show the importance of complementing similarity with diversity, and that learned measures are -- to some degree -- transferable across models, domains, and even tasks.
    Comment: EMNLP 2017. Code available at: https://github.com/sebastianruder/learn-to-select-dat
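    The recipe the abstract describes can be sketched in a few lines: parameterize a data-selection measure as a weighted combination of similarity and diversity features, and let Bayesian optimization tune the weights against a downstream dev-set score. The sketch below uses scikit-optimize on a synthetic toy task; the feature proxies, the top-half selection rule, and the toy data are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch: learning a data-selection measure with Bayesian optimization.
# The similarity/diversity proxies below are assumptions for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from skopt import gp_minimize

rng = np.random.default_rng(0)

# Toy source pool and target dev set (stand-ins for real corpora).
X_src = rng.normal(size=(1000, 20)); y_src = (X_src[:, 0] > 0).astype(int)
X_dev = rng.normal(loc=0.3, size=(200, 20)); y_dev = (X_dev[:, 0] > 0).astype(int)

# Per-example "domain similarity" and "diversity" features (assumed proxies).
sim = -np.linalg.norm(X_src - X_dev.mean(axis=0), axis=1)  # closeness to target
div = X_src.std(axis=1)                                    # within-example spread

def objective(w):
    """Score source examples with the weighted measure, train on the top half,
    and return negative dev accuracy (gp_minimize minimizes)."""
    score = w[0] * sim + w[1] * div
    top = np.argsort(-score)[: len(score) // 2]
    clf = LogisticRegression(max_iter=200).fit(X_src[top], y_src[top])
    return -clf.score(X_dev, y_dev)

res = gp_minimize(objective, [(-1.0, 1.0), (-1.0, 1.0)], n_calls=25, random_state=0)
print("learned weights:", res.x, "dev accuracy:", -res.fun)
```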

    k-MLE: A fast algorithm for learning statistical mixture models

    We describe k-MLE, a fast and efficient local-search algorithm for learning finite statistical mixtures of exponential families, such as Gaussian mixture models. Mixture models are traditionally learned using the expectation-maximization (EM) soft clustering technique, which monotonically increases the incomplete (expected complete) likelihood. Given prescribed mixture weights, the hard clustering k-MLE algorithm iteratively assigns data to the most likely weighted component and updates the component models using Maximum Likelihood Estimators (MLEs). Using the duality between exponential families and Bregman divergences, we prove that the local convergence of the complete likelihood of k-MLE follows directly from the convergence of a dual additively weighted Bregman hard clustering. The inner loop of k-MLE can be implemented using any k-means heuristic, such as Lloyd's batch updates or Hartigan's greedy swap updates. We then show how to update the mixture weights by minimizing a cross-entropy criterion, which amounts to setting each weight to the relative proportion of points in its cluster, and we iterate the component and weight updates until convergence. Hard EM is interpreted as a special case of k-MLE in which the component update and the weight update are performed successively in the inner loop. To initialize k-MLE, we propose k-MLE++, a careful initialization that probabilistically guarantees a global bound on the best possible complete likelihood.
    Comment: 31 pages. Extends a preliminary paper presented at IEEE ICASSP 201
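    The hard-assignment loop described above is compact enough to write down directly. The following is a minimal k-MLE-style iteration for a spherical Gaussian mixture in plain NumPy; the spherical-covariance restriction and the random initialization are simplifying assumptions (the paper covers general exponential families via Bregman divergences and proposes k-MLE++ for initialization).

```python
# Minimal k-MLE-style sketch for a spherical Gaussian mixture (an assumption;
# not the paper's general exponential-family formulation).
import numpy as np

def k_mle(X, k, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]  # random init (k-MLE++ in the paper)
    var = np.ones(k)
    w = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # Hard assignment: most likely weighted component (log-density + log-weight).
        logp = (-0.5 * ((X[:, None, :] - mu[None]) ** 2).sum(-1) / var
                - 0.5 * d * np.log(var) + np.log(w))
        z = logp.argmax(axis=1)
        # MLE update of each component from its assigned points.
        for j in range(k):
            pts = X[z == j]
            if len(pts):
                mu[j] = pts.mean(axis=0)
                var[j] = max(((pts - mu[j]) ** 2).sum(-1).mean() / d, 1e-6)
        # Weight update: relative proportion of cluster points
        # (the cross-entropy minimizer described in the abstract).
        w = np.clip(np.bincount(z, minlength=k) / n, 1e-12, None)
    return mu, var, w, z

# Example: two well-separated blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 1, (200, 2))])
mu, var, w, z = k_mle(X, k=2)
```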

    Metric for attractor overlap

    We present the first general metric for attractor overlap (MAO), facilitating an unsupervised comparison of flow data sets. The starting point is two or more attractors, i.e., ensembles of states representing different operating conditions. The proposed metric generalizes the standard Hilbert-space distance between two snapshots to snapshot ensembles of two attractors. A reduced-order analysis for big data and many attractors is enabled by coarse-graining the snapshots into representative clusters with corresponding centroids and population probabilities. For a large number of attractors, MAO is augmented by proximity maps for the snapshots, the centroids, and the attractors, giving scientifically interpretable visual access to the closeness of the states. The coherent structures belonging to the overlap and disjoint states between these attractors are distilled by a few representative centroids. We employ MAO for two quite different actuated flow configurations: (1) a two-dimensional wake of the fluidic pinball with vortices in a narrow frequency range and (2) three-dimensional wall turbulence with a broadband frequency spectrum manipulated by spanwise traveling transversal surface waves. MAO compares and classifies these actuated flows in agreement with physical intuition. For instance, the first feature coordinate of the attractor proximity map correlates with drag for both the fluidic pinball and the turbulent boundary layer. MAO has a large spectrum of potential applications, ranging from a quantitative comparison between numerical simulations and experimental particle-image velocimetry data to the analysis of simulations representing a myriad of different operating conditions.
    Comment: 33 pages, 20 figures
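    The coarse-graining step lends itself to a rough sketch: cluster each attractor's snapshots into centroids with population probabilities, then compare the two weighted centroid sets with the Hilbert-space (L2) distance. The probability-weighted pairing below is a simplified proxy for illustration and is not claimed to be the paper's exact MAO definition.

```python
# Hedged sketch of coarse-grained attractor comparison: cluster each snapshot
# ensemble, then take a probability-weighted average of centroid distances.
# This pairing rule is an assumption, not necessarily the exact MAO formula.
import numpy as np
from sklearn.cluster import KMeans

def coarse_grain(snapshots, n_clusters=10, seed=0):
    """Cluster snapshots into centroids with population probabilities.
    Requires len(snapshots) >= n_clusters."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(snapshots)
    probs = np.bincount(km.labels_, minlength=n_clusters) / len(snapshots)
    return km.cluster_centers_, probs

def ensemble_distance(A, B, n_clusters=10):
    """Probability-weighted average L2 (Hilbert-space) distance between the
    centroids of two snapshot ensembles A and B (rows = flattened snapshots)."""
    cA, pA = coarse_grain(A, n_clusters)
    cB, pB = coarse_grain(B, n_clusters)
    D = np.linalg.norm(cA[:, None, :] - cB[None, :, :], axis=-1)
    return float(pA @ D @ pB)
```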