48 research outputs found

    A Principled Approach for Learning Task Similarity in Multitask Learning

    Multitask learning aims to solve a set of related tasks simultaneously, exploiting shared knowledge to improve performance on the individual tasks. An important aspect of multitask learning is therefore understanding the similarities within a set of tasks. Previous works have incorporated this similarity information explicitly (e.g., a weighted loss for each task) or implicitly (e.g., an adversarial loss for feature adaptation) to achieve good empirical performance. However, the theoretical motivations for adding task similarity knowledge are often missing or incomplete. In this paper, we offer a theoretical perspective on this practice. We first provide an upper bound on the generalization error of multitask learning, showing the benefit of explicit and implicit task similarity knowledge. We systematically derive the bounds based on two distinct task similarity metrics: the H-divergence and the Wasserstein distance. From these theoretical results, we revisit the Adversarial Multi-task Neural Network and propose a new training algorithm that learns the task relation coefficients and the neural network parameters iteratively. We assess the new algorithm empirically on several benchmarks, showing not only that it finds interesting and robust task relations, but also that it outperforms the baselines, reaffirming the benefits of theoretical insight in algorithm design.
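As a rough illustration of the "weighted loss for each task" idea mentioned in the abstract above, the sketch below combines per-task losses using relation coefficients that sum to one. It is a generic example, not the paper's training algorithm: the function names, the coefficient values, and the loss values are all hypothetical.

```python
def normalize_to_simplex(raw_weights):
    """Rescale non-negative weights so they sum to 1 (hypothetical helper)."""
    if any(w < 0 for w in raw_weights):
        raise ValueError("weights must be non-negative")
    total = sum(raw_weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in raw_weights]


def weighted_task_loss(relation_coeffs, task_losses):
    """Loss for one target task: a coefficient-weighted sum over all task losses."""
    if len(relation_coeffs) != len(task_losses):
        raise ValueError("need one coefficient per task loss")
    return sum(a * l for a, l in zip(relation_coeffs, task_losses))


# Assumed values for illustration only: task 0 relies mostly on its own data
# and borrows a little from tasks 1 and 2.
coeffs_for_task0 = normalize_to_simplex([2.0, 1.0, 1.0])  # [0.5, 0.25, 0.25]
per_task_losses = [1.0, 2.0, 4.0]
combined = weighted_task_loss(coeffs_for_task0, per_task_losses)
print(combined)
```

In an iterative scheme of the kind the abstract describes, the coefficients and the model parameters would be updated in alternation; the update rules themselves are not given in the abstract and are not sketched here.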

    Parameter estimation for many-particle models from aggregate observations: A Wasserstein distance based sequential Monte Carlo sampler

    In this work we study systems consisting of a group of moving particles. In such systems, some important parameters are often unknown and have to be estimated from observed data. Such parameter estimation problems can often be solved within a Bayesian inference framework. However, in many practical problems only aggregate-level data is available, so the likelihood function is not available, which poses a challenge for Bayesian methods. In particular, we consider the situation where the distributions of the particles are observed. We propose a Wasserstein distance based sequential Monte Carlo sampler to solve the problem: the Wasserstein distance is used to measure the similarity between the observed and the simulated particle distributions, and the sequential Monte Carlo sampler is used to deal with the sequentially available observations. Two real-world examples are provided to demonstrate the performance of the proposed method.
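To make the role of the Wasserstein distance concrete, here is a minimal sketch of how it can compare an observed particle distribution with a simulated one. It assumes one-dimensional particle positions and equal sample sizes, in which case the 1-Wasserstein distance between the two empirical distributions is the mean absolute difference of the sorted samples. The sample values are made up, and the sequential Monte Carlo part of the method is not shown.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical samples.

    For equal-size samples with uniform weights, the optimal transport plan
    pairs the i-th smallest point of xs with the i-th smallest point of ys.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    return sum(abs(a - b) for a, b in paired) / len(xs)


# Hypothetical particle positions, for illustration only.
observed = [0.0, 1.0, 2.0, 3.0]
simulated_close = [3.1, 0.1, 2.1, 1.1]   # same shape, shifted slightly
simulated_far = [5.0, 6.0, 7.0, 8.0]     # shifted by 5

d_close = wasserstein_1d(observed, simulated_close)
d_far = wasserstein_1d(observed, simulated_far)
print(d_close, d_far)
```

A simulation whose parameters produce a smaller distance to the observed distribution matches the data more closely, which is what allows the distance to stand in for the unavailable likelihood when candidate parameters are compared.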