A Principled Approach for Learning Task Similarity in Multitask Learning
Multitask learning aims at solving a set of related tasks simultaneously, by
exploiting the shared knowledge for improving the performance on individual
tasks. Hence, an important aspect of multitask learning is to understand the
similarities within a set of tasks. Previous works have incorporated this
similarity information explicitly (e.g., a weighted loss for each task) or
implicitly (e.g., an adversarial loss for feature adaptation) to achieve good
empirical performance. However, the theoretical motivations for adding task
similarity knowledge are often missing or incomplete. In this paper, we offer a
theoretical perspective to explain this practice. We first provide an upper
bound on the generalization error of
multitask learning, showing the benefit of explicit and implicit task
similarity knowledge. We systematically derive the bounds based on two distinct
task similarity metrics: H-divergence and Wasserstein distance. From these
theoretical results, we revisit the Adversarial Multi-task Neural Network,
proposing a new training algorithm to learn the task relation coefficients and
neural network parameters iteratively. We assess our new algorithm empirically
on several benchmarks, showing not only that we find interesting and robust
task relations, but also that the proposed approach outperforms the baselines,
reaffirming the benefits of theoretical insight in algorithm design.
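The alternating scheme described above can be sketched in a few lines. This is a minimal, illustrative toy with synthetic linear tasks, not the paper's algorithm: the coefficient update rule (a softmax over negative per-task losses) and all names here are assumptions standing in for the paper's bound-derived update.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three synthetic linear regression tasks sharing (approximately) one weight vector.
X = [rng.normal(size=(50, 4)) for _ in range(3)]
true_w = rng.normal(size=4)
y = [x @ (true_w + 0.1 * t * rng.normal(size=4)) for t, x in enumerate(X)]

w = np.zeros(4)            # shared model parameters
alpha = np.ones(3) / 3     # task relation coefficients on the simplex

for _ in range(200):
    # (a) fix alpha, take a gradient step on the alpha-weighted multitask loss
    grad = sum(a * x.T @ (x @ w - yi) / len(yi)
               for a, x, yi in zip(alpha, X, y))
    w -= 0.05 * grad
    # (b) fix w, re-estimate coefficients (here: softmax of negative task losses)
    losses = np.array([np.mean((x @ w - yi) ** 2) for x, yi in zip(X, y)])
    alpha = np.exp(-losses)
    alpha /= alpha.sum()

print(np.round(alpha, 3))
```

In this toy, tasks that the shared parameters fit well receive larger coefficients; the paper instead derives the coefficient update from its divergence-based generalization bound.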
Parameter estimation for many-particle models from aggregate observations: A Wasserstein distance based sequential Monte Carlo sampler
In this work, we study systems consisting of a group of moving particles. In
such systems, important parameters are often unknown and must be estimated
from observed data. Such parameter estimation problems can often be solved
within a Bayesian inference framework. However, in many practical problems
only aggregate-level data are available, so the likelihood function cannot be
evaluated, which poses a challenge for Bayesian methods. In
particular, we consider the situation where the distributions of the particles
are observed. We propose a Wasserstein distance based sequential Monte Carlo
sampler to solve the problem: the Wasserstein distance is used to measure the
similarity between the observed and simulated particle distributions, and a
sequential Monte Carlo sampler handles the sequentially arriving observations.
Two real-world examples demonstrate the performance of the proposed method.
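The core idea, using a Wasserstein distance between observed and simulated particle distributions in place of a likelihood, can be illustrated with a much simpler stand-in than the paper's SMC sampler: plain rejection sampling. The particle model (Gaussian positions), prior, and acceptance threshold below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate(theta, n=500):
    """Hypothetical many-particle model: particle positions ~ N(theta, 1)."""
    return rng.normal(theta, 1.0, size=n)

def w1(a, b):
    """Exact 1-Wasserstein distance between equal-size 1D empirical samples."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

# Aggregate observation: only the particle distribution is seen, not trajectories.
observed = simulate(theta=2.0)

# Likelihood-free rejection step: keep prior draws whose simulated particle
# distribution lies within a W1 tolerance of the observed one.
prior_draws = rng.uniform(-5, 5, size=2000)
accepted = [t for t in prior_draws if w1(simulate(t), observed) < 0.2]

print(len(accepted), np.round(np.mean(accepted), 2))
```

The accepted draws concentrate around the data-generating parameter. The paper replaces this single rejection step with a sequential Monte Carlo sampler so that observations arriving over time can be assimilated in stages.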