
    Knowledge Transfer with Jacobian Matching

    Classical distillation methods transfer representations from a "teacher" neural network to a "student" network by matching their output activations. Recent methods also match the Jacobians, that is, the gradients of the output activations with respect to the input. However, this involves making some ad hoc decisions, in particular the choice of the loss function. In this paper, we first establish an equivalence between Jacobian matching and distillation with input noise, from which we derive appropriate loss functions for Jacobian matching. We then rely on this analysis to apply Jacobian matching to transfer learning by establishing the equivalence of a recent transfer learning procedure to distillation. Finally, we show experimentally on standard image datasets that Jacobian-based penalties improve distillation, robustness to noisy inputs, and transfer learning.
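
The equivalence between Jacobian matching and distillation with input noise can be illustrated on a toy case. The sketch below is not the paper's implementation: it assumes linear teacher and student maps (hypothetical weight matrices `teacher` and `student`), a squared-error loss, and isotropic Gaussian input noise with standard deviation `sigma`. Under those assumptions the expected noisy distillation loss equals the output-matching term plus `sigma**2` times the squared Frobenius norm of the Jacobian difference. For nonlinear networks the same relation would hold only as a first-order approximation for small noise.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def sq_norm(vector):
    """Squared Euclidean norm."""
    return sum(a * a for a in vector)


def noisy_distillation_loss(teacher, student, x, sigma, n_samples, rng):
    """Monte Carlo estimate of E ||teacher(x + xi) - student(x + xi)||^2."""
    total = 0.0
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        t_out = matvec(teacher, noisy)
        s_out = matvec(student, noisy)
        total += sq_norm([t - s for t, s in zip(t_out, s_out)])
    return total / n_samples


def jacobian_matching_loss(teacher, student, x, sigma):
    """Output-matching term plus sigma^2 times the Jacobian-matching term."""
    t_out = matvec(teacher, x)
    s_out = matvec(student, x)
    output_term = sq_norm([t - s for t, s in zip(t_out, s_out)])
    # For a linear map the Jacobian is the weight matrix itself.
    jacobian_term = sum(
        (a - b) ** 2
        for t_row, s_row in zip(teacher, student)
        for a, b in zip(t_row, s_row)
    )
    return output_term + sigma ** 2 * jacobian_term


# Hypothetical toy weights and input, chosen only for illustration.
teacher = [[1.0, -2.0, 0.5], [0.3, 0.8, -1.2]]
student = [[0.9, -1.5, 0.2], [0.1, 1.0, -1.0]]
x = [0.5, -1.0, 2.0]
sigma = 0.5

rng = random.Random(0)
mc_estimate = noisy_distillation_loss(teacher, student, x, sigma, 20000, rng)
closed_form = jacobian_matching_loss(teacher, student, x, sigma)
output_only = jacobian_matching_loss(teacher, student, x, 0.0)

print("noisy distillation (Monte Carlo):", mc_estimate)
print("output + Jacobian penalty:       ", closed_form)
print("output matching only:            ", output_only)
```

The Monte Carlo estimate should land close to the closed-form value and noticeably above the output-only term, which is the gap the Jacobian penalty accounts for in this toy setting.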

    Evolutionary dynamics in heterogeneous populations: a general framework for an arbitrary type distribution

    A general framework of evolutionary dynamics in heterogeneous populations is presented. The framework allows for continuously many types of heterogeneous agents, for heterogeneity both in payoff functions and in revision protocols, and for the entire joint distribution of strategies and types to influence the payoffs of agents. We clarify regularity conditions for the existence of a unique solution trajectory and for the existence of equilibrium. We confirm that equilibrium stationarity in general and equilibrium stability in potential games extend from the homogeneous setting to the heterogeneous setting. In particular, a wide class of admissible dynamics share the same set of locally stable equilibria in a potential game through local maximization of the potential.

    Consistency of vanishing smooth fictitious play

    We discuss the consistency of Vanishing Smooth Fictitious Play, a game-theoretic strategy that can be regarded as a smooth fictitious play procedure in which the smoothing parameter is time-dependent and vanishes asymptotically. This answers a question initially raised by Drew Fudenberg and Satoru Takahashi. Comment: 17 pages