
    Learning Vector Quantization: Generalization ability and dynamics of competing prototypes

    Learning Vector Quantization (LVQ) algorithms are popular multi-class classifiers. Prototypes in an LVQ system represent the typical features of the classes in the data. Frequently, multiple prototypes are employed per class to better represent within-class variation and to improve generalization ability. In this paper, we investigate the dynamics of LVQ in an exact mathematical way, aiming to understand the influence of the number of prototypes and their assignment to classes. The theory of on-line learning allows a mathematical description of the learning dynamics in model situations. Using a system of three prototypes, we demonstrate the different behaviors of LVQ systems with multiple-prototype and single-prototype class representations.
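
    For intuition, here is a minimal sketch of the classical LVQ1 update rule (a generic textbook form, not the paper's exact model; the names and learning rate are illustrative). In a three-prototype system of the kind studied here, one class could hold two prototypes and the other class one.

    ```python
    import numpy as np

    def lvq1_step(W, c, x, y, lr=0.05):
        """One LVQ1 step. W: (P, d) prototype positions; c: (P,) prototype
        class labels; x: (d,) training input; y: its class label."""
        w = np.argmin(np.linalg.norm(W - x, axis=1))  # winner-takes-all
        sign = 1.0 if c[w] == y else -1.0             # match: attract; else: repel
        W[w] += sign * lr * (x - W[w])
        return W

    # Example: three prototypes in 2-D, two assigned to class 0, one to class 1.
    W = np.zeros((3, 2))
    c = np.array([0, 0, 1])
    W = lvq1_step(W, c, x=np.array([1.0, -0.5]), y=1)
    ```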

    Multi-Parametric Extremum Seeking-based Auto-Tuning for Robust Input-Output Linearization Control

    In this paper, we study the problem of iterative feedback-gain tuning for a class of nonlinear systems. We consider input-output linearizable nonlinear systems with additive uncertainties. We first design a nominal input-output linearization-based controller that ensures global uniform boundedness of the output tracking error dynamics. We then complement this robust controller with a model-free multi-parametric extremum seeking (MES) scheme that iteratively auto-tunes the feedback gains. We analyze the stability of the complete controller, i.e. the robust nonlinear controller plus the model-free learning algorithm, and use numerical tests on a mechatronics example to demonstrate the performance of this method.

    Comment: To appear at the IEEE CDC 201
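
    To make the learning loop concrete, below is a toy single-parameter sinusoidal extremum-seeking iteration (the gains, probing signal, and quadratic cost are all hypothetical stand-ins; the paper tunes several feedback gains of a concrete closed-loop system).

    ```python
    import numpy as np

    def es_tune_gain(cost, k0=1.0, a=0.2, omega=5.0, ell=2.0, dt=0.01, T=200.0):
        """Sinusoidal extremum seeking on one gain: perturb, evaluate the
        closed-loop cost, demodulate, and descend the gradient estimate."""
        k_hat, t = k0, 0.0
        while t < T:
            k = k_hat + a * np.sin(omega * t)              # probing perturbation
            J = cost(k)                                    # measured tracking cost
            k_hat -= ell * a * np.sin(omega * t) * J * dt  # demodulated descent
            t += dt
        return k_hat

    # Hypothetical cost: tracking error as a function of the gain, minimized at k = 3.
    print(es_tune_gain(lambda k: (k - 3.0) ** 2))          # ~3.0 up to ripple
    ```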

    The Stabilisation of Equilibria in Evolutionary Game Dynamics through Mutation: Mutation Limits in Evolutionary Games

    The multi-population replicator dynamics is a dynamic approach to coevolving populations and multi-player games and is related to Cross learning. In general, not every equilibrium is a Nash equilibrium of the underlying game, and convergence is not guaranteed. In particular, no interior equilibrium can be asymptotically stable in the multi-population replicator dynamics, resulting, e.g., in cyclic orbits around a single interior Nash equilibrium. We introduce a new notion of equilibrium for replicator dynamics, called mutation limits, based on a naturally arising, simple form of mutation that is invariant under the specific choice of mutation parameters. We prove the existence of mutation limits for a large class of games and consider a particularly interesting subclass, attracting mutation limits. Attracting mutation limits are approximated in every (mutation-)perturbed replicator dynamics; hence they offer an approximate dynamic solution to the underlying game even if the original dynamics are not convergent. Thus, mutation stabilizes the system in certain cases and makes attracting mutation limits approximately attainable, so attracting mutation limits are relevant as a dynamic solution concept for games. We observe that they bear some similarity to Q-learning in multi-agent reinforcement learning. However, attracting mutation limits do not exist in all games, which raises the question of their characterization.
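
    For reference, one standard way to add uniform mutation at rate ε to single-population replicator dynamics is the replicator-mutator flow (a generic textbook form; the paper works with a multi-population variant and its own mutation parametrization):

    \dot{x}_i = x_i\big[(Ax)_i - x^\top A x\big] + \varepsilon\Big(\frac{1}{n} - x_i\Big), \qquad i = 1, \dots, n,

    where x is the population state over n strategies and A the payoff matrix. The mutation term pushes the state toward the interior of the simplex, and mutation limits concern the behavior of such perturbed flows as ε → 0.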

    Optimal Job Design and Career Dynamics in the Presence of Uncertainty

    The paper studies a learning model in which information about a worker's ability can be acquired symmetrically by the worker and a firm in any period by observing the worker's performance on a given task. Productivity at different tasks is assumed to be differentially sensitive to a worker's intrinsic talent: potentially more profitable tasks entail the risk of greater output destruction if the worker assigned to them lacks the required ability. We characterize the (essentially unique) optimal retention, task-assignment and promotion policy for the class of sequential equilibria of this game by showing that the equilibria of interest are strategically equivalent to the solution of an experimentation problem (a discounted multi-armed bandit with independent and dependent arms). These equilibria are all ex ante efficient but involve ex post inefficient task allocation and separation. While the ex post inefficiency of separations persists even as the time horizon becomes arbitrarily large, in the limit task assignment is efficient. When ability consists of multiple skills, low-performing promoted workers are fired rather than demoted if outcomes at lower-level tasks, compared to those at higher-level tasks, provide a sufficiently accurate measure of ability. We then examine the strategic effects of the dynamics of learning on a worker's career profile. We prove, in particular, that price competition among firms causes ex ante inefficient turnover and task assignment, independently of the degree of transferability of human capital. In a class of equilibria of interest, it generates wage dynamics consistent with properties observed in the data.

    Keywords: Learning, Job Assignment, Experimentation, Correlated Multi-armed Bandit
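
    As a toy illustration of the experimentation problem behind these equilibria (all numbers hypothetical, and the policy below is myopic rather than the optimal Gittins-style rule): a "safe" task with known output competes with a "risky" task whose output depends on unknown ability, and assignment to the risky task generates noisy performance signals.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    p = 0.7                            # prior P(worker is high-ability)
    safe_output = 1.0                  # known per-period output of the safe task
    risky = {True: 2.0, False: -0.5}   # risky-task output by ability
    acc = 0.8                          # P(performance signal matches ability)

    def posterior(p, good_signal):
        """Bayes update of P(high ability) from one noisy signal."""
        lh = acc if good_signal else 1 - acc      # likelihood if high-ability
        ll = 1 - acc if good_signal else acc      # likelihood if low-ability
        return p * lh / (p * lh + (1 - p) * ll)

    high = rng.random() < p            # draw the worker's true ability
    for t in range(20):
        if p * risky[True] + (1 - p) * risky[False] <= safe_output:
            print(f"t={t}: reassigned to the safe task at belief {p:.2f}")
            break
        good = rng.random() < (acc if high else 1 - acc)
        p = posterior(p, good)
    else:
        print(f"t=20: still on the risky task at belief {p:.2f}")
    ```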

    On the robustness of learning in games with stochastically perturbed payoff observations

    Motivated by the scarcity of accurate payoff feedback in practical applications of game theory, we examine a class of learning dynamics where players adjust their choices based on past payoff observations that are subject to noise and random disturbances. First, in the single-player case (corresponding to an agent trying to adapt to an arbitrarily changing environment), we show that the stochastic dynamics under study lead to no regret almost surely, irrespective of the noise level in the player's observations. In the multi-player case, we find that dominated strategies become extinct, and we show that strict Nash equilibria are stochastically stable and attracting; conversely, if a state is stable or attracting with positive probability, then it is a Nash equilibrium. Finally, we provide an averaging principle for 2-player games, and we show that in zero-sum games with an interior equilibrium, time averages converge to Nash equilibrium for any noise level.

    Comment: 36 pages, 4 figures
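
    A toy version of the multi-player case (not the paper's exact dynamics): two players run exponential weights on Matching Pennies while observing payoff vectors corrupted by Gaussian noise. In this zero-sum game with an interior equilibrium, the time-averaged play should approach the Nash equilibrium (1/2, 1/2).

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    A = np.array([[1.0, -1.0], [-1.0, 1.0]])  # row player's payoff matrix
    eta, sigma, T = 0.05, 0.5, 20000          # step size, noise level, horizon
    y1, y2 = np.zeros(2), np.zeros(2)         # cumulative (noisy) payoff scores
    avg1 = np.zeros(2)                        # row player's time-averaged play

    for t in range(1, T + 1):
        x1 = np.exp(y1 - y1.max()); x1 /= x1.sum()       # logit choice map
        x2 = np.exp(y2 - y2.max()); x2 /= x2.sum()
        v1 = A @ x2 + sigma * rng.standard_normal(2)     # noisy payoff vectors
        v2 = -A.T @ x1 + sigma * rng.standard_normal(2)
        y1 += eta * v1
        y2 += eta * v2
        avg1 += (x1 - avg1) / t                          # running mean of play

    print("row player's time-averaged strategy:", np.round(avg1, 3))  # ~[0.5 0.5]
    ```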

    A Unifying Perspective on Multi-Calibration: Game Dynamics for Multi-Objective Learning

    We provide a unifying framework for the design and analysis of multicalibrated predictors. By placing the multicalibration problem in the general setting of multi-objective learning -- where learning guarantees must hold simultaneously over a set of distributions and loss functions -- we exploit connections to game dynamics to achieve state-of-the-art guarantees for a diverse set of multicalibration learning problems. In addition to shedding light on existing multicalibration guarantees and greatly simplifying their analysis, our approach also yields improved guarantees, such as obtaining stronger multicalibration conditions that scale with the square root of group size and improving the complexity of k-class multicalibration by an exponential factor of k. Beyond multicalibration, we use these game dynamics to address emerging considerations in the study of group fairness and multi-distribution learning.

    Comment: 45 pages. Authors are ordered alphabetically
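
    To illustrate the flavor of the learner-auditor game dynamics (a sketch of the general audit-and-patch idea, not the paper's algorithm; strictly speaking it enforces multi-accuracy, i.e. group-mean consistency, whereas full multicalibration also conditions on prediction level sets):

    ```python
    import numpy as np

    def patch_until_calibrated(p, y, groups, alpha=0.01, max_iter=1000):
        """p: predictions in [0, 1]; y: binary labels; groups: boolean masks
        over the data (possibly overlapping). The auditor finds the group
        with the worst mean violation; the predictor patches it."""
        p = p.copy()
        for _ in range(max_iter):
            gaps = [abs(y[g].mean() - p[g].mean()) for g in groups]
            worst = int(np.argmax(gaps))
            if gaps[worst] <= alpha:      # auditor has no winning deviation
                break
            g = groups[worst]
            # Shift the group's predictions toward its empirical label mean.
            p[g] = np.clip(p[g] + (y[g].mean() - p[g].mean()), 0.0, 1.0)
        return p
    ```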

    Decentralized Multi-Agent Reinforcement Learning for Continuous-Space Stochastic Games

    Stochastic games are a popular framework for studying multi-agent reinforcement learning (MARL). Recent advances in MARL have focused primarily on games with finitely many states. In this work, we study multi-agent learning in stochastic games with general state spaces and an information structure in which agents do not observe each other's actions. In this context, we propose a decentralized MARL algorithm and prove the near-optimality of its policy updates. Furthermore, we study the global policy-updating dynamics for a general class of best-reply-based algorithms and derive a closed-form characterization of the convergence probabilities over the joint policy space.
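
    As a finite toy version of this information structure (hypothetical dynamics, and a finite state space for illustration even though the paper treats general state spaces): each agent runs Q-learning over its own actions only and never observes the other agent's action, so the environment appears non-stationary from each agent's perspective.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    nS, nA = 4, 2
    # Hypothetical shared transition kernel and per-agent rewards.
    P = rng.dirichlet(np.ones(nS), size=(nS, nA, nA))  # P[s, a1, a2] -> dist over s'
    R = rng.uniform(-1, 1, size=(2, nS, nA, nA))       # R[i, s, a1, a2]

    Q = np.zeros((2, nS, nA))        # each agent's Q-table over its OWN actions
    alpha, gamma, eps = 0.1, 0.9, 0.1
    s = 0
    for _ in range(50000):
        # Epsilon-greedy choice by each agent, independently.
        a = [int(rng.integers(nA)) if rng.random() < eps
             else int(np.argmax(Q[i, s])) for i in range(2)]
        s_next = rng.choice(nS, p=P[s, a[0], a[1]])
        for i in range(2):
            target = R[i, s, a[0], a[1]] + gamma * Q[i, s_next].max()
            Q[i, s, a[i]] += alpha * (target - Q[i, s, a[i]])
        s = s_next
    ```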