533 research outputs found

    Cooperative Online Learning: Keeping your Neighbors Updated

    We study an asynchronous online learning setting with a network of agents. At each time step, some of the agents are activated, requested to make a prediction, and pay the corresponding loss. The loss function is then revealed to these agents and also to their neighbors in the network. Our results characterize how much knowing the network structure affects the regret as a function of the model of agent activations. When activations are stochastic, the optimal regret (up to constant factors) is shown to be of order $\sqrt{\alpha T}$, where $T$ is the horizon and $\alpha$ is the independence number of the network. We prove that the upper bound is achieved even when agents have no information about the network structure. When activations are adversarial the situation changes dramatically: if agents ignore the network structure, an $\Omega(T)$ lower bound on the regret can be proven, showing that learning is impossible. However, when agents can choose to ignore some of their neighbors based on the knowledge of the network structure, we prove an $O(\sqrt{\overline{\chi} T})$ sublinear regret bound, where $\overline{\chi} \ge \alpha$ is the clique-covering number of the network.
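The two graph parameters in this abstract, the independence number $\alpha$ and the clique-covering number $\overline{\chi}$, can be computed by brute force on small graphs. The sketch below is illustrative only and is not taken from the paper; the function names and the 5-cycle example are assumptions chosen to show a case where $\overline{\chi}$ is strictly larger than $\alpha$.

```python
from itertools import combinations


def independence_number(n, edges):
    """Size of the largest set of pairwise non-adjacent nodes (brute force)."""
    adj = {frozenset(e) for e in edges}
    for k in range(n, 0, -1):
        for subset in combinations(range(n), k):
            if all(frozenset(p) not in adj for p in combinations(subset, 2)):
                return k
    return 0


def clique_cover_number(n, edges):
    """Minimum number of cliques that partition the nodes (brute force)."""
    adj = {frozenset(e) for e in edges}
    best = n  # one clique per node always works

    def place(node, cliques):
        nonlocal best
        if node == n:
            best = min(best, len(cliques))
            return
        if len(cliques) >= best:
            return  # this branch cannot improve on the best cover found
        # Try adding the node to each existing clique it is fully adjacent to.
        for clique in cliques:
            if all(frozenset((node, u)) in adj for u in clique):
                clique.append(node)
                place(node + 1, cliques)
                clique.pop()
        # Or open a new clique for it.
        cliques.append([node])
        place(node + 1, cliques)
        cliques.pop()

    place(0, [])
    return best


# Hypothetical example network: a cycle on 5 agents.
cycle5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
alpha = independence_number(5, cycle5)
chi_bar = clique_cover_number(5, cycle5)
```

On the 5-cycle, `alpha` is 2 and `chi_bar` is 3, so the adversarial-activation bound $O(\sqrt{\overline{\chi} T})$ is governed by a larger quantity than the stochastic-activation rate $\sqrt{\alpha T}$.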

    Online Game with Time-Varying Coupled Inequality Constraints

    In this paper, an online game is studied in which, at each time step, a group of players selfishly and simultaneously minimize their own time-varying cost functions subject to time-varying coupled constraints and local feasible set constraints. Only local cost functions and local constraints are available to individual players, who can share limited information with their neighbors through a fixed and connected graph. In addition, players have no prior knowledge of future cost functions or future local constraint functions. In this setting, a novel decentralized online learning algorithm is devised based on mirror descent and a primal-dual strategy. With appropriately chosen decaying stepsizes, the proposed algorithm achieves sublinearly bounded regret and constraint violation. Furthermore, it is shown that the sequence of play generated by the algorithm converges to the variational generalized Nash equilibrium (GNE) of a strongly monotone game, to which the online game converges. Additionally, a payoff-based case, i.e., a bandit feedback setting, is considered, and a new payoff-based learning policy is devised that achieves sublinear regret and constraint violation. Finally, the theoretical results are corroborated by numerical simulations. Comment: arXiv admin note: text overlap with arXiv:2105.0620
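As a rough illustration of the mirror descent ingredient with a decaying stepsize, the sketch below runs single-player online mirror descent with the entropic mirror map on the probability simplex (the exponentiated-gradient update). It is not the paper's decentralized primal-dual algorithm: the coupled constraints, dual variables, and neighbor communication are all omitted, and the function name, the linear losses, and the stepsize $1/\sqrt{t}$ are assumptions made for the example.

```python
import math


def omd_entropic(loss_vectors):
    """Online mirror descent on the simplex with linear losses.

    loss_vectors: list of per-round loss vectors, revealed after each play.
    Returns the final iterate and the cumulative loss incurred.
    """
    d = len(loss_vectors[0])
    x = [1.0 / d] * d  # start from the uniform distribution
    total_loss = 0.0
    for t, g in enumerate(loss_vectors, start=1):
        # Play x, then observe the loss vector g for this round.
        total_loss += sum(xi * gi for xi, gi in zip(x, g))
        eta = 1.0 / math.sqrt(t)  # decaying stepsize (an assumed schedule)
        # Entropic mirror step: multiplicative update followed by normalization.
        w = [xi * math.exp(-eta * gi) for xi, gi in zip(x, g)]
        z = sum(w)
        x = [wi / z for wi in w]
    return x, total_loss


# Hypothetical loss sequence: action 0 always costs 0, action 1 always costs 1.
losses = [[0.0, 1.0]] * 200
final_x, incurred = omd_entropic(losses)
regret = incurred - 0.0  # the best fixed action (action 0) has zero total loss
```

In this toy run the iterate concentrates on the zero-loss action and the regret stays far below the horizon of 200 rounds, which is the sublinear behavior the abstract refers to.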

    Distributed Online Optimization with Coupled Inequality Constraints over Unbalanced Directed Networks

    This paper studies a distributed online convex optimization problem, where agents in an unbalanced network cooperatively minimize the sum of their time-varying local cost functions subject to a coupled inequality constraint. To solve this problem, we propose a distributed dual subgradient tracking algorithm, called DUST, which attempts to optimize a dual objective by tracking the primal constraint violations and integrating dual subgradient and push-sum techniques. Unlike most existing works, we allow the underlying network to be unbalanced with a column-stochastic mixing matrix. We show that DUST achieves sublinear dynamic regret and constraint violations, provided that the accumulated variation of the optimal sequence grows sublinearly. If the standard Slater's condition is additionally imposed, DUST acquires a smaller constraint violation bound than the alternative existing methods applicable to unbalanced networks. Simulations on a plug-in electric vehicle charging problem demonstrate the superior convergence of DUST.
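The push-sum technique mentioned in this abstract lets agents on an unbalanced directed graph agree on the network-wide average using only a column-stochastic mixing matrix. The sketch below shows plain push-sum averaging, not DUST itself: the dual subgradient and constraint-tracking steps are omitted, and the function name, the three-node graph, and the equal-split weights are assumptions for illustration.

```python
def push_sum(values, out_neighbors, rounds):
    """Push-sum averaging over a directed graph.

    values: initial scalar held by each node.
    out_neighbors: out_neighbors[j] lists the nodes that j sends to.
    Each node splits its mass equally among its out-neighbors and itself,
    which makes the implied mixing matrix column stochastic.
    """
    n = len(values)
    x = list(values)  # running numerators
    w = [1.0] * n     # running weights that correct for the imbalance
    for _ in range(rounds):
        new_x = [0.0] * n
        new_w = [0.0] * n
        for j in range(n):
            targets = out_neighbors[j] + [j]  # include a self-loop
            share = 1.0 / len(targets)
            for i in targets:
                new_x[i] += share * x[j]
                new_w[i] += share * w[j]
        x, w = new_x, new_w
    # The ratio x_i / w_i is each node's estimate of the average.
    return [xi / wi for xi, wi in zip(x, w)]


# Hypothetical unbalanced digraph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
graph = [[1, 2], [2], [0]]
estimates = push_sum([1.0, 5.0, 9.0], graph, rounds=100)
```

Node 0 has two outgoing links but only one incoming link, so a plain averaging step would be biased; dividing by the weights `w` removes that bias and every estimate approaches the true average of 5.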