A Parameterisation of Algorithms for Distributed Constraint Optimisation via Potential Games
This paper introduces a parameterisation of learning algorithms for distributed constraint optimisation problems (DCOPs). This parameterisation encompasses many algorithms developed in both the computer science and game theory literatures. It is built on our insight that, when formulated as noncooperative games, DCOPs form a subset of the class of potential games. This result allows us to prove convergence properties of algorithms developed in the computer science literature using game-theoretic methods. Furthermore, our parameterisation can assist system designers by making clear the pros and cons of the various DCOP algorithm components and the synergies between them.
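The potential-game property mentioned in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy problem (2-colouring a 4-cycle), the names `EDGES`, `COLOURS`, `local_utility` and `potential`, and the choice of sequential best response as the update rule. The sketch checks that a unilateral change alters an agent's local utility by exactly the same amount as the global sum of constraint utilities, which is what makes that sum a potential function.

```python
import itertools

# Hypothetical toy DCOP: colour the nodes of a 4-cycle with two colours.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
COLOURS = [0, 1]


def constraint(a, b):
    """Binary constraint utility: 1 if the two colours differ, else 0."""
    return 0 if a == b else 1


def local_utility(i, x):
    """Agent i's payoff: the sum of the constraints it takes part in."""
    return sum(constraint(x[u], x[v]) for u, v in EDGES if i in (u, v))


def potential(x):
    """Global objective: every constraint counted exactly once."""
    return sum(constraint(x[u], x[v]) for u, v in EDGES)


def with_colour(x, i, c):
    """Copy of assignment x in which agent i plays colour c."""
    y = list(x)
    y[i] = c
    return y


def is_exact_potential():
    """Check that unilateral deviations change payoff and potential equally."""
    for x in itertools.product(COLOURS, repeat=4):
        for i in range(4):
            for c in COLOURS:
                y = with_colour(x, i, c)
                du = local_utility(i, y) - local_utility(i, x)
                dp = potential(y) - potential(x)
                if du != dp:
                    return False
    return True


def best_response_dynamics(x, max_rounds=20):
    """Agents take turns switching to a strictly better colour, if any."""
    x = list(x)
    for _ in range(max_rounds):
        changed = False
        for i in range(len(x)):
            best = max(COLOURS, key=lambda c: local_utility(i, with_colour(x, i, c)))
            if local_utility(i, with_colour(x, i, best)) > local_utility(i, x):
                x[i] = best
                changed = True
        if not changed:
            break
    return x
```

Because each strict improvement by one agent raises the potential, and the potential takes finitely many values, this sequential update must stop at a pure Nash equilibrium of the toy game.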
A Distributed Algorithm for Demand Response with Mixed-Integer Variables
This letter presents a fast distributed algorithm for aggregating a large
number of households with mixed-integer variables and intricate couplings
between devices. The proposed fast distributed gradient algorithm is applied to
the double smoothed dual function of the adopted DR model. The results also
show that, with minimal parameter adjustments, the convergence of the dual
objective exhibits the same behavior irrespective of the system size.
Comment: 2 pages, 1 figure, to be published in IEEE Transactions on Smart Grid Letter
Knapsack based Optimal Policies for Budget-Limited Multi-Armed Bandits
In budget-limited multi-armed bandit (MAB) problems, the learner's actions
are costly and constrained by a fixed budget. Consequently, an optimal
exploitation policy may not be to pull the optimal arm repeatedly, as is the
case in other variants of MAB, but rather to pull the sequence of different
arms that maximises the agent's total reward within the budget. This difference
from existing MABs means that new approaches to maximising the total reward are
required. Given this, we develop two pulling policies, namely: (i) KUBE; and
(ii) fractional KUBE. Whereas the former performs up to 40% better
in our experimental settings, the latter is computationally less expensive. We
also prove logarithmic upper bounds for the regret of both policies, and show
that these bounds are asymptotically optimal (i.e. they only differ from the
best possible regret by a constant factor).
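The claim that repeatedly pulling the single best arm need not be optimal under a budget can be seen in a small sketch. This is not KUBE or fractional KUBE: it assumes the expected rewards are already known and the costs are integers, so no learning takes place, and the function names and numbers are hypothetical. It compares pulling only the arm with the best reward-to-cost ratio against the best mix of arms, computed as an unbounded knapsack.

```python
def best_single_arm(values, costs, budget):
    """Total reward from pulling only the arm with the highest value/cost ratio."""
    i = max(range(len(values)), key=lambda k: values[k] / costs[k])
    return (budget // costs[i]) * values[i]


def best_mix(values, costs, budget):
    """Unbounded-knapsack optimum: best total reward from any multiset of pulls.

    Assumes integer costs; best[b] is the best reward achievable with budget b.
    """
    best = [0] * (budget + 1)
    for b in range(1, budget + 1):
        best[b] = best[b - 1]  # leaving one unit of budget unused is allowed
        for v, c in zip(values, costs):
            if c <= b:
                best[b] = max(best[b], best[b - c] + v)
    return best[budget]


# Hypothetical arms: expected rewards 7 and 4, pulling costs 6 and 4.
values, costs, budget = [7, 4], [6, 4], 10
single = best_single_arm(values, costs, budget)
mixed = best_mix(values, costs, budget)
```

In this example the densest arm fits only once and leaves budget unused, while one pull of each arm spends the budget exactly and earns more.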
Learn While You Earn: Two Approaches to Learning Auction Parameters in Take-it-or-leave-it Auctions
Much of the research in auction theory assumes that the auctioneer knows the distribution of participants’ valuations with complete certainty. However, this is unrealistic. Thus, we analyse cases in which the auctioneer is uncertain about the valuation distributions; specifically, we consider a repeated auction setting in which the auctioneer can learn these distributions. Using take-it-or-leave-it auctions (Sandholm and Gilpin, 2006) as an exemplar auction format, we consider two auction design criteria. Firstly, an auctioneer could maximise expected revenue each time the auction is held. Secondly, an auctioneer could maximise the information gained in earlier auctions (as measured by the Kullback-Leibler divergence between its posterior and prior) to develop good estimates of the unknowns, which are later exploited to improve the revenue earned in the long run. Simulation results comparing the two criteria indicate that setting offers to maximise revenue does not significantly detract from learning performance, but optimising offers for information gain substantially reduces expected revenue while not producing significantly better parameter estimates.
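The information-gain criterion in this abstract can be sketched as follows. The sketch assumes a discrete set of hypotheses about the valuation distribution and measures the gain as KL(posterior || prior); the direction of the divergence, the function names and the numbers are illustrative assumptions, not details from the paper.

```python
import math


def posterior(prior, likelihoods):
    """Bayes update over a discrete set of hypotheses."""
    joint = [p * l for p, l in zip(prior, likelihoods)]
    z = sum(joint)
    return [j / z for j in joint]


def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions with q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Two hypothetical hypotheses, initially equally likely. Under the first a
# bidder accepts a given offer with probability 0.8, under the second 0.2.
prior = [0.5, 0.5]
accept_likelihoods = [0.8, 0.2]

post = posterior(prior, accept_likelihoods)  # beliefs after an accepted offer
gain = kl_divergence(post, prior)            # information gained from it
```

An offer whose outcome is equally likely under every hypothesis leaves the posterior equal to the prior and so yields zero gain, which is why this criterion favours offers that discriminate between hypotheses.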
Decentralised Dynamic Task Allocation Using Overlapping Potential Games
This paper reports on a novel decentralised technique for planning agent schedules in dynamic task allocation problems. Specifically, we use a stochastic game formulation of these problems in which tasks have varying hard deadlines and processing requirements. We then introduce a new technique for approximating this game using a series of static potential games, before detailing a decentralised method for solving the approximating games that uses the distributed stochastic algorithm. Finally, we discuss an implementation of our approach to a task allocation problem in the RoboCup Rescue disaster management simulator. The results show that our technique performs comparably to a centralised task scheduler (within 6% on average), and also, unlike its centralised counterpart, it is robust to restrictions on the agents’ communication and observation ranges.
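The distributed stochastic algorithm named in this abstract can be sketched on a toy problem. The sketch assumes a common textbook variant in which, in every round, each agent best-responds to its neighbours' previous values and switches with probability `p` only if that strictly improves its local utility. The graph-colouring problem, parameter values and names are hypothetical, and the paper's task-allocation games are not modelled here.

```python
import random

# Hypothetical toy problem: colour a 4-cycle so that neighbours differ.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0)]
COLOURS = [0, 1, 2]


def neighbours(i):
    """Agents sharing a constraint with agent i."""
    return [v if u == i else u for u, v in EDGES if i in (u, v)]


def utility(i, colour, x):
    """Number of neighbours of i whose colour in x differs from `colour`."""
    return sum(1 for j in neighbours(i) if x[j] != colour)


def dsa(x, p=0.5, rounds=50, seed=0):
    """Synchronous stochastic best response (a sketch of a DSA variant)."""
    rng = random.Random(seed)
    x = list(x)
    for _ in range(rounds):
        snapshot = list(x)  # every agent reacts to the same previous state
        for i in range(len(x)):
            best = max(COLOURS, key=lambda c: utility(i, c, snapshot))
            improves = utility(i, best, snapshot) > utility(i, snapshot[i], snapshot)
            # Switch only on strict improvement, and then only with probability p.
            if improves and rng.random() < p:
                x[i] = best
    return x
```

Each agent needs only its neighbours' current values, so no central coordinator is involved. The random activation reduces the chance that neighbouring agents switch at the same time and undo each other's improvements.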
