Entropy based independent learning in anonymous multi-agent settings
Efficient sequential matching of supply and demand is a problem of interest
in many online-to-offline services: for instance, Uber, Lyft, and Grab match
taxis to customers, while Ubereats, Deliveroo, and FoodPanda match restaurants
to customers. In these online-to-offline service problems, individuals who are
responsible for supply (e.g., taxi drivers, delivery bikes or delivery van
drivers) earn more by being at the "right" place at the "right" time. We are
interested in developing approaches that learn to guide individuals to be in
the "right" place at the "right" time (to maximize revenue) in the presence of
other similar "learning" individuals and with only local aggregated observations
of other agents' states (e.g., only the number of other taxis in the same zone
as the current agent).
A key characteristic of the domains of interest is that the interactions
between individuals are anonymous, i.e., the outcome of an interaction
(competing for demand) is dependent only on the number and not on the identity
of the agents. We model these problems using the Anonymous MARL (AyMARL) model.
The key contribution of this paper is in employing the principle of maximum
entropy to provide a general framework for independent learning that is both
empirically effective (even with only local aggregated information about the
agent population distribution) and theoretically justified.
Finally, our approaches significantly improve on existing approaches with
respect to joint and individual revenue, both on a generic simulator for
online-to-offline services and on a real-world taxi problem. More
importantly, this is achieved while having the least variance in revenues
earned by the learning individuals, an indicator of fairness.
Collaboration in Social Networks
The very notion of social network implies that linked individuals interact
repeatedly with each other. This allows them not only to learn successful
strategies and adapt to them, but also to condition their own behavior on the
behavior of others, in a strategic, forward-looking manner. The theory of
repeated games shows that these circumstances are conducive to the emergence of
collaboration in simple games of two players. We investigate the extension of
this concept to the case where players are engaged in a local contribution game
and show that rationality and credibility of threats identify a class of Nash
equilibria -- that we call "collaborative equilibria" -- that have a precise
interpretation in terms of sub-graphs of the social network. For large network
games, the number of such equilibria is exponentially large in the number of
players. When incentives to defect are small, equilibria are supported by local
structures whereas when incentives exceed a threshold they acquire a non-local
nature, which requires a "critical mass" of more than a given fraction of the
players to collaborate. Therefore, when incentives are high, an individual
deviation typically causes the collapse of collaboration across the whole
system. At the same time, higher incentives to defect typically support
equilibria with a higher density of collaborators. The resulting picture
conforms with several results in sociology and in the experimental literature
on game theory, such as the prevalence of collaboration in denser groups and in
the structural hubs of sparse networks.