Decentralized Cooperative Planning for Automated Vehicles with Continuous Monte Carlo Tree Search
Urban traffic scenarios often require a high degree of cooperation between
traffic participants to ensure safety and efficiency. Observing the behavior of
others, humans infer whether or not others are cooperating. This work aims to
extend the capabilities of automated vehicles, enabling them to cooperate
implicitly in heterogeneous environments. Continuous actions allow for
arbitrary trajectories and hence are applicable to a much wider class of
problems than existing cooperative approaches with discrete action spaces.
Based on cooperative modeling of other agents, Monte Carlo Tree Search (MCTS)
in conjunction with Decoupled-UCT evaluates the action-values of each agent in
a cooperative and decentralized way, respecting the interdependence of actions
among traffic participants. The extension to continuous action spaces is
addressed by incorporating novel MCTS-specific enhancements for efficient
search space exploration. The proposed algorithm is evaluated under different
scenarios, showing that it achieves effective cooperative planning and
generates solutions that egocentric planning fails to identify.
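The decoupled selection step described above can be sketched as follows. This is an illustrative assumption, not the paper's implementation: it uses UCB1 over a small finite set of candidate actions (in a continuous action space these candidates would be sampled, e.g. via progressive widening), and the agent names and statistics are invented.

```python
import math

def decoupled_uct_select(stats, c=1.4):
    """Decoupled-UCT selection (sketch): each agent independently picks
    the action maximizing its own UCB1 score over its own statistics;
    the joint action is the tuple of per-agent choices, so the
    interdependence of actions enters through the shared simulations
    that produced the statistics, not through joint-action enumeration."""
    joint = {}
    for agent, actions in stats.items():
        total = sum(visits for visits, _ in actions.values())
        best, best_score = None, -math.inf
        for action, (visits, value_sum) in actions.items():
            if visits == 0:
                score = math.inf  # always try untested actions first
            else:
                score = value_sum / visits + c * math.sqrt(math.log(total) / visits)
            if score > best_score:
                best, best_score = action, score
        joint[agent] = best
    return joint

# Hypothetical two-agent statistics: action -> (visit count, summed value).
stats = {
    "ego":   {"accelerate": (10, 7.0), "yield": (5, 4.5)},
    "other": {"merge": (8, 6.0), "wait": (7, 3.5)},
}
joint_action = decoupled_uct_select(stats)  # {'ego': 'yield', 'other': 'merge'}
```

Because each agent maximizes its own score, selection cost grows with the sum of the agents' action sets rather than with their product.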
Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search
Today's automated vehicles lack the ability to cooperate implicitly with
others. This work presents a Monte Carlo Tree Search (MCTS) based approach for
decentralized cooperative planning using macro-actions for automated vehicles
in heterogeneous environments. Based on cooperative modeling of other agents
and Decoupled-UCT (a variant of MCTS), the algorithm evaluates the
state-action-values of each agent in a cooperative and decentralized manner,
explicitly modeling the interdependence of actions between traffic
participants. Macro-actions allow for temporal extension over multiple time
steps and increase the effective search depth, requiring fewer iterations to
plan over longer horizons. Without predefined policies for macro-actions, the
algorithm simultaneously learns policies over and within macro-actions. The
proposed method is evaluated under several conflict scenarios, showing that the
algorithm can achieve effective cooperative planning with learned macro-actions
in heterogeneous environments.
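A minimal sketch of how a temporally extended macro-action deepens the effective horizon: one tree edge executes a sequence of primitive steps and accumulates discounted reward. The 1-D state and step function are invented for illustration and do not come from the paper.

```python
def execute_macro_action(state, macro, step_fn, gamma=0.95):
    """Run one macro-action: apply its primitive steps in sequence,
    accumulating discounted reward. A single tree edge then spans
    len(macro) time steps, so the same number of MCTS iterations
    plans over a proportionally longer horizon."""
    total, discount = 0.0, 1.0
    for primitive in macro:
        state, reward = step_fn(state, primitive)
        total += discount * reward
        discount *= gamma
    return state, total

# Hypothetical 1-D example: state is a position, primitives are velocity
# steps, and reward is negative distance to a goal at position 10.
def step_fn(pos, velocity):
    new_pos = pos + velocity
    return new_pos, -abs(10.0 - new_pos)

state, ret = execute_macro_action(0.0, [2.0, 2.0, 2.0], step_fn)
```

With this sketch, learning a policy *within* a macro-action would amount to choosing the primitive sequence itself during search rather than fixing it in advance.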
Concurrent bandits and cognitive radio networks
We consider the problem of multiple users targeting the arms of a single
multi-armed stochastic bandit. The motivation for this problem comes from
cognitive radio networks, where selfish users need to coexist without any side
communication between them, implicit cooperation or common control. Even the
number of users may be unknown and can vary as users join or leave the network.
We propose an algorithm that combines an ε-greedy learning rule with a
collision avoidance mechanism. We analyze its regret with respect to the
system-wide optimum and show that sub-linear regret can be obtained in this
setting. Experiments show dramatic improvement compared to other algorithms for
this setting
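One decision round of an ε-greedy rule combined with a collision-avoidance mechanism might be sketched as below; the random-backoff rule, the function name, and the data layout are illustrative assumptions, not the authors' exact algorithm.

```python
import random

def eps_greedy_with_backoff(user_estimates, collided_last_round, eps=0.1, rng=random):
    """One round for decentralized users sharing the arms of a single
    stochastic bandit. Each user normally exploits its own empirical
    best arm (or explores with probability eps); a user that collided
    in the previous round instead jumps to a uniformly random arm, so
    colliding users separate without any side communication."""
    n_arms = len(next(iter(user_estimates.values())))
    choices = {}
    for user, estimates in user_estimates.items():
        if collided_last_round.get(user) or rng.random() < eps:
            choices[user] = rng.randrange(n_arms)  # back off / explore
        else:
            choices[user] = max(range(n_arms), key=estimates.__getitem__)
    return choices

# With eps=0 and no collisions, each user exploits its own best estimate.
choices = eps_greedy_with_backoff(
    {"u1": [0.2, 0.9, 0.5], "u2": [0.7, 0.1, 0.3]},
    collided_last_round={}, eps=0.0,
)
```

Since users observe only their own rewards and collisions, this kind of rule needs no knowledge of how many users are active, which matches the varying-population setting described above.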