It is expected that autonomous vehicles (AVs) and heterogeneous human-driven
vehicles (HVs) will coexist on the same road. The safety and reliability of AVs
will depend on their social awareness and their ability to engage in complex
social interactions in a socially accepted manner. However, AVs still
cooperate inefficiently with HVs and struggle to understand and adapt to human
behavior, which is particularly challenging in mixed autonomy.
On a road shared by AVs and HVs, the social preferences and individual traits
of HVs are unknown to the AVs. Unlike AVs, which are expected to follow a
policy, HVs are particularly difficult to forecast since they do not
necessarily follow a stationary policy. To address these challenges, we frame
the mixed-autonomy problem as a multi-agent reinforcement learning (MARL)
problem and propose an approach that allows AVs to learn the decision-making of
HVs implicitly from experience, account for all vehicles' interests, and safely
adapt to other traffic situations. In contrast with existing works, we quantify
AVs' social preferences and propose a distributed reward structure that
introduces altruism into their decision-making process, allowing the altruistic
AVs to learn to establish coalitions and influence the behavior of HVs.

Comment: arXiv admin note: substantial text overlap with arXiv:2202.0088