Using Monte Carlo Search With Data Aggregation to Improve Robot Soccer Policies
RoboCup soccer competitions are considered among the most challenging
multi-robot adversarial environments, due to their high dynamism and the
partial observability of the environment. In this paper we introduce a method
based on a combination of Monte Carlo search and data aggregation (MCSDA) to
adapt discrete-action soccer policies for a defender robot to the strategy of
the opponent team. By exploiting a simple representation of the domain, a
supervised learning algorithm is trained over an initial collection of data
consisting of several simulations of human expert policies. Monte Carlo policy
rollouts are then generated and aggregated with previous data to improve the
learned policy over multiple epochs and games. The proposed approach has been
extensively tested both on a soccer-dedicated simulator and on real robots.
Using this method, our learning robot soccer team achieves an improvement in
ball interceptions, as well as a reduction in the number of opponents' goals.
Alongside better performance, the whole team achieves more efficient overall
positioning within the field.
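The loop described in this abstract (train on expert data, generate Monte Carlo rollouts, aggregate, retrain) can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the 1-D field, the `step` dynamics, the reward, the majority-vote learner, and every function name and parameter are assumptions made for the example.

```python
import random
from collections import Counter, defaultdict

ACTIONS = (-1, 0, 1)   # hypothetical discrete actions: left, stay, right
FIELD = 10             # hypothetical 1-D field with positions 0..FIELD


def step(state, action, rng):
    """Toy dynamics (assumed): the defender moves, the ball drifts randomly."""
    defender, ball = state
    defender = min(FIELD, max(0, defender + action))
    ball = min(FIELD, max(0, ball + rng.choice((-1, 0, 1))))
    reward = -abs(defender - ball)   # closer to the ball is better
    return (defender, ball), reward


def train(dataset):
    """Supervised step: a lookup-table classifier using majority vote per state."""
    votes = defaultdict(Counter)
    for state, action in dataset:
        votes[state][action] += 1
    table = {s: c.most_common(1)[0][0] for s, c in votes.items()}
    return lambda state: table.get(state, 0)   # unseen states default to "stay"


def rollout_value(state, first_action, policy, rng, horizon=5):
    """Return of one rollout: take first_action, then follow the current policy."""
    state, total = step(state, first_action, rng)
    for _ in range(horizon - 1):
        state, reward = step(state, policy(state), rng)
        total += reward
    return total


def mc_best_action(state, policy, rng, n_rollouts=20):
    """Pick the action with the highest mean Monte Carlo return."""
    def mean_return(action):
        returns = [rollout_value(state, action, policy, rng)
                   for _ in range(n_rollouts)]
        return sum(returns) / n_rollouts
    return max(ACTIONS, key=mean_return)


def mcsda_sketch(expert_data, epochs=3, episode_len=20, seed=0):
    """Aggregate rollout-improved labels into the dataset and retrain each epoch."""
    rng = random.Random(seed)
    dataset = list(expert_data)
    policy = train(dataset)
    for _ in range(epochs):
        state = (rng.randint(0, FIELD), rng.randint(0, FIELD))
        for _ in range(episode_len):
            # Label the visited state with the rollout-preferred action.
            dataset.append((state, mc_best_action(state, policy, rng)))
            state, _reward = step(state, policy(state), rng)
        policy = train(dataset)   # retrain on the aggregated data
    return policy, dataset


# Hypothetical "expert" demonstrations: always move toward the ball.
expert_data = [((d, b), (b > d) - (b < d))
               for d in range(FIELD + 1) for b in range(0, FIELD + 1, 5)]
policy, dataset = mcsda_sketch(expert_data)
```

States are visited by following the current learned policy, while their labels come from the rollouts, so the aggregated dataset grows by `episode_len` examples per epoch in this sketch.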
Learning and Reasoning for Robot Sequential Decision Making under Uncertainty
Robots frequently face complex tasks that require more than one action, where
sequential decision-making (SDM) capabilities become necessary. The key
contribution of this work is a robot SDM framework, called LCORPP, that
supports the simultaneous capabilities of supervised learning for passive state
estimation, automated reasoning with declarative human knowledge, and planning
under uncertainty toward achieving long-term goals. In particular, we use a
hybrid reasoning paradigm to refine the state estimator, and provide
informative priors for the probabilistic planner. In experiments, a mobile
robot is tasked with estimating human intentions using their motion
trajectories, declarative contextual knowledge, and human-robot interaction
(dialog-based and motion-based). Results suggest that, in efficiency and
accuracy, our framework performs better than its no-learning and no-reasoning
counterparts in an office environment.
Comment: In Proceedings of the 34th AAAI Conference on Artificial Intelligence,
202
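One way to picture a learned state estimate serving as an informative prior is a single Bayesian belief update, sketched below. This is an illustrative assumption, not the LCORPP implementation: the state names, the probabilities, and the `bayes_update` helper are all hypothetical.

```python
def bayes_update(prior, likelihood):
    """Posterior over states, given a prior and P(observation | state)."""
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero probability under the prior")
    return {s: p / total for s, p in unnormalized.items()}


# Hypothetical prior, as if produced by a learned classifier over trajectories.
learned_prior = {"interested": 0.7, "not_interested": 0.3}

# Hypothetical observation model: P(human answers "yes" | state).
p_yes_given_state = {"interested": 0.8, "not_interested": 0.3}

# Belief after the human answers "yes" to a dialog question.
posterior = bayes_update(learned_prior, p_yes_given_state)
```

With these assumed numbers, an affirmative answer raises the belief in "interested" above the prior value of 0.7.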