
    Exploiting Domain Knowledge in Making Delegation Decisions

    @inproceedings{conf/admi/EmeleNSP11,
      added-at  = {2011-12-19T00:00:00.000+0100},
      author    = {Emele, Chukwuemeka David and Norman, Timothy J. and Sensoy, Murat and Parsons, Simon},
      biburl    = {http://www.bibsonomy.org/bibtex/20a08b683088443f1fd36d6ef28bf6615/dblp},
      booktitle = {ADMI},
      crossref  = {conf/admi/2011},
      editor    = {Cao, Longbing and Bazzan, Ana L. C. and Symeonidis, Andreas L. and Gorodetsky, Vladimir and Weiss, Gerhard and Yu, Philip S.},
      ee        = {http://dx.doi.org/10.1007/978-3-642-27609-5_9},
      interhash = {1d7e7f8554e8bdb3d43c32e02aeabcec},
      intrahash = {0a08b683088443f1fd36d6ef28bf6615},
      isbn      = {978-3-642-27608-8},
      keywords  = {dblp},
      pages     = {117-131},
      publisher = {Springer},
      series    = {Lecture Notes in Computer Science},
      timestamp = {2011-12-19T00:00:00.000+0100},
      title     = {Exploiting Domain Knowledge in Making Delegation Decisions.},
      url       = {http://dblp.uni-trier.de/db/conf/admi/admi2011.html#EmeleNSP11},
      volume    = 7103,
      year      = 2011
    }

    Resilient Autonomous Control of Distributed Multi-agent Systems in Contested Environments

    An autonomous and resilient controller is proposed for leader-follower multi-agent systems under uncertainties and cyber-physical attacks. The leader is assumed to be non-autonomous with a nonzero control input, which allows the team behavior or mission to change in response to environmental changes. A resilient learning-based control protocol is presented to find optimal solutions to the synchronization problem in the presence of attacks and system dynamic uncertainties. An observer-based distributed H_infinity controller is first designed to prevent the effects of attacks on sensors and actuators from propagating throughout the network, as well as to attenuate the effect of these attacks on the compromised agent itself. Non-homogeneous game algebraic Riccati equations are derived to solve the H_infinity optimal synchronization problem, and off-policy reinforcement learning is used to learn their solution without requiring any knowledge of the agent's dynamics. A trust-confidence based distributed control protocol is then proposed to mitigate attacks that hijack the entire node and attacks on communication links. A confidence value is defined for each agent based solely on its local evidence. The proposed resilient reinforcement learning algorithm uses the confidence value of each agent to indicate the trustworthiness of its own information and broadcasts it to the agent's neighbors, which use it to weight the data they receive from that agent during and after learning. If the confidence value of an agent is low, it employs a trust mechanism to identify compromised agents and removes the data it receives from them from the learning process. Simulation results are provided to show the effectiveness of the proposed approach.

    A Survey of Monte Carlo Tree Search Methods

    Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impose some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
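To make the idea of combining tree search with random sampling concrete, here is a minimal sketch of the common UCT variant of MCTS. It is not taken from the survey: the toy game (a Nim-like pile where players take 1 to 3 stones and whoever takes the last stone wins), the names `Node`, `rollout`, and `best_move`, and the exploration constant `c=1.4` are all assumptions chosen for illustration.

```python
import math
import random


class Node:
    """Search-tree node for a toy Nim-like game (hypothetical example game)."""

    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones      # stones left in the pile
        self.player = player      # player to move at this node (0 or 1)
        self.parent = parent
        self.move = move          # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

    def ucb1(self, c):
        # Average reward plus an exploration bonus (UCB1).
        explore = math.sqrt(math.log(self.parent.visits) / self.visits)
        return self.wins / self.visits + c * explore


def rollout(stones, player, rng):
    """Play uniformly random moves until the pile is empty; return the winner."""
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player         # this player took the last stone
        player = 1 - player


def best_move(stones, iterations=2000, c=1.4, seed=None):
    """Return the most-visited root move for player 0 (assumes stones >= 1)."""
    rng = random.Random(seed)
    root = Node(stones, player=0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded and non-terminal.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.ucb1(c))
        # 2. Expansion: add one untried child, if any remain.
        if node.untried:
            move = node.untried.pop(rng.randrange(len(node.untried)))
            child = Node(node.stones - move, 1 - node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the reached node.
        if node.stones == 0:
            winner = 1 - node.player   # the previous mover took the last stone
        else:
            winner = rollout(node.stones, node.player, rng)
        # 4. Backpropagation: credit each node for the player who moved into it.
        while node is not None:
            node.visits += 1
            if winner != node.player:
                node.wins += 1.0
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move


if __name__ == "__main__":
    print(best_move(10, seed=0))
```

Returning the most-visited child rather than the one with the highest average reward is one common design choice, since visit counts are less sensitive to a few lucky playouts.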

    Scalable Planning and Learning for Multiagent POMDPs: Extended Version

    Online, sample-based planning algorithms for POMDPs have shown great promise in scaling to problems with large state spaces, but they become intractable for large action and observation spaces. This is particularly problematic in multiagent POMDPs, where the action and observation spaces grow exponentially with the number of agents. To combat this intractability, we propose a novel scalable approach based on sample-based planning and factored value functions that exploits structure present in many multiagent settings. This approach applies not only in the planning case, but also in the Bayesian reinforcement learning setting. Experimental results show that we are able to provide high-quality solutions to large multiagent planning and learning problems.

    Multi-agent evolutionary systems for the generation of complex virtual worlds

    Modern films, games and virtual reality applications are dependent on convincing computer graphics. Highly complex models are a requirement for the successful delivery of many scenes and environments. While workflows such as rendering, compositing and animation have been streamlined to accommodate increasing demands, building complex models is still a laborious task. This paper introduces the computational benefits of an Interactive Genetic Algorithm (IGA) to computer graphics modelling while compensating for the effects of user fatigue, a common issue with Interactive Evolutionary Computation. An intelligent agent is used in conjunction with an IGA and offers the potential to reduce the effects of user fatigue by learning from the choices made by the human designer and directing the search accordingly. This workflow accelerates the layout and distribution of basic elements to form complex models. It captures the designer's intent through interaction, and encourages playful discovery.

    ?????? ?????? ??????????????? ?????? ????????????

    Department of Computer Science and Engineering. Recently, deep reinforcement learning (DRL) algorithms have shown superhuman performance in simulated game domains. In practice, sample efficiency is also one of the most important measures of a model's performance. Especially in environments with large search spaces (e.g., continuous action spaces), it is critical to achieving state-of-the-art performance. In this thesis, we design a model that is applicable to multi-end games in continuous space with high sample efficiency. A multi-end game has several sub-games that are independent of each other but affect the result of the game through rules of its domain. We verify the algorithm in a simulated curling environment.

    Improving Automated Driving through Planning with Human Internal States

    This work examines the hypothesis that partially observable Markov decision process (POMDP) planning with human driver internal states can significantly improve both safety and efficiency in autonomous freeway driving. We evaluate this hypothesis in a simulated scenario where an autonomous car must safely perform three lane changes in rapid succession. Approximate POMDP solutions are obtained through the partially observable Monte Carlo planning with observation widening (POMCPOW) algorithm. This approach outperforms over-confident and conservative MDP baselines and matches or outperforms QMDP. Relative to the MDP baselines, POMCPOW typically cuts the rate of unsafe situations in half or increases the success rate by 50%. Comment: Preprint before submission to IEEE Transactions on Intelligent Transportation Systems. arXiv admin note: text overlap with arXiv:1702.0085