Innovative navigation artificial intelligence for motor racing games
A thesis submitted to the University of Bedfordshire in fulfilment of the requirements for the degree of Master of Science by Research.

Motor racing games are pushing the boundaries of realism and player experience. Artificial Intelligence (AI) allows developers to create believable opponents. By making their AI follow a racing line similar to that taken by real racing drivers, developers can create the sense that the AI racers are trained drivers.
This paper identifies two methods used in the field: the sector-based system and the sensor-based system. The sector-based approach offers two or more predetermined lines for the AI to follow, with added logic allowing the AI to judge when to switch between lines. The sensor-based method guides AI vehicles around tracks using sensors, offering a wider range of possible behaviours and lines. After implementation, the strengths and weaknesses of both methods became apparent, and a hybrid system was planned and developed based on these findings. The resulting system produces a more believable line for the AI. Because setting up a race track for the sector-based method takes a long time, tool development was explored to shorten the process. The resulting tool reduced the time needed to set up a track while providing results similar to the old method.
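The switching logic of the sector-based approach can be illustrated with a minimal sketch. All names here (`choose_line`, the line constants, the 20-metre threshold) are hypothetical illustrations, not the thesis's actual implementation:

```python
# Minimal sketch of sector-based line selection (hypothetical API).
# Two predetermined lines exist per sector; simple logic picks the
# overtaking line only when a rival blocks the ideal racing line.

RACING_LINE = 0
OVERTAKE_LINE = 1

def choose_line(sector, opponent_ahead, gap_m):
    """Return which predetermined line the AI should follow in this sector.

    opponent_ahead: True if a rival occupies the racing line ahead.
    gap_m: distance to that rival in metres (ignored if none ahead).
    """
    # Stay on the ideal racing line unless a rival is close enough to block it.
    if opponent_ahead and gap_m < 20.0:
        return OVERTAKE_LINE
    return RACING_LINE

# Example: clear track -> racing line; blocked within 20 m -> overtaking line.
print(choose_line(sector=3, opponent_ahead=False, gap_m=0.0))   # 0
print(choose_line(sector=3, opponent_ahead=True, gap_m=12.5))   # 1
```

In a fuller system, the predetermined lines would be lists of waypoints authored per track, which is exactly the setup cost the thesis's tool work aims to reduce.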
TORCS Training Interface: an auxiliary tool for the development of TORCS drivers
Monografia (graduação)—Universidade de Brasília, Brasília, 2013.

ABSTRACT: The inefficient manner in which drivers are tested and developed for the racing game and simulator TORCS is a relevant problem because of the limitations it imposes on driver-development projects, i.e., the algorithms that determine the behavior of cars not controlled by human players. Because this software serves as a benchmark platform for different Artificial Intelligence techniques, it is important to work on mitigating this problem. The TORCS Training Interface was developed: a tool that offers automation to improve the efficiency of simulation calls and to return more complete data, both of which are important for the evaluations needed to estimate the fitness of drivers. Results of the comparative tests performed indicate that use of the tool is a viable alternative to the approaches seen in the literature, presenting advantages that can make it the most suitable option for processes similar to the ones considered in this work.
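The batch-evaluation pattern the abstract describes can be sketched as follows. This is not the TORCS Training Interface's actual API; `simulate` is a stand-in for a real TORCS race invocation, and the fitness arithmetic is a toy assumption:

```python
# Hypothetical sketch of batch driver evaluation in the spirit of the
# TORCS Training Interface: run candidate drivers through a simulation
# callback and collect richer per-run data for later fitness estimates.

def simulate(driver_params):
    # Stand-in: a real implementation would launch TORCS and parse results.
    # Here "lap_time" and "damage" are just toy functions of the parameters.
    lap_time = 100.0 - sum(driver_params)            # lower is better
    damage = max(0.0, driver_params[0] - 1.0) * 10
    return {"lap_time": lap_time, "damage": damage}

def evaluate_drivers(population):
    """Evaluate each driver and return complete per-driver records,
    not just a scalar fitness, so later analyses need no re-simulation."""
    return [dict(params=p, **simulate(p)) for p in population]

results = evaluate_drivers([[0.5, 0.2], [1.5, 0.1]])
best = min(results, key=lambda r: r["lap_time"])
```

Returning full records rather than a single number mirrors the tool's goal of providing "more complete data" per simulation call.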
Providing Informative Feedback for Learning in Tightly Coupled Multiagent Domains
Autonomous agents that sense, decide, act, and coordinate effectively with each other are critical in many real-world domains such as autonomous driving, search and rescue missions, air traffic management, and underwater or deep space exploration. All such domains share a key difficulty: though high-level mission goals are clear to system designers, the agent behaviors that achieve those goals are not.
Thus, system designers aim to use adaptive approaches such as reinforcement learning (RL) or evolutionary algorithms (EA) to discover the ideal behaviors for the agents, and these behaviors are often implemented in computational policies (for example, as artificial neural networks) that map sensory inputs to actions or values. But for such learning systems to be successful, they must leverage system feedback (based on the agents' collective performance) to revise and update the agents' policies for how they should interact with the environment.
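The policy representation described here can be sketched in miniature. As an illustrative assumption, a single linear layer stands in for a full neural network, and `mutate` shows the kind of EA-style perturbation that feedback would accept or reject:

```python
import random

# Minimal sketch of a policy as a parameterised mapping from sensory
# inputs to actions. A linear layer stands in for a neural network; an
# evolutionary loop would perturb `weights` and keep mutations whose
# system feedback improves.

def policy(weights, observation):
    """Map an observation vector to a scalar action via a linear map."""
    return sum(w * o for w, o in zip(weights, observation))

def mutate(weights, sigma=0.1, rng=random):
    """EA-style perturbation: return a slightly modified copy of weights."""
    return [w + rng.gauss(0.0, sigma) for w in weights]

w = [0.5, -0.25]
action = policy(w, [1.0, 2.0])   # 0.5*1.0 + (-0.25)*2.0 = 0.0
candidate = mutate(w)            # evaluated against system feedback
```

In the tightly coupled settings the dissertation targets, the hard part is not this mapping but deciding, from joint feedback, which mutations actually helped.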
Unfortunately, both RL and EA approaches struggle when the environmental feedback is sparse and/or uninformative, especially in multiagent domains where teasing out an agent’s contribution to the system is difficult. Reward shaping methods address some of this difficulty, but they also suffer when faced with tightly coupled multiagent domains where feedback depends on multiple agents taking the correct joint action at the appropriate time.
The contributions of this work are Reward-Shaped Curriculum Learning, Fitness Critics, and Bidirectional Fitness Critics, which address the challenges of sparse feedback in tightly coupled multiagent domains.
Reward-Shaped Curriculum Learning trains agents on successively more complex scenarios, which enables agents to use reward shaping to discover the correct actions first and then coordinate on the complex tasks. The impact of this approach is to reduce the sparsity of the reward.

Fitness Critics directly address the sparse feedback problem by replacing the system reward with a step-by-step performance metric that maps step-wise observations and actions to meaningful evaluations able to identify desirable behaviors. The impact of this approach is to turn a sparse, policy-based reward into a dense, state-action-based reward that trains agents for specific behaviors.

Bidirectional Fitness Critics extend Fitness Critics to provide more informative feedback by leveraging temporal information about the reward and the relevance of that information to the task. The impact of this approach is to capture the agents' contributions to the desired behavior more accurately.
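The core idea of a fitness critic, turning one episodic fitness value into per-step evaluations, can be sketched as follows. This is a deliberately simplified assumption: a lookup table updated toward the episode's fitness stands in for the learned (e.g., neural-network) critic the abstract describes:

```python
# Hedged sketch of the Fitness Critic idea: replace a sparse episodic
# reward with a learned per-step evaluation. A toy lookup table stands
# in for the learned critic; `lr` controls how fast estimates move.

def update_critic(critic, trajectory, episode_fitness, lr=0.5):
    """Move each visited (state, action) estimate toward the episodic
    fitness, yielding a dense, state-action-based training signal."""
    for state, action in trajectory:
        key = (state, action)
        old = critic.get(key, 0.0)
        critic[key] = old + lr * (episode_fitness - old)
    return critic

critic = {}
update_critic(critic, [("s0", "a0"), ("s1", "a1")], episode_fitness=1.0)
dense_reward = critic[("s0", "a0")]   # 0.5 after one update
```

After training, an agent can query the critic at every step instead of waiting for the episode's end, which is what makes the reward dense.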