
    Adaptive action supervision in reinforcement learning from real-world multi-agent demonstrations

    Modeling real-world biological multi-agent systems is a fundamental problem in various scientific and engineering fields. Reinforcement learning (RL) is a powerful framework for generating flexible and diverse behaviors in cyberspace; however, when modeling real-world biological multi-agents, there is a domain gap between behaviors in the source (i.e., real-world data) and the target (i.e., cyberspace for RL), and the source environment parameters are usually unknown. In this paper, we propose a method for adaptive action supervision in RL from real-world demonstrations in multi-agent scenarios. We adopt an approach that combines RL and supervised learning by selecting demonstration actions during RL based on the minimum dynamic time warping distance, thereby utilizing information about the unknown source dynamics. This approach can be easily applied to many existing neural network architectures and provides an RL model that balances reproducibility (as imitation) with the generalization ability needed to obtain rewards in cyberspace. In experiments using chase-and-escape and football tasks with different dynamics between the unknown source and target environments, we show that our approach achieves a balance between reproducibility and generalization ability compared with the baselines. In particular, we used the tracking data of professional football players as expert demonstrations in football and show successful performance despite a larger gap between source and target behaviors than in the chase-and-escape task.
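    The demonstration-selection step described in this abstract can be sketched compactly: pick the demonstration whose state sequence has minimum dynamic-time-warping distance to the agent's rollout so far, and use its action as the supervision target. The data layout (`demos` as dicts of state/action arrays) is an illustrative assumption, not the paper's actual architecture.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between trajectories a (n, d) and b (m, d)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def select_supervision_action(agent_traj, demos):
    """Pick the demonstration closest (minimum DTW distance) to the agent's
    rollout so far; return its action at the current step as the target."""
    t = len(agent_traj)
    dists = [dtw_distance(agent_traj, d["states"][:t]) for d in demos]
    best = int(np.argmin(dists))
    step = min(t, len(demos[best]["actions"])) - 1
    return demos[best]["actions"][step], best
```

    The selected action would then feed a supervised loss term alongside the usual RL objective.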

    Expectations and expertise in artificial intelligence: specialist views and historical perspectives on conceptualisation, promise, and funding

    Artificial intelligence (AI), a technoscientific field distinctive for imitating the ability to think, went through a resurgence of interest post-2010, attracting a flood of scientific and popular expectations as to its utopian or dystopian transformative consequences. This thesis offers observations about the formation and dynamics of expectations based on documentary material from the previous periods of perceived AI hype (1960-1975 and 1980-1990, including in-between periods of perceived dormancy), and 25 interviews with UK-based AI specialists, directly involved with its development, who commented on the issues during the crucial period of uncertainty (2017-2019) and intense negotiation through which AI gained momentum prior to its regulation and relatively stabilised new rounds of long-term investment (2020-2021). This examination applies and contributes to longitudinal studies in the sociology of expectations (SoE) and studies of experience and expertise (SEE) frameworks, proposing a historical sociology of expertise and expectations framework. The research questions, focusing on the interplay between hype mobilisation and governance, are: (1) What is the relationship between AI's practical development and the broader expectational environment, in terms of funding and conceptualisation of AI? (2) To what extent does informal and non-developer assessment of expectations influence formal articulations of foresight? (3) What can historical examinations of AI's conceptual and promissory settings tell us about the current rebranding of AI? The following contributions are made: (1) I extend SEE by paying greater attention to the interplay between technoscientific experts and wider collective arenas of discourse amongst non-specialists, showing how AI's contemporary research cultures are overwhelmingly influenced by the hype environment but also contribute to it.
This further highlights the interaction between competing rationales, at formal and informal levels: exploratory, curiosity-driven scientific research versus exploitation-oriented strategies. (2) I suggest the benefits of examining promissory environments in AI and related technoscientific fields longitudinally, treating contemporary expectations as historical products of sociotechnical trajectories, through an authoritative historical reading of AI's shifting conceptualisation and attached expectations as a response to the availability of funding and broader national imaginaries. This comes with the benefit of better perceiving technological hype as migrating from social group to social group instead of fading through reductionist cycles of disillusionment, whether by the rebranding of technical operations or by the investigation of a given field by non-technical practitioners. It also sensitises us to critically examine broader social expectations as factors in shifts of perception about theoretical/basic science research transforming into applied technological fields. Finally, (3) I offer a model for understanding the significance of the interplay between conceptualisations, promising, and motivations across groups within competing dynamics of collective and individual expectations and diverse sources of expertise.

    How to Make Agents and Influence Teammates: Understanding the Social Influence AI Teammates Have in Human-AI Teams

    The introduction of computational systems in the last few decades has enabled humans to cross geographical, cultural, and even societal boundaries. Whether it was the invention of telephones or file sharing, new technologies have enabled humans to continuously work better together. Artificial Intelligence (AI) is among the most promising of these technologies. Although AI has a multitude of functions within teaming, such as improving information sciences and analysis, one specific application of AI that has become a critical topic in recent years is the creation of AI systems that act as teammates alongside humans, in what is known as a human-AI team. However, as AI transitions into teammate roles, it will garner new responsibilities and abilities, which ultimately give it greater influence over teams' shared goals and resources, otherwise known as teaming influence. Moreover, that increase in teaming influence will provide AI teammates with a level of social influence. Unfortunately, while research has observed the impact of teaming influence by examining humans' perception and performance, an explicit and literal understanding of the social influence that facilitates long-term teaming change has yet to be created. This dissertation uses three studies to create a holistic understanding of the underlying social influence that AI teammates possess. Study 1 identifies the fundamental existence of AI teammate social influence and how it pertains to teaming influence. Qualitative data demonstrates that social influence is naturally created as humans actively adapt around AI teammate teaming influence. Furthermore, mixed-methods results demonstrate that the alignment of AI teammate teaming influence with a human's individual motives is the most critical factor in the acceptance of AI teammate teaming influence in existing teams.
    Study 2 further examines the acceptance of AI teammate teaming and social influence and how the design of AI teammates and humans' individual differences can impact this acceptance. The findings of Study 2 show that humans most readily accept AI teammate teaming influence that is comparable to their own on a single task, but acceptance of AI teammate teaming influence across multiple tasks generally decreases as teaming influence increases. Additionally, coworker endorsements are shown to increase the acceptance of high levels of AI teammate teaming influence, and humans who perceive the capabilities of technology in general to be greater are potentially more likely to accept AI teammate teaming influence. Finally, Study 3 explores how the teaming and social influence possessed by AI teammates change when presented in a team that also contains teaming influence from multiple human teammates, which means social influence between humans also exists. Results demonstrate that AI teammate social influence can drive humans to prefer and observe their human teammates over their AI teammates, but humans' behavioral adaptations are more centered around their AI teammates than their human teammates. These effects demonstrate that AI teammate social influence retains potency in the presence of human-human teaming and social influence, but its effects differ when impacting perception versus behavior. The above three studies fill a currently under-served research gap in human-AI teaming: the understanding of AI teammate social influence and humans' acceptance of it. In addition, each study conducted within this dissertation synthesizes its findings and contributions into actionable design recommendations that will serve as foundational design principles for the initial acceptance of AI teammates within society. Therefore, not only will the research community benefit from the results discussed throughout this dissertation, but so too will the developers, designers, and human teammates of human-AI teams.

    The Impact of Teams in Multiagent Systems

    Across many domains, the ability to work in teams can magnify a group's abilities beyond the capabilities of any individual. While the science of teamwork is typically studied in organizational psychology (OP) and areas of biology, understanding how multiple agents can work together is an important topic in artificial intelligence (AI) and multiagent systems (MAS). Teams in AI have taken many forms, including ad hoc teamwork [Stone et al., 2010], hierarchical structures of rule-based agents [Tambe, 1997], and teams of multiagent reinforcement learning (MARL) agents [Baker et al., 2020]. Despite significant evidence in the natural world about the impact of family structure on child development and health [Lee et al., 2015; Umberson et al., 2020], the impact of team structure on the policies that individual learning agents develop is not often explicitly studied. In this thesis, we hypothesize that teams can provide significant advantages in guiding the development of policies for individual agents that learn from experience. We focus on mixed-motive domains, where long-term global welfare is maximized through global cooperation. We present a model of multiagent teams with individual learning agents inspired by OP and early work using teams in AI, and introduce credo, a model that defines how agents optimize their behavior for the goals of the various groups they belong to: themselves (a group of one), any teams they belong to, and the entire system. We find that, in various settings, teams help agents develop cooperative policies with agents in other teams despite game-theoretic incentives to defect, and that this cooperation is robust to some amount of selfishness.
While previous work assumed that a fully cooperative population (all agents share rewards) obtains the best possible performance in mixed-motive domains [Yang et al., 2020; Gemp et al., 2020], we show that there exist multiple configurations of team structures and credo parameters that achieve about 33% more reward than the fully cooperative system. Agents in these scenarios learn more effective joint policies while maintaining high reward equality. Inspired by these results, we derive theoretical underpinnings that characterize settings where teammates may or may not be beneficial for learning. We also propose a preliminary credo-regulating agent architecture to autonomously discover favorable learning conditions in challenging settings.
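    The credo idea described above admits a compact sketch: each agent optimizes a convex combination of its own reward, its team's mean reward, and the system's mean reward. The function and weight names below are illustrative assumptions, not the thesis's exact parameterization.

```python
import numpy as np

def credo_reward(r_self, team_rewards, all_rewards, credo):
    """Blend the reward an agent optimizes across the three groups it
    belongs to: itself, its team, and the entire system.
    `credo` = (w_self, w_team, w_system), non-negative, summing to 1."""
    w_self, w_team, w_system = credo
    assert abs(w_self + w_team + w_system - 1.0) < 1e-9
    return (w_self * r_self
            + w_team * float(np.mean(team_rewards))
            + w_system * float(np.mean(all_rewards)))
```

    Under this sketch, a fully selfish agent uses credo (1, 0, 0), a fully cooperative population corresponds to every agent using (0, 0, 1), and the configurations studied in the thesis lie in between.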

    From walking to running: robust and 3D humanoid gait generation via MPC

    Humanoid robots are platforms that can succeed in tasks conceived for humans. From locomotion in unstructured environments, to driving cars, or working in industrial plants, these robots have a potential that is yet to be realized in systematic everyday applications. Such a perspective, however, is hindered by the need to solve complex engineering problems on both the hardware and software sides. In this thesis, we focus on the software side of the problem, and in particular on locomotion control. The operativity of a legged humanoid depends on its capability to realize reliable locomotion. In many settings, perturbations may undermine balance and make the robot fall. Moreover, complex and dynamic motions might be required by the context; for instance, the robot may need to start running or to climb stairs to reach a certain location in the shortest time. We present gait generation schemes based on Model Predictive Control (MPC) that tackle both the problem of robustness and that of three-dimensional dynamic motions. The proposed control schemes adopt the typical paradigm of centroidal MPC for reference motion generation, enforcing dynamic balance through the Zero Moment Point condition, plus a whole-body controller that maps the generated trajectories to joint commands. Each of the described predictive controllers also features a so-called stability constraint, preventing the generation of Center of Mass trajectories that diverge with respect to the Zero Moment Point. Robustness is addressed by modeling the humanoid as a Linear Inverted Pendulum and devising two types of strategies. For persistent perturbations, a way to use a disturbance observer and a technique for constraint tightening (to ensure robust constraint satisfaction) are presented. In the case of impulsive pushes, techniques for footstep and timing adaptation are introduced.
The underlying approach is to interpret robustness as an MPC feasibility problem, thus aiming to ensure the existence of a solution for the constrained optimization problem solved at each iteration, in spite of the perturbations. This perspective allows us to devise simple solutions to complex problems, favoring a reliable real-time implementation. For three-dimensional locomotion, on the other hand, the humanoid is modeled as a Variable Height Inverted Pendulum. Based on it, a two-stage MPC is introduced, with particular emphasis on the implementation of the stability constraint. The overall result is a gait generation scheme that allows the robot to traverse relatively complex environments with non-flat terrain, and that can also realize running gaits. The proposed methods are validated in different settings: from conceptual simulations in Matlab, to validations in the DART dynamic environment, up to experimental tests on the NAO and OP3 platforms.
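    The Linear Inverted Pendulum model underlying the robust schemes has a closed-form solution, which makes the stability constraint easy to illustrate. Below is a minimal sketch of its exact discrete-time propagation and of the divergent component of motion that such a constraint keeps bounded relative to the Zero Moment Point; the CoM height and time step are assumed values, not parameters from the thesis.

```python
import numpy as np

G, H = 9.81, 0.8              # gravity and (assumed) constant CoM height
OMEGA = np.sqrt(G / H)        # natural frequency of the LIP

def lip_step(x, xd, p, dt):
    """Exact discrete propagation of the LIP: x_ddot = OMEGA**2 * (x - p),
    with p the (piecewise-constant) Zero Moment Point acting as input."""
    c, s = np.cosh(OMEGA * dt), np.sinh(OMEGA * dt)
    x_next = c * x + (s / OMEGA) * xd + (1.0 - c) * p
    xd_next = OMEGA * s * x + c * xd - OMEGA * s * p
    return x_next, xd_next

def divergent_component(x, xd):
    """Divergent component of motion (capture point): the unstable mode of
    the LIP; bounding it with respect to the ZMP prevents the Center of
    Mass trajectory from diverging."""
    return x + xd / OMEGA
```

    In particular, if the divergent component coincides with the ZMP it remains there under these dynamics, which is the intuition behind constraining it rather than the full CoM state.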

    Automation and Control

    Advances in automation and control today cover many areas of technology where human input is minimized. This book discusses numerous types and applications of automation and control. Chapters address topics such as building information modeling (BIM)–based automated code compliance checking (ACCC), control algorithms useful for military operations and video games, rescue competitions using unmanned aerial-ground robots, and stochastic control systems

    System Architectures for Cooperative Teams of Unmanned Aerial Vehicles Interacting Physically with the Environment

    Unmanned Aerial Vehicles (UAVs) have become quite a useful tool for a wide range of applications, from inspection & maintenance to search & rescue, among others. The capabilities of a single UAV can be extended or complemented by the deployment of more UAVs, so multi-UAV cooperative teams are becoming a trend. In that case, as different autopilots, heterogeneous platforms, and application-dependent software components have to be integrated, multi-UAV system architectures that are flexible and can adapt to the team's needs are required. In this thesis, we develop system architectures for cooperative teams of UAVs, paying special attention to applications that require physical interaction with the environment, which is typically unstructured. First, we implement some layers to abstract the high-level components from the hardware specifics. Then we propose increasingly advanced architectures, from a single-UAV hierarchical navigation architecture to an architecture for a cooperative team of heterogeneous UAVs. All this work has been thoroughly tested in both simulation and field experiments in different challenging scenarios through research projects and robotics competitions. Most of the applications required physical interaction with the environment, mainly in unstructured outdoor scenarios. All the know-how and lessons learned throughout the process are shared in this thesis, and all relevant code is publicly available.