Search CORE

3,602 research outputs found

Learning of Generalized Manipulation Strategies in Service Robotics

Author: Jäkel Rainer
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2013
Field of study

This thesis makes a contribution to autonomous robotic manipulation. The core is a novel constraint-based representation of manipulation tasks suitable for flexible online motion planning. Interactive learning from natural human demonstrations is combined with parallelized optimization to enable efficient learning of complex manipulation tasks with limited training data. Prior planning results are encoded automatically into the model to reduce planning time and solve the correspondence problem

KITopen

Human-Machine Collaborative Optimization via Apprenticeship Scheduling

Author: Golen Toni
Gombolay Matthew
Jensen Reed
Shah Julie
Shah Neel
Son Sung-Hyun
Stigile Jessica
Publication venue
Publication date: 10/05/2018
Field of study

Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the ``single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator.Comment: Portions of this paper were published in the Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper consists of 50 pages with 11 figures and 4 table

arXiv.org e-Print Archive

DSpace@MIT

Recommended from our members

End-to-end deep reinforcement learning in computer systems

Author: Schaarschmidt Michael
Publication venue: University of Cambridge
Publication date: 13/04/2020
Field of study

Abstract The growing complexity of data processing systems has long led systems designers to imagine systems (e.g. databases, schedulers) which can self-configure and adapt based on environmental cues. In this context, reinforcement learning (RL) methods have since their inception appealed to systems developers. They promise to acquire complex decision policies from raw feedback signals. Despite their conceptual popularity, RL methods are scarcely found in real-world data processing systems. Recently, RL has seen explosive growth in interest due to high profile successes when utilising large neural networks (deep reinforcement learning). Newly emerging machine learning frameworks and powerful hardware accelerators have given rise to a plethora of new potential applications. In this dissertation, I first argue that in order to design and execute deep RL algorithms efficiently, novel software abstractions are required which can accommodate the distinct computational patterns of communication-intensive and fast-evolving algorithms. I propose an architecture which decouples logical algorithm construction from local and distributed execution semantics. I further present RLgraph, my proof-of-concept implementation of this architecture. In RLgraph, algorithm developers can explore novel designs by constructing a high-level data flow graph through combination of logical components. This dataflow graph is independent of specific backend frameworks or notions of execution, and is only later mapped to execution semantics via a staged build process. RLgraph enables high-performing algorithm implementations while maintaining flexibility for rapid prototyping. Second, I investigate reasons for the scarcity of RL applications in systems themselves. I argue that progress in applied RL is hindered by a lack of tools for task model design which bridge the gap between systems and algorithms, and also by missing shared standards for evaluation of model capabilities. I introduce Wield, a first-of-its-kind tool for incremental model design in applied RL. Wield provides a small set of primitives which decouple systems interfaces and deployment-specific configuration from representation. Core to Wield is a novel instructive experiment protocol called progressive randomisation which helps practitioners to incrementally evaluate different dimensions of non-determinism. I demonstrate how Wield and progressive randomisation can be used to reproduce and assess prior work, and to guide implementation of novel RL applications

Apollo (Cambridge)

Planning For Non-Player Characters By Learning From Demonstration

Author: Drake John
Publication venue: ScholarlyCommons
Publication date: 01/01/2018
Field of study

In video games, state of the art non-player character (NPC) behavior generation typically depends on hard-coding NPC actions. In many game situations however, it is hard to foresee how an NPC should behave to appear intelligent or to accommodate human preferences for NPC behavior. We advocate the creation of a more flexible method to allow players (and developers) to train NPCs to execute novel behaviors which are not hard-coded. In particular, we investigate search-based planning approaches using demonstration to guide the search through high-dimensional spaces that represent the full state of the game. To this end, we developed the Training Graph heuristic, an extension of the Experience Graph heuristic, that guides a search smoothly and effectively even when a demonstration is unreachable in the search space, and ensures that more of the demonstrations are utilized to better train the NPC\u27s behavior. To deal with variance in the initial conditions of such planning problems, we have developed heuristics in the Multi-Heuristic A* framework to adapt demonstration trace data to new problems. We evaluate our approach in the Creation Engine game engine by modifying The Elder Scrolls V: Skyrim (Skyrim) to accommodate our NPC behavior generators and experiments. In Skyrim, players are given quests which are composed of several objectives. NPCs in the game sometimes accompany the player on quests, but state-of-the-art companion NPC AI is not sophisticated enough to behave according to arbitrary player desires. We hope that our work will lead to the creation of trainable NPC AI. This will enable novel gameplay mechanics for video game players and may augment video game production by allowing developers to train NPCs instead of hard-coding complex behaviors

ScholarlyCommons@Penn

A Review of Symbolic, Subsymbolic and Hybrid Methods for Sequential Decision Making

Author: Fernández-Olivares Juan
Mesejo Pablo
Núñez-Molina Carlos
Publication venue
Publication date: 20/04/2023
Field of study

The field of Sequential Decision Making (SDM) provides tools for solving Sequential Decision Processes (SDPs), where an agent must make a series of decisions in order to complete a task or achieve a goal. Historically, two competing SDM paradigms have view for supremacy. Automated Planning (AP) proposes to solve SDPs by performing a reasoning process over a model of the world, often represented symbolically. Conversely, Reinforcement Learning (RL) proposes to learn the solution of the SDP from data, without a world model, and represent the learned knowledge subsymbolically. In the spirit of reconciliation, we provide a review of symbolic, subsymbolic and hybrid methods for SDM. We cover both methods for solving SDPs (e.g., AP, RL and techniques that learn to plan) and for learning aspects of their structure (e.g., world models, state invariants and landmarks). To the best of our knowledge, no other review in the field provides the same scope. As an additional contribution, we discuss what properties an ideal method for SDM should exhibit and argue that neurosymbolic AI is the current approach which most closely resembles this ideal method. Finally, we outline several proposals to advance the field of SDM via the integration of symbolic and subsymbolic AI

arXiv.org e-Print Archive

Reducing the Barrier to Entry of Complex Robotic Software: a MoveIt! Case Study

Author: Chitta Sachin
Coleman David
Correll Nikolaus
Sucan Ioan
Publication venue
Publication date: 14/04/2014
Field of study

Developing robot agnostic software frameworks involves synthesizing the disparate fields of robotic theory and software engineering while simultaneously accounting for a large variability in hardware designs and control paradigms. As the capabilities of robotic software frameworks increase, the setup difficulty and learning curve for new users also increase. If the entry barriers for configuring and using the software on robots is too high, even the most powerful of frameworks are useless. A growing need exists in robotic software engineering to aid users in getting started with, and customizing, the software framework as necessary for particular robotic applications. In this paper a case study is presented for the best practices found for lowering the barrier of entry in the MoveIt! framework, an open-source tool for mobile manipulation in ROS, that allows users to 1) quickly get basic motion planning functionality with minimal initial setup, 2) automate its configuration and optimization, and 3) easily customize its components. A graphical interface that assists the user in configuring MoveIt! is the cornerstone of our approach, coupled with the use of an existing standardized robot model for input, automatically generated robot-specific configuration files, and a plugin-based architecture for extensibility. These best practices are summarized into a set of barrier to entry design principles applicable to other robotic software. The approaches for lowering the entry barrier are evaluated by usage statistics, a user survey, and compared against our design objectives for their effectiveness to users

arXiv.org e-Print Archive

Artificial Intelligence Research Branch future plans

Author: Stewart Helen
Publication venue
Publication date
Field of study

This report contains information on the activities of the Artificial Intelligence Research Branch (FIA) at NASA Ames Research Center (ARC) in 1992, as well as planned work in 1993. These activities span a range from basic scientific research through engineering development to fielded NASA applications, particularly those applications that are enabled by basic research carried out in FIA. Work is conducted in-house and through collaborative partners in academia and industry. All of our work has research themes with a dual commitment to technical excellence and applicability to NASA short, medium, and long-term problems. FIA acts as the Agency's lead organization for research aspects of artificial intelligence, working closely with a second research laboratory at the Jet Propulsion Laboratory (JPL) and AI applications groups throughout all NASA centers. This report is organized along three major research themes: (1) Planning and Scheduling: deciding on a sequence of actions to achieve a set of complex goals and determining when to execute those actions and how to allocate resources to carry them out; (2) Machine Learning: techniques for forming theories about natural and man-made phenomena; and for improving the problem-solving performance of computational systems over time; and (3) Research on the acquisition, representation, and utilization of knowledge in support of diagnosis design of engineered systems and analysis of actual systems

NASA Technical Reports Server

Data-driven prognostics and logistics optimisation:A deep learning journey

Author: de Oliveira da Costa Paulo Roberto
Publication venue: Eindhoven University of Technology
Publication date: 03/02/2022
Field of study

Pure OAI Repository