712 research outputs found

    Subgoal Identifications in Reinforcement Learning: A Survey

    A dynamic neural field approach to natural and efficient human-robot collaboration

    A major challenge in modern robotics is the design of autonomous robots that are able to cooperate with people in their daily tasks in a human-like way. We address the challenge of natural human-robot interactions by using the theoretical framework of dynamic neural fields (DNFs) to develop processing architectures that are based on neuro-cognitive mechanisms supporting human joint action. By explaining the emergence of self-stabilized activity in neuronal populations, dynamic field theory provides a systematic way to endow a robot with crucial cognitive functions such as working memory, prediction and decision making. The DNF architecture for joint action is organized as a large-scale network of reciprocally connected neuronal populations that encode in their firing patterns specific motor behaviors, action goals, contextual cues and shared task knowledge. Ultimately, it implements a context-dependent mapping from observed actions of the human onto adequate complementary behaviors that takes into account the inferred goal of the co-actor. We present results of flexible and fluent human-robot cooperation in a task in which the team has to assemble a toy object from its components. The present research was conducted in the context of the FP6-IST2 EU-IP Project JAST (proj. nr. 003747) and partly financed by the FCT grants POCI/V.5/A0119/2005 and CONC-REEQ/17/2001. We would like to thank Luis Louro, Emanuel Sousa, Flora Ferreira, Eliana Costa e Silva, Rui Silva and Toni Machado for their assistance during the robotic experiment.
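    The self-stabilized population activity this abstract refers to arises from Amari-type dynamic neural field equations of the form tau*du/dt = -u + h + S + w*f(u) (with a lateral-interaction convolution). The sketch below is a minimal one-dimensional illustration only, not the authors' robot architecture; the field size, kernel and input parameters are assumptions chosen so that a transient localized input leaves behind a persistent activity bump, i.e. a working-memory trace:

```python
import numpy as np

# Minimal 1-D Amari dynamic neural field (illustrative parameters only):
#   tau * du/dt = -u + h + S(x, t) + integral w(x - x') f(u(x', t)) dx'
# Local excitation with surround inhibition lets a transient input create a
# self-stabilized activity bump that persists after the input is removed.

N, dx, tau, h = 200, 0.1, 1.0, -2.0            # grid points, grid step, time constant, resting level
x = np.arange(N) * dx

def kernel(d, a_exc=3.0, s_exc=0.5, g_inh=0.5):
    """Mexican-hat interaction: local excitation minus global inhibition."""
    return a_exc * np.exp(-d**2 / (2 * s_exc**2)) - g_inh

W = kernel(np.abs(x[:, None] - x[None, :]))     # pairwise interaction matrix
f = lambda u: 1.0 / (1.0 + np.exp(-4.0 * u))    # sigmoid output nonlinearity

u = np.full(N, h)                               # field starts at resting level
stimulus = 4.0 * np.exp(-(x - 10.0)**2 / 0.5)   # transient localized input

dt = 0.05
for step in range(2000):
    S = stimulus if step < 400 else 0.0         # input is switched off after a while
    du = (-u + h + S + dx * (W @ f(u))) / tau
    u = u + dt * du

print("peak activity after input removal:", u.max())  # bump persists: a working-memory trace
```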

    Learning High-Level Planning from Text

    Comprehending action preconditions and effects is an essential step in modeling the dynamics of the world. In this paper, we express the semantics of precondition relations extracted from text in terms of planning operations. The challenge of modeling this connection is to ground language at the level of relations. This type of grounding enables us to create high-level plans based on language abstractions. Our model jointly learns to predict precondition relations from text and to perform high-level planning guided by those relations. We implement this idea in the reinforcement learning framework using feedback automatically obtained from plan execution attempts. When applied to a complex virtual world and text describing that world, our relation extraction technique performs on par with a supervised baseline, yielding an F-measure of 66% compared to the baseline's 65%. Additionally, we show that a high-level planner utilizing these extracted relations significantly outperforms a strong, text-unaware baseline, successfully completing 80% of planning tasks as compared to 69% for the baseline. National Science Foundation (U.S.) (CAREER Grant IIS-0448168); United States. Defense Advanced Research Projects Agency, Machine Reading Program (FA8750-09-C-0172, PO#4910018860); Battelle Memorial Institute (PO#300662
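    To make the learning loop described above concrete, here is a heavily simplified, hypothetical sketch: a few candidate precondition relations stand in for relations extracted from text, a stubbed planner-plus-world provides execution feedback, and that feedback reinforces or penalises each relation. None of the names, the stub world, or the scoring rule come from the paper; they are assumptions for illustration only.

```python
import math, random

# Illustrative sketch of the feedback loop: candidate precondition relations are
# proposed (standing in for text extraction), a plan guided by the currently
# believed relations is executed, and execution success/failure is the reward
# used to update the extractor's confidence in each relation.

candidate_relations = [("have wood", "craft plank"),
                       ("have plank", "craft stick"),
                       ("have iron", "craft plank")]           # the last one is spurious
scores = {rel: 0.0 for rel in candidate_relations}             # confidence per relation

def plan_and_execute(relations):
    """Stub world: execution succeeds only under exactly the two true preconditions."""
    true_relations = {("have wood", "craft plank"), ("have plank", "craft stick")}
    return 1.0 if set(relations) == true_relations else 0.0

for episode in range(300):
    # Sample a relation set according to the current scores (sigmoid inclusion policy).
    chosen = [r for r in candidate_relations
              if random.random() < 1.0 / (1.0 + math.exp(-scores[r]))]
    reward = plan_and_execute(chosen)                          # feedback from plan execution
    for r in candidate_relations:
        direction = 1.0 if r in chosen else -1.0
        scores[r] += 0.1 * (reward - 0.5) * direction          # reinforce relations that led to success

print(sorted(scores.items(), key=lambda kv: -kv[1]))           # true preconditions tend to rank highest
```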

    Space exploration: The interstellar goal and Titan demonstration

    Automated interstellar space exploration is reviewed. The Titan demonstration mission is discussed. Remote sensing and automated modeling are considered. Nuclear electric propulsion, main orbiting spacecraft, lander/rover, subsatellites, atmospheric probes, powered air vehicles, and a surface science network comprise mission component concepts. Machine intelligence in space exploration is discussed.

    An integrated approach to high integrity software verification.

    Computer software is developed through software engineering. At its most precise, software engineering involves mathematical rigour in the form of formal methods. High integrity software is associated with safety-critical and security-critical applications, where failure would bring significant costs. The development of high integrity software is subject to stringent standards, prescribing best practices to increase quality. Typically, these standards will strongly encourage or enforce the application of formal methods. The application of formal methods can entail a significant amount of mathematical reasoning. Thus, the development of automated techniques is an active area of research. The trend is to deliver increased automation through two complementary approaches. Firstly, lightweight formal methods are adopted, sacrificing expressive power, breadth of coverage, or both in favour of tractability. Secondly, integrated solutions are sought, exploiting the strengths of different technologies to increase automation. The objective of this thesis is to support the production of high integrity software by automating an aspect of formal methods. To develop tractable techniques we focus on the niche activity of verifying exception freedom. To increase effectiveness, we integrate the complementary technologies of proof planning and program analysis. Our approach is investigated by enhancing the SPARK Approach, as developed by Altran Praxis Limited. Our approach is implemented and evaluated as the SPADEase system. The key contributions of the thesis are summarised below:
    • Configurable and Sound - Present a configurable and justifiably sound approach to software verification.
    • Cooperative Integration - Demonstrate that more targeted and effective automation can be achieved through the cooperative integration of distinct technologies.
    • Proof Discovery - Present proof plans that support the verification of exception freedom.
    • Invariant Discovery - Present invariant discovery heuristics that support the verification of exception freedom.
    • Implementation as SPADEase - Implement our approach as SPADEase.
    • Industrial Evaluation - Evaluate SPADEase against both textbook and industrial subprograms.
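    SPARK itself targets Ada subprograms, but the flavour of an exception-freedom obligation, and of the loop invariant needed to discharge it, can be mimicked with assertions. The sketch below is purely illustrative: the bound, variable names and asserts are assumptions for demonstration, not SPADEase output or the SPARK notation.

```python
# Illustrative only: to show that the accumulator below never exceeds its declared
# bound (i.e. no overflow / constraint error can be raised), a verifier needs a loop
# invariant relating the accumulator to the iteration count. Discovering invariants
# of this kind automatically is the role of the heuristics mentioned in the abstract.

MAX_TOTAL = 1000          # stand-in for a constrained integer type's upper bound

def sum_readings(readings):
    """Assumed precondition: each reading lies in 0 .. 10 and len(readings) <= 100."""
    assert len(readings) <= 100 and all(0 <= r <= 10 for r in readings)
    total = 0
    for i, r in enumerate(readings):
        # Loop invariant a verifier would need: total <= 10 * i
        assert total <= 10 * i
        total += r
        # Hence after the update: total <= 10 * (i + 1) <= 1000, so no overflow occurs.
        assert total <= MAX_TOTAL
    return total

print(sum_readings([10] * 100))   # 1000, still within the declared range
```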

    Hierarchical Reinforcement Learning in Behavior and the Brain

    Dissertation presented to obtain the Ph.D. degree in Biology, Neuroscience. Reinforcement learning (RL) has provided key insights into the neurobiology of learning and decision making. The pivotal finding is that the phasic activity of dopaminergic cells in the ventral tegmental area during learning conforms to a reward prediction error (RPE), as specified in the temporal-difference learning algorithm (TD). This has provided insights into conditioning, the distinction between habitual and goal-directed behavior, working memory, cognitive control and error monitoring. It has also advanced the understanding of cognitive deficits in Parkinson's disease, depression, ADHD and of personality traits such as impulsivity. (...)
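    The reward prediction error referred to above has a compact form in temporal-difference learning, delta = r + gamma * V(s') - V(s). The following minimal TD(0) sketch on a toy chain task is illustrative only; the states, transition noise and learning rate are assumptions for the example, not material from the dissertation.

```python
import random

# Minimal TD(0) value learning on a toy 5-state chain: the agent drifts rightward and
# receives a reward of 1 on reaching the terminal state. The quantity `delta` is the
# reward prediction error (RPE) that phasic dopaminergic activity is argued to track.

n_states, alpha, gamma = 5, 0.1, 0.95
V = [0.0] * (n_states + 1)                      # value estimates; last entry is terminal

for episode in range(500):
    s = 0
    while s < n_states:
        s_next = s + 1 if random.random() < 0.8 else max(s - 1, 0)
        r = 1.0 if s_next == n_states else 0.0
        delta = r + gamma * V[s_next] - V[s]    # reward prediction error (RPE)
        V[s] += alpha * delta                   # TD(0) update
        s = s_next

print([round(v, 2) for v in V[:n_states]])      # values rise toward the rewarded end of the chain
```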