Asymmetric Actor Critic for Image-Based Robot Learning

Authors: Abbeel Pieter; Andrychowicz Marcin; Pinto Lerrel; Welinder Peter; Zaremba Wojciech
Publication date: 17/10/2017

Deep reinforcement learning (RL) has proven a powerful technique in many sequential decision making domains. However, robotics poses many challenges for RL. Most notably, training on a physical system can be expensive and dangerous, which has sparked significant interest in learning control policies using a physics simulator. While several recent works have shown promising results in transferring policies trained in simulation to the real world, they often do not fully utilize the advantage of working with a simulator. In this work, we exploit the full state observability in the simulator to train better policies which take as input only partial observations (RGBD images). We do this by employing an actor-critic training algorithm in which the critic is trained on full states while the actor (or policy) gets rendered images as input. We show experimentally on a range of simulated tasks that using these asymmetric inputs significantly improves performance. Finally, we combine this method with domain randomization and show real robot experiments for several tasks like picking, pushing, and moving a block. We achieve this simulation to real world transfer without training on any real world data.

Comment: Videos of experiments can be found at http://www.goo.gl/b57WT
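
The asymmetric setup described in this abstract can be illustrated with a minimal sketch, not the authors' implementation. Everything here is an assumption made for illustration: the linear `Actor` and `Critic`, the hypothetical `render` function that hides velocity to stand in for an image renderer, and the deterministic policy-gradient style update (the abstract only says "actor-critic"). The point it shows is the split in inputs: the critic reads the full simulator state, the actor reads only the partial observation.

```python
def render(state):
    # Hypothetical renderer: keeps position, hides velocity (partial observation).
    position, _velocity = state
    return [position]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class Actor:
    """Policy: sees only the rendered (partial) observation."""

    def __init__(self, obs_dim):
        self.theta = [0.0] * obs_dim

    def act(self, obs):
        return dot(self.theta, obs)


class Critic:
    """Q-function: sees the full simulator state plus the action."""

    def __init__(self, state_dim):
        self.w = [0.0] * (state_dim + 1)

    def q(self, state, action):
        return dot(self.w, list(state) + [action])


def update(actor, critic, transition, gamma=0.9, lr=0.1):
    state, action, reward, next_state = transition
    # Critic: one TD(0) step computed on full states.
    next_action = actor.act(render(next_state))
    target = reward + gamma * critic.q(next_state, next_action)
    td_error = target - critic.q(state, action)
    features = list(state) + [action]
    critic.w = [w + lr * td_error * f for w, f in zip(critic.w, features)]
    # Actor: gradient step through the critic; for a linear critic,
    # dQ/da is simply the weight on the action feature.
    dq_da = critic.w[-1]
    obs = render(state)
    actor.theta = [t + lr * dq_da * o for t, o in zip(actor.theta, obs)]
    return td_error


actor, critic = Actor(obs_dim=1), Critic(state_dim=2)
state = (1.0, 0.5)                        # full state: (position, velocity)
action = actor.act(render(state)) + 1.0   # fixed exploration offset
td = update(actor, critic, (state, action, 1.0, (1.2, 0.4)))
```

Because the critic is only needed during training, a policy trained this way can be deployed with observations alone, which is what makes the asymmetry usable outside the simulator.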

arXiv.org e-Print Archive

Crossref

Robust e-Voting Composition

Authors: Anane Rachid; Cooke Richard
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/12/2011

Crossref

Coventry University Pure Portal

Linking students' timing of engagement to learning design and academic performance

Authors: Huptych Michal; Nguyen Quan; Rienties Bart
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2018

In recent years, the connection between Learning Design (LD) and Learning Analytics (LA) has been emphasized by many scholars as it could enhance our interpretation of LA findings and translate them to meaningful interventions. Together with numerous conceptual studies, a gradual accumulation of empirical evidence has indicated a strong connection between how instructors design for learning and student behaviour. Nonetheless, students' timing of engagement and its relation to LD and academic performance have received limited attention. Therefore, this study investigates to what extent students' timing of engagement aligned with instructor learning design, and how engagement varied across different levels of performance. The analysis was conducted over 28 weeks using trace data, on 387 students, and replicated over two semesters in 2015 and 2016. Our findings revealed a mismatch between how instructors designed for learning and how students studied in reality. In most weeks, students spent less time studying the assigned materials on the VLE than the number of hours recommended by instructors. The timing of engagement also varied, from studying in advance to catching-up patterns. High-performing students spent more time studying in advance, while low-performing students spent a higher proportion of their time on catching-up activities. This study reinforced the importance of pedagogical context to transform analytics into actionable insights.

Open Research Online (The Open University)

Oracles and query lower bounds in generalised probabilistic theories

Authors: Barnum Howard; Lee Ciarán M.; Selby John H.
Publication venue: Springer Science and Business Media LLC
Publication date: 27/07/2018

We investigate the connection between interference and computational power within the operationally defined framework of generalised probabilistic theories. To compare the computational abilities of different theories within this framework we show that any theory satisfying three natural physical principles possesses a well-defined oracle model. Indeed, we prove a subroutine theorem for oracles in such theories which is a necessary condition for the oracle to be well-defined. The three principles are: causality (roughly, no signalling from the future), purification (each mixed state arises as the marginal of a pure state of a larger system), and strong symmetry (existence of non-trivial reversible transformations). Sorkin has defined a hierarchy of conceivable interference behaviours, where the order in the hierarchy corresponds to the number of paths that have an irreducible interaction in a multi-slit experiment. Given our oracle model, we show that if a classical computer requires at least n queries to solve a learning problem, then the corresponding lower bound in theories lying at the kth level of Sorkin's hierarchy is n/k. Hence, lower bounds on the number of queries to a quantum oracle needed to solve certain problems are not optimal in the space of all generalised probabilistic theories, although it is not yet known whether the optimal bounds are achievable in general. Hence searches for higher-order interference are not only foundationally motivated, but constitute a search for a computational resource beyond that offered by quantum computation.

Comment: 17+7 pages. Comments Welcome. Published in special issue "Foundational Aspects of Quantum Information" in Foundations of Physics
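
As a worked instance of the n/k bound stated in this abstract: classical theory shows no interference and sits at the first level of Sorkin's hierarchy, while quantum theory shows second-order but no higher-order interference and sits at the second level. The symbol q_k (queries needed at level k) and the choice n = 12 are introduced here purely for illustration.

```latex
\[
  q_k \;\ge\; \frac{n}{k},
  \qquad
  \text{e.g. } n = 12:\quad
  q_1 \ge 12 \ (\text{classical}),\quad
  q_2 \ge 6 \ (\text{quantum}),\quad
  q_3 \ge 4 .
\]
```

These are lower bounds only; as the abstract notes, it is not yet known whether they are achievable in general.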

arXiv.org e-Print Archive

Copenhagen University Research Information System

Minimisation of Multiplicity Tree Automata

Authors: Kiefer Stefan; Marusic Ines; Worrell James
Publication date: 01/01/2017

We consider the problem of minimising the number of states in a multiplicity tree automaton over the field of rational numbers. We give a minimisation algorithm that runs in polynomial time assuming unit-cost arithmetic. We also show that a polynomial bound in the standard Turing model would require a breakthrough in the complexity of polynomial identity testing by proving that the latter problem is logspace equivalent to the decision version of minimisation. The developed techniques also improve the state of the art in multiplicity word automata: we give an NC algorithm for minimising multiplicity word automata. Finally, we consider the minimal consistency problem: does there exist an automaton with n states that is consistent with a given finite sample of weight-labelled words or trees? We show that this decision problem is complete for the existential theory of the rationals, both for words and for trees of a fixed alphabet rank.

Comment: Paper to be published in Logical Methods in Computer Science. Minor editing changes from previous version

arXiv.org e-Print Archive

CiteSeerX

Episciences.org

Oxford University Research Archive

QML-Morven : A Novel Framework for Learning Qualitative Models

Authors: Coghill George M.; Pang Wei
Publication venue: Department of Computing Science, University of Aberdeen
Publication date: 01/06/2012

Publisher PD

Aberdeen University Research

Heriot Watt Pure