38 research outputs found

    Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning

    A mechanism called Eligibility Propagation is proposed to speed up the Time Hopping technique used for faster Reinforcement Learning in simulations. Eligibility Propagation provides Time Hopping with abilities similar to those that eligibility traces provide for conventional Reinforcement Learning: it propagates values from one state to all of its temporal predecessors using a state-transition graph. Experiments on a simulated biped crawling robot confirm that Eligibility Propagation accelerates the learning process more than 3 times.
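
    The backward propagation described above lends itself to a short sketch. Below is a minimal illustration, assuming the predecessor graph is kept as a dict of sets built from observed transitions; the function name, decay scheme, and stopping rule are illustrative assumptions, not the paper's exact algorithm.

```python
from collections import deque

def propagate(V, predecessors, start, gamma=0.95, trace_decay=0.8, tol=1e-4):
    """Push the fresh value of `start` back through the state-transition
    graph to all of its temporal predecessors, discounting the influence
    at every backward step, much as an eligibility trace would."""
    queue = deque([(start, 1.0)])
    while queue:
        state, eligibility = queue.popleft()
        for pred in predecessors.get(state, ()):
            target = gamma * V.get(state, 0.0)        # one-step backed-up value
            delta = eligibility * (target - V.get(pred, 0.0))
            if abs(delta) > tol:                      # prune once the influence fades
                V[pred] = V.get(pred, 0.0) + delta
                queue.append((pred, eligibility * trace_decay))

# Example: s0 -> s1 -> s2; one update at s2 reaches both predecessors.
V = {"s2": 1.0}
propagate(V, {"s2": {"s1"}, "s1": {"s0"}}, "s2")
```

    Because the per-hop influence decays geometrically, the propagation terminates even when the transition graph contains cycles.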

    Managing uncertainty in sound based control for an autonomous helicopter

    In this paper we present our ongoing research using a multi-purpose, small, and low-cost autonomous helicopter platform (Flyper). We build on previously achieved stable control using evolutionary tuning. We propose a sound-based supervised method to localise the indoor helicopter and extract meaningful information that enables the helicopter to further stabilise its flight and correct its flight path. Because of the high amount of uncertainty in the data, we propose the use of fuzzy logic in the signal processing of the sound signature. We discuss the benefits and difficulties of using type-1 and type-2 fuzzy logic in this real-time system and give an overview of our proposed system.
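
    As a rough picture of what type-1 fuzzy inference over a noisy sound-derived estimate can look like, the sketch below maps a bearing-error reading to a yaw command with three triangular sets and a weighted-average defuzzifier. The membership ranges and rule outputs are invented for illustration and are not the paper's controller.

```python
def tri(x, a, b, c):
    """Triangular type-1 membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def yaw_correction(bearing_error_deg):
    """Blend three fuzzy rules on a noisy bearing error (degrees) into a
    normalised yaw command via a weighted-average defuzzifier."""
    rules = [
        (tri(bearing_error_deg, -90.0, -45.0, 0.0), -1.0),  # well left: turn right
        (tri(bearing_error_deg, -45.0, 0.0, 45.0),   0.0),  # on course: hold
        (tri(bearing_error_deg, 0.0, 45.0, 90.0),   +1.0),  # well right: turn left
    ]
    total = sum(w for w, _ in rules)
    return sum(w * out for w, out in rules) / total if total else 0.0
```

    A type-2 version would model additional uncertainty in the memberships themselves, at extra computational cost, which is the kind of real-time trade-off the paper discusses.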

    Robustness analysis of evolutionary controller tuning using real systems

    A genetic algorithm (GA) presents an excellent method for controller parameter tuning. In our work, we evolved the heading as well as the altitude controller for a small lightweight helicopter. We use the real flying robot to evaluate the GA's individuals rather than an artificially consistent simulator. By doing so we avoid the "reality gap" encountered when taking a controller from the simulator to the real world. In this paper we analyze the evolutionary aspects of this technique and discuss the issues that need to be considered for it to perform well and result in robust controllers.
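
    A minimal sketch of the technique, assuming a placeholder fitness function evaluate_on_robot that flies one real trial with the candidate gains and returns a score (the name and interface are assumptions, not the authors' code).

```python
import random

def evolve(evaluate_on_robot, n_params=4, pop_size=10, generations=20,
           sigma=0.1, n_elite=2):
    """Tune a vector of controller gains with a small GA whose fitness
    comes from a real flight trial instead of a simulator."""
    pop = [[random.uniform(0.0, 1.0) for _ in range(n_params)]
           for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=evaluate_on_robot, reverse=True)
        elite = ranked[:n_elite]                    # best controllers survive intact
        children = []
        while len(children) < pop_size - n_elite:
            mum, dad = random.sample(elite, 2)      # pick two elite parents
            cut = random.randrange(1, n_params)     # one-point crossover
            children.append([g + random.gauss(0.0, sigma)   # Gaussian mutation
                             for g in mum[:cut] + dad[cut:]])
        pop = elite + children
    return max(pop, key=evaluate_on_robot)
```

    Each fitness call here is a real flight, so evaluations are slow and noisy; small populations and surviving elites, as above, keep the number of trials the hardware must endure manageable.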

    CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments

    In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of varying controllability, and where an apt agent, Bob, acts independently, with non-observable intentions. We argue that this setting defines a realistic scenario and we present a generic discrete-state, discrete-action model of such environments. To learn in this environment, we propose an unsupervised reinforcement learning agent called CLIC, for Curriculum Learning and Imitation for Control. CLIC learns to control individual objects in its environment and imitates Bob's interactions with these objects. It selects which objects to focus on when training and imitating by maximizing its learning progress. We show that CLIC is an effective baseline in our new setting. It can effectively observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. It can also follow Bob when he acts as a mentor and provides ordered demonstrations. Finally, when Bob controls objects that the agent cannot, or in the presence of a hierarchy between objects in the environment, we show that CLIC ignores non-reproducible and already-mastered interactions with objects, resulting in a greater benefit from imitation.
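
    The selection mechanism can be sketched as follows. The windowed competence-difference below is one common way to estimate learning progress and is an assumption here, not necessarily CLIC's exact measure; the class interface is likewise invented for illustration.

```python
import random

class ProgressSelector:
    """Pick the object whose competence is improving fastest (absolute
    learning progress), with a little epsilon-exploration mixed in."""

    def __init__(self, n_objects, window=10, eps=0.1):
        self.history = [[] for _ in range(n_objects)]
        self.window, self.eps = window, eps

    def record(self, obj, competence):
        self.history[obj].append(competence)        # e.g. recent success rate

    def progress(self, obj):
        h = self.history[obj][-self.window:]
        if len(h) < 2:
            return float("inf")                     # unexplored objects look promising
        mid = len(h) // 2
        return abs(sum(h[mid:]) / len(h[mid:]) - sum(h[:mid]) / mid)

    def select(self):
        if random.random() < self.eps:
            return random.randrange(len(self.history))
        return max(range(len(self.history)), key=self.progress)
```

    Unexplored objects report infinite progress, so the selector naturally visits everything once before concentrating on whatever is currently being learned fastest; objects that are fully mastered (or hopeless) show flat competence and fall out of focus.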

    Autonomous Vehicle Coordination with Wireless Sensor and Actuator Networks

    A coordinated team of mobile wireless sensor and actuator nodes can bring numerous benefits to various applications in the fields of cooperative surveillance, mapping of unknown areas, disaster management, automated highways, and space exploration. This article explores the idea of mobile nodes using vehicles on wheels, augmented with wireless, sensing, and control capabilities. One of the vehicles acts as the leader and is remotely driven by the user; the others are followers. Each vehicle has a low-power wireless sensor node attached, featuring a 3D accelerometer and a magnetic compass. Speed and orientation are computed in real time using inertial navigation techniques. The leader periodically transmits these measures to the followers, which implement a lightweight fuzzy logic controller to imitate the leader's movement pattern. We report in detail on all development phases, covering design, simulation, controller tuning, inertial sensor evaluation, calibration, scheduling, fixed-point computation, debugging, benchmarking, field experiments, and lessons learned.
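
    A lightweight fuzzy follower rule of the kind described above might look like the sketch below; the membership ranges, the gap-error input, and the rule outputs are illustrative assumptions rather than the article's controller.

```python
def tri(x, a, b, c):
    """Triangular fuzzy membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def follower_speed(leader_speed, gap_error_m):
    """Blend three rules on the spacing error (measured gap minus desired
    gap, in metres) into a speed command that tracks the leader."""
    rules = [
        (tri(gap_error_m, -2.0, -1.0, 0.0), leader_speed * 0.5),  # too close: back off
        (tri(gap_error_m, -1.0,  0.0, 1.0), leader_speed),        # in formation: match
        (tri(gap_error_m,  0.0,  1.0, 2.0), leader_speed * 1.5),  # behind: catch up
    ]
    total = sum(w for w, _ in rules)
    return sum(w * out for w, out in rules) / total if total else leader_speed
```

    Rules of this shape reduce to a few multiplications and additions, which also makes them straightforward to port to the fixed-point arithmetic the article mentions for low-power sensor nodes.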