
    Reinforcement Learning

    Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or directly proposing and performing actions. Learning is a very important aspect of it. This book is about reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that it is already widely used in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators remains to specify goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in the field.
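The core idea, selecting actions by trial and error so as to achieve a goal, can be sketched with tabular Q-learning on a toy task. Everything below (the corridor environment, constants, names) is invented for illustration and not taken from the book:

```python
import random

# Tabular Q-learning on a five-state corridor: start somewhere left of the
# goal, learn that stepping right is the way to reach it.
N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)                 # step left / step right
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.3

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = min(max(state + action, 0), N_STATES - 1)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL

random.seed(0)
for _ in range(500):                                   # training episodes
    s, done, steps = random.randrange(GOAL), False, 0
    while not done and steps < 100:
        # epsilon-greedy action selection
        a = random.choice(ACTIONS) if random.random() < EPS \
            else max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, done = step(s, a)
        target = r + GAMMA * max(Q[(s2, x)] for x in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])      # temporal-difference update
        s, steps = s2, steps + 1

# After training, the greedy policy should head right toward the goal
# from every non-goal state.
policy = {s: max(ACTIONS, key=lambda x: Q[(s, x)]) for s in range(GOAL)}
```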

    Decisional issues during human-robot joint action

    In the future, robots will become our companions and co-workers. They will gradually appear in our environment, helping elderly or disabled people or performing repetitive or unsafe tasks. However, we are still far from a truly autonomous robot, one able to act in a natural, efficient and safe manner with humans. To endow robots with the capacity to act naturally with humans, it is important first to study how humans act together. Consequently, this manuscript starts with a state of the art on joint action in psychology and philosophy before presenting how the principles gained from this study are applied to human-robot joint action. We then describe the supervision module for human-robot interaction developed during the thesis. Part of the work presented in this manuscript concerns the management of what we call a shared plan: a partially ordered set of actions to be performed by humans and/or the robot for the purpose of achieving a given goal. First, we present how the robot estimates the beliefs of its human partners concerning the shared plan (called mental states) and how it takes these mental states into account during shared plan execution. This allows it to communicate judiciously about potential divergences between its own beliefs and those of its human partners. Second, we present the abstraction of shared plans and the postponing of some decisions. Indeed, in previous work the robot took all decisions at planning time (who should perform which action, which object to use…), which could be perceived as unnatural by the human during execution, since it imposes one solution in preference to any other. This work endows the robot with the capacity to identify which decisions can be postponed until execution time and to take the right decision according to the human's behaviour, in order to obtain fluent and natural robot behaviour.
The complete system of shared plan management has been evaluated in simulation and with real robots in the context of a user study. Thereafter, we present our work concerning the non-verbal communication needed for human-robot joint action. This work focuses on how to manage the robot's head, which allows it to transmit information about its current activity and its understanding of the human's actions, as well as coordination signals. Finally, we present how to combine planning and learning in order to make the robot more efficient in its decision process. The idea, inspired by neuroscience studies, is to limit the use of planning (which is adapted to the human-aware context but costly) by letting the learning module make the choices when the robot is in a "known" situation. The first results obtained demonstrate the potential of the proposed solution.
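The planning-versus-learning arbitration described above can be sketched as a simple dispatcher that only invokes a costly planner in unfamiliar situations and reuses learned choices once a situation is "known". The class, threshold and names are hypothetical, not taken from the thesis:

```python
from collections import defaultdict

KNOWN_THRESHOLD = 3        # visits before a situation counts as "known"

class Arbiter:
    """Route decisions to a cheap learned cache or a costly planner."""

    def __init__(self, planner):
        self.planner = planner           # expensive, human-aware planner
        self.cache = {}                  # situation -> chosen action
        self.visits = defaultdict(int)

    def decide(self, situation):
        self.visits[situation] += 1
        if situation in self.cache and self.visits[situation] > KNOWN_THRESHOLD:
            return self.cache[situation]         # "known" situation: reuse
        action = self.planner(situation)         # unfamiliar: plan (costly)
        self.cache[situation] = action
        return action
```

In this sketch the expensive planner runs only during the first few encounters with a situation; afterwards the cached choice is returned immediately, mirroring the idea of limiting planning to unknown situations.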

    Robo-CAMAL: anchoring in a cognitive robot

    The CAMAL architecture (Computational Architectures for Motivation, Affect and Learning) provides an excellent framework within which to explore and investigate issues relevant to cognitive science and artificial intelligence. This thesis describes a small sub-element of the CAMAL architecture that has been implemented on a mobile robot. The first area of investigation within this research relates to the anchoring problem. Can the robotic agent generate symbols based on responses within its perceptual systems, and can it reason about its environment based on those symbols? Given that the agent can identify changes within its environment, can it then adapt its behaviour and alter its goals to mirror those changes? The second area of interest involves agent learning. The agent has a domain model that details its goals, the actions it can perform and some of the possible environmental states it may encounter. The agent is not provided with the belief-goal-action combinations needed to achieve its goals, and it is unaware of the effect its actions have upon its environment. Can the agent experiment with its behaviour to generate its own belief-goal-action combinations that allow it to achieve its goals? A second, related problem involves the case where the belief-goal-action combinations are pre-programmed, i.e. the agent is provided with several different methods with which to achieve a specific goal. Can the agent learn which combination is best? This thesis describes the sub-element of the CAMAL architecture that was developed for a robot (robo-CAMAL). It also demonstrates how robo-CAMAL solves the anchoring problem and learns how to act and adapt in its environment.
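The second learning problem above, picking the best of several pre-programmed methods for one goal, can be sketched as an epsilon-greedy bandit. The method names and success probabilities below are invented for illustration and are not from the thesis:

```python
import random

# Each pre-programmed method succeeds with some unknown probability;
# the agent learns running success-rate estimates and exploits the best.
METHODS = {"follow_wall": 0.3, "go_direct": 0.8, "spiral_search": 0.5}
EPS, TRIALS = 0.2, 5000

random.seed(1)
value = {m: 0.0 for m in METHODS}   # running success-rate estimates
count = {m: 0 for m in METHODS}

for _ in range(TRIALS):
    if random.random() < EPS:                        # explore a random method
        m = random.choice(list(METHODS))
    else:                                            # exploit the current best
        m = max(value, key=value.get)
    success = random.random() < METHODS[m]           # simulated trial outcome
    count[m] += 1
    value[m] += (success - value[m]) / count[m]      # incremental mean update

best = max(value, key=value.get)
```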

    Reinforcement Learning-Based Adaptive Model Predictive Power Pinch Analysis Systems-Level Energy Management Approach to Uncertainty in Isolated Hybrid Energy Storage Systems

    Ph.D. Thesis

    Hybrid energy storage systems (HESS) involve the integration of multiple energy storage technologies with different, complementary characteristics. They are significantly advantageous compared to a single energy storage system and can greatly improve the reliability of intermittent renewable energy sources (RES). Aside from the advantages HESS offer, the control and coordination of the multiple energy storage units and the vital elements of the system via an optimised energy management strategy (EMS) involves increased computational time. Nevertheless, a systems-level graphical EMS for HESS based on Power Pinch Analysis (PoPA), a tool with a low computational burden, was recently proposed. In this approach, the EMS, which effectively resolves deficit and excess energy objectives, is realised via the graphical PoPA tool, the power grand composite curve (PGCC). The PGCC is essentially a plot of the integrated energy demands and sources in the system as a function of time. Despite its proven success, accounting for uncertainty with PoPA is a cogent research question, since PoPA assumes ideal day-ahead (DA) generation and load profile forecasts. The major contribution of this thesis is therefore the proposal of several graphical and reinforcement-learning-based ‘adaptive’ PoPA EMSs to address the issue of uncertainty. Firstly, to counteract the combined effect of uncertainty with PoPA, an Adaptive PoPA EMS for a standalone HESS has been proposed. In the Adaptive PoPA, the PGCC is implemented within a receding-horizon model predictive framework, with the current output state of the energy storage (in this case the battery) used as control feedback to derive an updated EMS sequence, inferred via PGCC shaping.
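The PGCC described above, the running integral of generated minus demanded energy over the horizon, can be sketched numerically. The hourly figures below are invented for illustration:

```python
from itertools import accumulate

source = [0, 0, 2, 5, 6, 5, 2, 0]   # kWh generated per hour (e.g. PV)
demand = [1, 1, 1, 2, 2, 3, 3, 2]   # kWh consumed per hour

net = [s - d for s, d in zip(source, demand)]
pgcc = list(accumulate(net))        # power grand composite curve

# The curve's minimum is the largest accumulated deficit, i.e. the energy the
# storage (or outsourced supply) must cover before surpluses arrive; the final
# value is the net surplus left at the end of the horizon.
largest_deficit = min(pgcc)
final_surplus = pgcc[-1]
```

Shifting or reshaping this curve (PGCC shaping) is what the EMS manipulates when scheduling charging and discharging.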
Additionally, during the control and operation of the HESS, the PGCC is recomputed only when forecast uncertainty causes the error between the real and estimated battery state of charge to exceed an arbitrarily chosen threshold of 5%. Secondly, a Kalman filter for the optimal estimation of uncertainty distributed as a Gaussian is integrated into the Adaptive PoPA in order to recursively predict the state of charge of the battery based on the likelihood of uncertainty. Thus, by anticipating the effect of uncertainty, the Kalman filter Adaptive PoPA offers an improvement over the Adaptive PoPA, particularly when the uncertainty is Gaussian. The algorithm is more sophisticated than the Adaptive PoPA, yet remains computationally efficient and adds a preventive measure. Furthermore, a tabular Dyna-Q learning algorithm, a reinforcement learning method in which a learning agent solves a discrete Markov decision process by maximising an expected reward in accordance with Bellman optimality, is integrated within the Power Pinch Analysis. Thereafter, a deep neural network is used to approximate the Q-learning table. These methods, listed here in order of computational time, can be deployed with only minimal historical data, such as the average load profile or base load data and a solar irradiance forecast, to produce a deterministic solution. Nevertheless, this thesis also proposes a probabilistic adaptive PoPA strategy based on a (recursive least squares) Monte Carlo simulation chance-constrained framework, for cases where a sufficient amount of historical data, such as the probability distribution of the uncertain model parameters, is available.
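The Gaussian filtering step and the 5% re-planning trigger can be sketched as a one-dimensional Kalman filter on the battery state of charge (SoC). The 5% threshold follows the text; the variances, function names and numbers are illustrative assumptions:

```python
def kalman_soc(soc_est, var_est, soc_meas, planned_delta,
               process_var=0.01, meas_var=0.04):
    """One Kalman predict/update cycle on a scalar SoC estimate."""
    # Predict: apply the EMS-planned charge/discharge, inflate uncertainty.
    soc_pred = soc_est + planned_delta
    var_pred = var_est + process_var
    # Update: blend the prediction with the measured SoC.
    gain = var_pred / (var_pred + meas_var)
    soc_new = soc_pred + gain * (soc_meas - soc_pred)
    var_new = (1 - gain) * var_pred
    return soc_new, var_new

def needs_replanning(soc_real, soc_est, threshold=0.05):
    """Recompute the PGCC only when the SoC error exceeds the 5% threshold."""
    return abs(soc_real - soc_est) > threshold
```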
The probabilistic approach is undoubtedly more computationally intensive than the deterministic methods presented, but it offers a much more realistic solution to the problem of uncertainty. To enhance the probabilistic adaptive PoPA, an actor-critic deep neural network reinforcement learning agent is incorporated. The six methods are evaluated against the DA PoPA on an actual isolated HESS microgrid built in Greece, with respect to violation of the energy storage operating constraints and reduction of the carbon emission footprint.

Petroleum Technology Development Fund (PTDF)