
    Performing a piece collecting task with a Q-Learning agent

    Since the early days of Artificial Intelligence (AI), researchers have tried to design intelligent machines capable of performing specific tasks with few instructions. In the 1950s, Machine Learning (ML) emerged and proposed that the goal might not be to design intelligent machines, but machines able to learn from data. Within ML, Reinforcement Learning (RL) focuses on designing machines, referred to as agents, that learn not from external data but from data derived from the machine's own experiences. The key concept of RL is to make agents learn by providing them with rewards depending on the outcome of each of their experiences. Many studies have proposed different approaches to RL systems and found applications in the industrial and manufacturing domain, such as supply chain management, robot navigation and control, and chemical reaction optimization.

    The main aim of this thesis is to design an agent with a behaviour based on Reinforcement Learning, capable of performing tasks that could be extrapolated to activities and processes in an industrial environment. Specifically, the studied activity is the navigation control of a robot tasked with collecting pieces placed in a two-dimensional environment. The algorithm used to guide the agent's learning process is one of the best-known and most widely used RL methods, Q-Learning. An Artificial Neural Network (ANN) structure, the MultiLayer Perceptron (MLP), is used to approximate the values the agent relies on to decide which action to take in each situation. The experiments are designed to validate the agent's capability to perform the task and to compare the effects and results of the several improvements implemented. The results validate the agent's capacity to perform the task with acceptable results, but indicate that the agent is able to collect all the pieces in different environment configurations only when the improvements are implemented. These improvements are the addition of an experience replay memory and an observation strategy, thanks to which the agent knows what surrounds it. During the experimentation, comparisons between environment configurations and task complexities are made.
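    The combination described above, a Q-Learning update applied to transitions drawn from an experience replay memory, can be sketched in miniature. The toy below is purely illustrative: it uses a one-dimensional corridor with a single piece and a tabular Q function rather than the thesis's two-dimensional environment and MLP approximator, and all sizes and hyperparameters are assumed for the example.

    ```python
    import random
    from collections import deque

    # Illustrative sketch only: a 1-D piece-collecting toy, not the thesis's
    # actual 2-D environment or MLP value approximator. The agent moves
    # left/right along a line of cells and is rewarded when it reaches the
    # piece. It demonstrates the Q-Learning update combined with an
    # experience replay memory.

    SIZE = 5                            # number of cells; piece sits in the last one
    ACTIONS = (-1, +1)                  # move left / move right
    ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1   # learning rate, discount, exploration rate

    def step(state, action):
        """Apply a move; reward 1 and terminate when the piece is collected."""
        nxt = max(0, min(SIZE - 1, state + action))
        done = (nxt == SIZE - 1)
        return nxt, (1.0 if done else 0.0), done

    def train(episodes=300, seed=0):
        rng = random.Random(seed)
        q = [[0.0, 0.0] for _ in range(SIZE)]   # Q[state][action index]
        memory = deque(maxlen=200)              # experience replay buffer

        for _ in range(episodes):
            state, done = 0, False
            for _ in range(100):                # cap episode length
                if done:
                    break
                # epsilon-greedy action selection with random tie-breaking
                if rng.random() < EPS:
                    a = rng.randrange(2)
                else:
                    best = max(q[state])
                    a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
                nxt, reward, done = step(state, ACTIONS[a])
                memory.append((state, a, reward, nxt, done))
                # replay a small batch: the Q-Learning update is applied to
                # stored transitions, not only the most recent one
                for s, ai, r, sn, d in rng.sample(memory, min(8, len(memory))):
                    target = r if d else r + GAMMA * max(q[sn])
                    q[s][ai] += ALPHA * (target - q[s][ai])
                state = nxt
        return q

    q = train()
    ```

    After training, the learned values should favour moving toward the piece from every non-terminal cell, which is the behaviour the replay memory helps stabilise by reusing past transitions instead of discarding them after a single update.
    
    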