8 research outputs found
Two steps Natural Actor Critic Learning for Underwater Cable Tracking
Abstract: This paper proposes a field application of a high-level Reinforcement Learning (RL) control system for solving the action selection problem of an autonomous robot in a cable tracking task. The underwater vehicle ICTINEU AUV learns to perform a vision-based cable tracking task in a two-step learning process. First, a policy is computed in simulation, where a hydrodynamic model of the vehicle simulates the cable following task. Once the simulated results are accurate enough, in a second step, the learned-in-simulation policy is transferred to the vehicle, where the learning procedure continues in a real environment, improving the initial policy. The natural actor-critic (NAC) algorithm has been selected to solve the problem in both steps. This algorithm aims to take advantage of policy gradient and value function techniques for fast convergence. The actor's policy gradient gives convergence guarantees under function approximation and partial observability, while the critic's value function reduces the variance of the estimate updates, improving the convergence process.
eXiTCDSS: A framework for a workflow-based CBR for interventional Clinical Decision Support Systems and its application to TAVI
This work has been financed by the Spanish Government Commission Ministerio de Industria, Turismo y Comercio (MITyC) under the project PLAN AVANZA 2, labeled by Information Technology for European Advancement 2 (ITEA2). This research project has also been partially funded through the project labeled DPI2011-24929. Clinical Decision Support Systems (CDSSs) should form an important part of the field of clinical knowledge management technologies through their capacity to support the clinical process and the use of knowledge, including knowledge maintenance and continuous learning, from diagnosis and investigation through surgery, treatment and long-term care. The work presented shows a workflow-based CDSS designed to give case-specific assessment to clinicians during complex surgery or Minimally Invasive Surgeries (MISs). Following a perioperative workflow, the designed software uses a Case-Based Reasoning (CBR) methodology to retrieve similar past cases from a case base and provide support at any particular point of the process. The graphical user interface allows easy navigation through the whole support process, from the initial configuration steps to the final results, which are organized as sets of experiments and visualized in a user-friendly way. The eXiTCDSS tool is presented giving support to a complex minimally invasive surgery that has recently been receiving growing attention, Transcatheter Aortic Valve Implantation (TAVI). The results obtained are presented on the basis of a real TAVI case base of 82 patients operated on at Rennes University Hospital.
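The retrieval step of a CBR cycle, as described in the abstract above, can be illustrated with a minimal sketch. It assumes that each past case is reduced to a tuple of normalized numeric features and that similarity is plain Euclidean distance; the `Case` class, the feature values and `retrieve_similar` are hypothetical names for illustration and are not taken from eXiTCDSS.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Case:
    """A past case: hypothetical normalized features plus a recorded outcome."""
    case_id: str
    features: Tuple[float, ...]
    outcome: str


def retrieve_similar(case_base: List[Case], query: Tuple[float, ...], k: int = 3) -> List[Case]:
    """Return the k cases whose features are closest to the query (Euclidean distance)."""
    return sorted(case_base, key=lambda c: math.dist(c.features, query))[:k]


# Toy case base; the values are made up for illustration only.
case_base = [
    Case("A", (0.2, 0.7), "no complication"),
    Case("B", (0.9, 0.1), "complication"),
    Case("C", (0.3, 0.6), "no complication"),
]

nearest = retrieve_similar(case_base, (0.2, 0.68), k=2)
print([c.case_id for c in nearest])
```

A real system would likely weight features and use domain-specific similarity measures; this sketch only shows the nearest-neighbour idea behind case retrieval.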
Policy-based reinforcement learning methods and their application to robotics
Research project developed from a stay at the Robot Locomotion Group of the Massachusetts Institute of Technology, United States, between March and August 2006. It describes the work carried out in the field of reinforcement learning (RL), a methodology widely used in machine learning. In RL, an agent tries to maximize a scalar value (punishment or reward) obtained as a result of its interaction with the environment. The goal of an RL-based system is to find an optimal policy that maps the state of the environment to the action that maximizes the sum of future rewards. The main advantage is that it does not use any database of known examples, so the agent receives no information about which decision to make, as happens in many types of learning, but must instead explore to discover the actions with the highest value, which makes it well suited to applied robotics. The main disadvantages are often long convergence times and the lack of generalization when dealing with continuous variables. The work focused mainly on the study of new and more complex RL-based methodologies that combine two types of algorithms: those based on value functions and those represented solely by a policy. Their applicability to real robotic applications was then analyzed. All the studies and simulations used a robotic arm designed and built in the laboratory. This type of robot, called the Acrobot, is a test bench widely used in the fields of control theory and learning.
An Innovative Low Cost Educational Underwater Robotics Platform for Promoting Engineering Interest among Secondary School Students
This article describes the design features of an educational robotics project aimed at secondary school students and carried out at the University of Girona (UdG). In the project, called Underwater Robotics Workshop, the students build an underwater exploration robotic vehicle using low-cost materials. Its ultimate objective is to promote interest in engineering among students and motivate them to direct their future studies towards engineering degrees. The main purpose of this article is to describe this activity and to promote it. Versatility and adaptation are key values, as the activity has been designed to be adapted as convenient or replicated. It continues the work of previously published articles, now describing different technological adaptations related to the design of the vehicle's controller and the experience gathered from the additional workshop editions held in recent years. The workshop has been defined as a project-based learning approach where the students learn about physics, engineering, electronics, programming, and robotics, as well as how to use all kinds of working tools, according to the maker philosophy. To date, the opinions collected from the participants encourage continuation of the activity and, at the same time, ask for the introduction of novelties to keep the workshop updated with the contents of the subjects related to technology and sciences. The project has been held at the UdG for more than 13 years. More than 800 secondary school students have participated in the activity, building about 200 underwater vehicles in more than 50 editions of the workshop.
Evaluating Auction Mechanisms for the Preservation of Cost-Aware Digital Objects under Constrained Digital Preservation Budgets
Digital preservation is a field of research focused on designing strategies for keeping digital objects accessible for general use in the coming years. Among the many approaches to digital preservation, the present research article continues a previously published article that proposed a novel object-centered paradigm for the digital preservation problem, in which digital objects share part of the responsibility for self-preservation. In the new framework, the behavior of digital objects is modeled to find the best preservation strategy. The results presented in the current article add a new economic constraint to the object behavior. Unlike in the previous paper, migrations, copies and updates are no longer free to use but are subject to budget limitations that ensure the economic sustainability of the whole preservation system, forcing the now-called cost-aware digital objects to manage the available budget efficiently. The presented approach compares two auction-based mechanisms, a multi-unit auction and a combinatorial auction, with a simple direct purchase strategy as possible efficient behaviors for budget management. TiM, a simulated environment for running distributed digital ecosystems, is used to perform the experiments. The simulated results map the relation between the studied purchase models and the sustained quality level of digital objects, as a measure of their accessibility, together with their budget management capabilities. The best performance corresponds to the combinatorial auction model. The results offer a good approach to the digital preservation problem from a sustainability point of view and open the door to future implementations with other purchase strategies.
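To make the idea of a multi-unit auction concrete, the sketch below allocates a limited number of identical units (for example, hypothetical migration slots) to the highest bidders first. It assumes sealed bids of the form (bidder, price per unit, units wanted) and a greedy highest-price-first rule; the function name, the bid format and the values are assumptions for illustration, not the mechanism evaluated in the article.

```python
from typing import Dict, List, Tuple

# A bid is (bidder name, price offered per unit, number of units wanted).
Bid = Tuple[str, float, int]


def multi_unit_auction(bids: List[Bid], units_available: int) -> Dict[str, int]:
    """Greedy multi-unit allocation: serve bids in descending price order
    until the available units run out."""
    allocation: Dict[str, int] = {}
    remaining = units_available
    for bidder, _price, wanted in sorted(bids, key=lambda b: b[1], reverse=True):
        if remaining == 0:
            break
        granted = min(wanted, remaining)
        allocation[bidder] = granted
        remaining -= granted
    return allocation


# Toy example with made-up digital objects and prices.
bids = [("obj1", 5.0, 2), ("obj2", 8.0, 1), ("obj3", 3.0, 2)]
result = multi_unit_auction(bids, units_available=3)
print(result)
```

A combinatorial auction differs in that bidders bid on bundles of distinct items, which turns winner determination into a harder optimization problem than this greedy pass.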
Direct gradient-based reinforcement learning for robot behavior learning
Abstract: Autonomous Underwater Vehicles (AUVs) represent a challenging control problem with complex, noisy dynamics. Nowadays, not only the continuous scientific advances in underwater robotics but also the increasing number of subsea missions and their complexity call for the automation of submarine processes. This paper proposes a high-level control system for solving the action selection problem of an autonomous robot. The system is characterized by the use of Reinforcement Learning Direct Policy Search methods (RLDPS) for learning the internal state/action mapping of some behaviors. We demonstrate its feasibility with simulated experiments using the model of our underwater robot URIS in a target following task.
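As a rough illustration of what a direct, gradient-based policy search does, the sketch below performs gradient ascent on the expected reward of a softmax policy over two actions. It is a toy problem with a known reward per action and an exact gradient, so it is an assumption-laden simplification and not the RLDPS algorithm or the URIS model from the paper; all names and values are hypothetical.

```python
import math
from typing import List


def softmax(theta: List[float]) -> List[float]:
    """Action probabilities of a softmax policy with parameters theta."""
    m = max(theta)  # subtract the max for numerical stability
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient_step(theta: List[float], rewards: List[float], lr: float) -> List[float]:
    """One gradient-ascent step on J(theta) = sum_a pi(a) * r(a).
    For a softmax policy, dJ/dtheta_a = pi(a) * (r(a) - J)."""
    probs = softmax(theta)
    expected = sum(p * r for p, r in zip(probs, rewards))
    grad = [p * (r - expected) for p, r in zip(probs, rewards)]
    return [t + lr * g for t, g in zip(theta, grad)]


# Toy problem: action 1 yields more reward than action 0 (made-up values).
rewards = [0.2, 1.0]
theta = [0.0, 0.0]  # start from a uniform policy
for _ in range(200):
    theta = policy_gradient_step(theta, rewards, lr=0.5)

probs = softmax(theta)
print(probs)  # the policy should now strongly prefer action 1
```

In a real robot task the rewards are not known in advance, so the gradient has to be estimated from sampled trajectories, which is where the variance issues mentioned in the first abstract arise.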
Analysis of Nature-Inspired Algorithms for Long-Term Digital Preservation
Digital preservation is a research area devoted to keeping digital assets preserved and usable for many years. Out of the many approaches to digital preservation, the present research article follows a new object-centered digital preservation paradigm where digital objects share part of the responsibility for preservation: they can move, replicate, and evolve to a higher-quality format inside a digital ecosystem. In the new framework, the behavior of digital objects needs to be modeled in order to obtain the best preservation strategy. Thus, digital objects are programmed with the mission of their own long-term self-preservation, which entails being accessible and reproducible by users at any time in the future regardless of frequent technological changes due to software and hardware upgrades. Three nature-inspired computational intelligence algorithms, based on the collective behavior of decentralized and self-organized systems, were selected for the modeling approach: multipopulation genetic algorithm, ant colony optimization, and a virus-based algorithm. TiM, a simulated environment for running distributed digital ecosystems, was used to perform the experiments. The results map the relation between the models and the expected object diversity obtained in short- and mid-term digital preservation scenarios. Comparing the results, the best performance corresponded to the multipopulation genetic algorithm. The article aims to be a first step in the digital self-preservation field. Building nature-inspired model behaviors is a good approach and opens the door to future tests with other AI-based methods