58 research outputs found

    Probabilistic policy reuse for safe reinforcement learning

    This work introduces Policy Reuse for Safe Reinforcement Learning, an algorithm that combines Probabilistic Policy Reuse and teacher advice for safe exploration in dangerous reinforcement learning problems with continuous state and action spaces, in which the dynamic behavior is reasonably smooth and the space is Euclidean. The algorithm uses a continuous, monotonically increasing risk function that identifies the probability of ending up in failure from a given state. This risk function is defined in terms of how far the state is from the region of the state space known to the learning agent. Probabilistic Policy Reuse is used to safely balance the exploitation of knowledge already learned, the exploration of new actions, and requests for teacher advice in parts of the state space considered dangerous. Specifically, the pi-reuse exploration strategy is used. Using experiments in the helicopter hover task and a business management problem, we show that the pi-reuse exploration strategy can be used to completely avoid visits to undesirable situations while maintaining the performance (in terms of the classical long-term accumulated reward) of the final policy achieved. This paper has been partially supported by the Spanish Ministerio de Economía y Competitividad TIN2015-65686-C5-1-R and the European Union’s Horizon 2020 Research and Innovation programme under Grant Agreement No. 730086 (ERGO). Javier García is partially supported by the Comunidad de Madrid (Spain) funds under the project 2016-T2/TIC-1712.
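
A minimal sketch of the idea in the abstract above: risk grows with the distance from known states, and riskier states make a request for teacher advice more likely. The logistic shape of `risk`, its `threshold` and `steepness` parameters, the `epsilon` exploration rate, and the function names are all illustrative assumptions; the abstract only states that the risk function is monotonically increasing in the distance to the known state space.

```python
import math
import random


def risk(distance, threshold=1.0, steepness=5.0):
    """Risk between 0 and 1 that increases monotonically with distance.

    The logistic curve and both parameters are assumptions for illustration.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (distance - threshold)))


def nearest_distance(state, known_states):
    """Euclidean distance from `state` to the closest known state."""
    return min(math.dist(state, known) for known in known_states)


def choose_source(state, known_states, epsilon=0.1, rng=random):
    """Pick whose action to follow: 'teacher', 'explore', or 'exploit'."""
    p_teacher = risk(nearest_distance(state, known_states))
    if rng.random() < p_teacher:
        return "teacher"   # state looks dangerous: ask for advice
    if rng.random() < epsilon:
        return "explore"   # try a new action
    return "exploit"       # follow the policy learned so far


known = [(0.0, 0.0), (0.5, 0.5)]
print(choose_source((100.0, 0.0), known))  # far from every known state
```

Far from the known region the risk saturates near 1, so the sketch defers to the teacher almost always; close to known states it behaves like ordinary epsilon-greedy selection.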

    A taxonomy for similarity metrics between Markov decision processes

    Although the notion of task similarity is potentially interesting in a wide range of areas such as curriculum learning or automated planning, it has mostly been tied to transfer learning. Transfer is based on the idea of reusing the knowledge acquired while learning a set of source tasks in a new learning process in a target task, assuming that the target and source tasks are close enough. In recent years, transfer learning has succeeded in making reinforcement learning (RL) algorithms more efficient (e.g., by reducing the number of samples needed to achieve (near-)optimal performance). Transfer in RL is based on the core concept of similarity: whenever the tasks are similar, the transferred knowledge can be reused to solve the target task and significantly improve the learning performance. Therefore, the selection of good metrics to measure these similarities is a critical aspect when building transfer RL algorithms, especially when this knowledge is transferred from simulation to the real world. In the literature, there are many metrics to measure the similarity between MDPs; hence, many definitions of similarity, or its complement, distance, have been considered. In this paper, we propose a categorization of these metrics and analyze the definitions of similarity proposed so far, taking into account such categorization. We also follow this taxonomy to survey the existing literature, as well as suggesting future directions for the construction of new metrics. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work has also been supported by the Madrid Government (Comunidad de Madrid-Spain) under the Multiannual Agreement with UC3M in the line of Excellence of University Professors (EPUC3M17), and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation).

    On-line case-based policy learning for automated planning in probabilistic environments

    Many robotic control architectures perform a continuous cycle of sensing, reasoning and acting, where that reasoning can be carried out in a reactive or deliberative form. Reactive methods are fast and provide the robot with high interaction and response capabilities. Deliberative reasoning is particularly suitable in robotic systems because it employs some form of forward projection (reasoning in depth about goals, pre-conditions, resources and timing constraints) and provides the robot with reasonable responses in situations unforeseen by the designer. However, this reasoning, typically conducted using Artificial Intelligence techniques like Automated Planning (AP), is not effective for controlling autonomous agents which operate in complex and dynamic environments. Deliberative planning, although feasible in stable situations, takes too long in unexpected or changing situations which require re-planning. Therefore, planning cannot be done on-line in many complex robotic problems, where quick responses are frequently required. In this paper, we propose an alternative approach based on case-based policy learning which integrates deliberative reasoning through AP and reactive response time through reactive planning policies. The method is based on learning planning knowledge from actual experiences to obtain a case-based policy. The contribution of this paper is twofold. First, it is shown that the learned case-based policy produces reasonable and timely responses in complex environments. Second, it is also shown how one case-based policy that solves a particular problem can be reused to solve a similar but more complex problem in a transfer learning scope. This paper has been partially supported by the Spanish Ministerio de Economía y Competitividad TIN2015-65686-C5-1-R and the European Union's Horizon 2020 Research and Innovation programme under Grant Agreement No. 730086 (ERGO).

    Medial deviation of the first metatarsal in incipient hallux valgus deformity

    The intermetatarsal angle between metatarsals I and II (IMA 1-2) was studied radiographically in 49 normal feet and in 49 feet with a mild hallux valgus (HV) deformity. The aim of the study is to determine whether an excessive medial deviation of the first metatarsal with respect to II (IMA 1-2 above the normal values reported by some authors) is present in the initial phase of HV. The results demonstrate that the difference in the mean intermetatarsal angle between the two groups is statistically significant (8.76° in normal feet; 9.98° in affected feet). However, the authors think it is not clinically significant. Other authors, comparing the IMA 1-2 in patients with more advanced HV and without HV, report greater differences than those obtained in this study. The authors conclude that the excessive medial deviation of the first metatarsal is not a causal factor, but a consequence, in the HV deformity.

    Historical and administrative analysis of the United Fruit Company in the Magdalena banana zone, 1899-1966

    This work sets out the historical background to the origin, development, and crisis of the Banana Zone, an episode of vital importance in the history of Colombia. It takes as reference points the establishment of the first local and foreign agricultural organizations in Magdalena through to the organization of the United Fruit Company as an agricultural company that exported bananas and generated employment, and that came to influence the socio-economic life of the region. Along the same lines, the work studies the consequences brought about by the banana company, which found suitable conditions for settling in the zone in the regions of Río Frío, Aracataca, Sevilla, and Guacamayal, among others. It gives a theoretical account of how the banana process originated and took shape in this part of Colombian territory, and likewise addresses the social and economic impact of the foreign company on the Magdalena region and its responsibility for the region's boom and decline during the first six decades of the twentieth century. The theoretical references drawn on in this work come from an extensive bibliographic search in the library of the Universidad del Magdalena, Infotep, Ciénaga, Banco de la República, private libraries, Internet services, personal bibliographic compilations, and interviews with former United Fruit Company workers in Ciénaga, Magdalena. The work was undertaken because it meets the expectations of the authors, who identify with the field of study in which they are being trained, in the hope that it will deepen understanding of the economic history of the banana zone in the early and mid twentieth century and lead to a better understanding of our present through knowledge of past events.

    Challenges on the application of automated planning for comprehensive geriatric assessment using an autonomous social robot

    November 22-23, 2018, Madrid, Spain. Comprehensive Geriatric Assessment is a medical procedure to evaluate the physical, social and psychological status of elderly patients. One of its phases consists of administering different tests to the patient or relatives. In this paper we present the challenges of applying Automated Planning to control an autonomous robot that helps the clinician to perform such tests. On the one hand, the paper focuses on the modelling decisions taken, from an initial approach where each test was encoded using slightly different domains, to the final unified domain allowing any test to be represented. On the other hand, the paper deals with practical issues that arose when executing the plans. Preliminary tests performed with real users show that the proposed approach is able to seamlessly handle the patient-robot interaction in real time, recovering from unexpected events and adapting to the users' preferred input method, while being able to gather all the information needed by the clinician. This work has been partially funded by the European Union ECHORD++ project (FP7-ICT-601116) and the TIN2015-65686-C5 Spanish Ministerio de Economía y Competitividad project. Javier García is partially supported by the Comunidad de Madrid (Spain) funds under the project 2016-T2/TIC-1712.

    An Automated Planning Model for HRI: Use Cases on Social Assistive Robotics

    Using Automated Planning for the high-level control of robotic architectures is becoming very popular, mainly thanks to its capability to define the tasks to perform in a declarative way. However, classical planning tasks, even in their basic standard Planning Domain Definition Language (PDDL) format, are still very hard to formalize for non-expert engineers when the use case to model is complex. Human Robot Interaction (HRI) is one of those complex environments. This manuscript describes the rationale followed to design a planning model able to control social autonomous robots interacting with humans. It is the result of the authors’ experience in modeling use cases for Social Assistive Robotics (SAR) in two areas related to healthcare: Comprehensive Geriatric Assessment (CGA) and non-contact rehabilitation therapies for patients with physical impairments. In this work a general definition of these two use cases in a single planning domain is proposed, which favors the management and integration with the software robotic architecture, as well as the addition of new use cases. Results show that the model is able to capture all the relevant aspects of the Human-Robot interaction in those scenarios, allowing the robot to autonomously perform the tasks by using a standard planning-execution architecture. This work has been partially funded by the European Union ECHORD++ project (FP7-ICT-601116), and grants TIN2017-88476-C2-2-R and RTI2018-099522-B-C43 of FEDER/Ministerio de Ciencia e Innovación-Ministerio de Universidades-Agencia Estatal de Investigación. Javier García is partially supported by the Comunidad de Madrid funds under the project 2016-T2/TIC-1712.

    Agenesis of the corpus callosum in a newborn with Turner mosaicism

    Agenesis of the corpus callosum results from a failure in the development of the largest fiber bundle that connects the cerebral hemispheres. Patient outcome is influenced by etiology and associated central nervous system malformations. We describe a child with Turner syndrome (TS) mosaicism, with particular phenotype features and a complete agenesis of the corpus callosum. To our knowledge, this is the second case report of TS mosaicism associated with complete agenesis of the corpus callosum. Anatomical brain magnetic resonance imaging and diffusion tensor imaging were useful to confirm the complete absence of the corpus callosum, evaluate associated central nervous system malformations, visualize abnormal white matter tracts (Probst bundles) and assess the remaining commissures.

    Delineating the application of ultrasound in detecting synovial abnormalities of subtalar joint in juvenile idiopathic arthritis

    Objective: To investigate the frequency of ultrasound (US)–detectable involvement of the subtalar joint (STJ), to compare clinical versus US assessment of the STJ, and to compare different scanning approaches to the STJ in juvenile idiopathic arthritis (JIA). Methods: Clinical and US assessments were performed independently in 50 ankles with clinically active JIA. US abnormalities of the STJ were investigated using a lateral, medial, and posterior scanning approach and scored semiquantitatively. Agreement was tested using kappa statistics. A control group of 10 healthy subjects was examined. Results: Clinical and US evaluations detected synovitis in 24 of 50 (48.0%) and 27 of 50 (54.0%) of STJs, respectively. US detected synovitis in 10 of 26 STJs (38.5%) recorded as normal on clinical evaluation, but was negative in 7 of 24 STJs (29.2%) diagnosed as having involvement on clinical examination. Agreement between clinical and US assessments was fair (κ = 0.32). US abnormalities were more frequently detectable using the lateral scanning approach. All patients with US abnormalities in the medial and/or posterior side of the STJ also had US abnormalities on the lateral scanning approach, but the reverse was not true. Intra- and interobserver agreements for the lateral scanning approach were satisfactory for both detecting involvement and scoring US abnormalities. None of the 17 STJs of healthy controls showed US abnormalities. Conclusion: US may increase the precision of the evaluation of the STJ in JIA. The observed high frequency of STJ involvement on US suggests that this joint should be included in US scanning protocols devised for children with JIA. Synovitis is more frequently detected using the lateral scanning approach. © 2016, American College of Rheumatology
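
The reported agreement can be checked from the counts in the abstract above. The sketch below assumes the statistic is Cohen's kappa for two raters (the abstract says only "kappa statistics") and rebuilds the 2x2 table from the stated figures: 17 STJs positive on both assessments (24 clinical positives minus 7 US negatives), 7 positive on clinical examination only, 10 positive on US only, and 16 negative on both. The function and argument names are illustrative.

```python
def cohen_kappa(both_pos, only_a_pos, only_b_pos, both_neg):
    """Cohen's kappa for two raters making a binary call on the same items."""
    n = both_pos + only_a_pos + only_b_pos + both_neg
    observed = (both_pos + both_neg) / n          # raw agreement
    a_pos = both_pos + only_a_pos                 # positives for rater A
    b_pos = both_pos + only_b_pos                 # positives for rater B
    # Agreement expected by chance from the two raters' marginal rates.
    expected = (a_pos * b_pos + (n - a_pos) * (n - b_pos)) / n**2
    return (observed - expected) / (1 - expected)


# Rater A = clinical examination, rater B = ultrasound (counts from the abstract).
kappa = cohen_kappa(both_pos=17, only_a_pos=7, only_b_pos=10, both_neg=16)
print(round(kappa, 2))  # 0.32
```

Observed agreement is 33/50 = 0.66 and chance agreement is 0.4984, which gives the κ = 0.32 quoted in the results.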

    Schistosomiasis in Africa: Improving strategies for long-term and sustainable morbidity control

    Schistosomiasis affects over 200 million people worldwide [1] and accounts for an estimated 1.9 million disability-adjusted life years (DALYs) annually [2], with 90% of the burden currently concentrated in Africa. The last decade has witnessed an extraordinary surge of advocacy and funding for neglected tropical diseases (NTDs), including schistosomiasis. Large-scale schistosomiasis control is now implemented in 30 countries in Africa [1], funded primarily through support from the United States Agency for International Development (USAID) and the Department for International Development (DFID), private philanthropic funds from the END Fund and through GiveWell recommendations, and leveraging praziquantel donations from Merck KGaA. However, the number of people still requiring treatment remains daunting [1]. The aim of current public health strategies for schistosomiasis is to decrease morbidity through preventive chemotherapy (PC) (Fig 1) [3]. Periodic large-scale administration of the drug praziquantel focusing on the school-aged population and high-risk adults aims to reduce the prevalence and intensity of infection [4]