    Chance-Constrained Control with Lexicographic Deep Reinforcement Learning

    This paper proposes a lexicographic Deep Reinforcement Learning (DeepRL)-based approach to chance-constrained Markov Decision Processes, in which the controller seeks to ensure that the probability of satisfying the constraint is above a given threshold. Standard DeepRL approaches require i) the constraints to be included as additional weighted terms in the cost function, in a multi-objective fashion, and ii) the tuning of the introduced weights during the training phase of the Deep Neural Network (DNN) according to the probability thresholds. The proposed approach instead trains one constraint-free DNN and one DNN associated with each constraint separately and then, at each time-step, selects which DNN to use depending on the observed state of the system. The presented solution does not require any hyper-parameter tuning beyond the standard DNN ones, even if the probability thresholds change. A lexicographic version of the well-known DeepRL algorithm DQN is also proposed and validated via simulations.
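The per-step network selection described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's algorithm: the names `select_action`, `q_task` and `q_safe` are hypothetical, the task network is assumed to return values to maximize (the paper speaks of costs), and the constraint network is assumed to return an estimated satisfaction probability per action.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a trained network: state -> one value per action.
QFunction = Callable[[Sequence[float]], List[float]]


def select_action(
    state: Sequence[float],
    q_task: QFunction,
    q_safe: QFunction,
    threshold: float,
) -> Tuple[int, str]:
    """Pick an action by checking the constraint first, the task second.

    Assumed rule: follow the constraint-free network when its greedy action
    is estimated to satisfy the constraint with probability >= threshold;
    otherwise fall back to the constraint network's most reliable action.
    """
    task_values = q_task(state)
    safe_values = q_safe(state)

    # Greedy action of the constraint-free network.
    greedy = max(range(len(task_values)), key=lambda a: task_values[a])
    if safe_values[greedy] >= threshold:
        return greedy, "task"

    # Otherwise act with the network trained on the constraint alone.
    safest = max(range(len(safe_values)), key=lambda a: safe_values[a])
    return safest, "constraint"


# Toy stand-ins for the two trained networks (fixed outputs for any state).
def toy_q_task(state: Sequence[float]) -> List[float]:
    return [1.0, 5.0, 2.0]


def toy_q_safe(state: Sequence[float]) -> List[float]:
    return [0.99, 0.60, 0.95]
```

In this sketch the threshold is only an argument of the selection step, so changing it does not touch either network, which is consistent with the abstract's claim that no re-tuning is needed when the probability thresholds change.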

    Competitive function approximation for reinforcement learning

    The application of reinforcement learning to problems with continuous domains requires representing the value function by means of function approximation. We identify two aspects of reinforcement learning that make the function approximation process hard: non-stationarity of the target function and biased sampling. Non-stationarity is the result of the bootstrapping nature of dynamic programming, where the value function is estimated using its current approximation. Biased sampling occurs when some regions of the state space are visited too often, causing repeated updates with similar values that drown out the occasional updates of infrequently sampled regions. We propose a competitive approach to function approximation in which many different local approximators are available at a given input and the one expected to give the best approximation is selected by means of a relevance function. The local nature of the approximators allows them to adapt quickly to non-stationary changes and mitigates the biased sampling problem. The coexistence of multiple approximators, updated and tried in parallel, yields a good estimate much faster than would be possible with a single approximator. Experiments on different benchmark problems show that the competitive strategy provides faster and more stable learning than non-competitive approaches. Preprint.
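The idea of competing local approximators can be sketched as follows. This is an illustration under stated assumptions rather than the authors' method: each approximator is a constant model over a 1-D interval, and the relevance function is assumed to be "lowest running squared error wins". The names `LocalApproximator`, `predict` and `update_all` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LocalApproximator:
    """Constant model valid on the interval [center - width, center + width]."""
    center: float
    width: float
    value: float = 0.0
    error: float = 1.0  # running squared-error estimate, pessimistic at start

    def covers(self, x: float) -> bool:
        return abs(x - self.center) <= self.width

    def update(self, target: float, lr: float = 0.5) -> None:
        # Track the error made before the update, then move towards the target.
        self.error += lr * ((target - self.value) ** 2 - self.error)
        self.value += lr * (target - self.value)


def predict(approximators: List[LocalApproximator], x: float) -> float:
    """Answer with the covering approximator that has the lowest error estimate."""
    candidates = [a for a in approximators if a.covers(x)]
    if not candidates:
        raise ValueError(f"no approximator covers x={x}")
    return min(candidates, key=lambda a: a.error).value


def update_all(approximators: List[LocalApproximator], x: float, target: float) -> None:
    """Update every approximator covering x, so all competitors learn in parallel."""
    for a in approximators:
        if a.covers(x):
            a.update(target)


# One broad and one narrow approximator competing around x = 0.2.
models = [LocalApproximator(center=0.5, width=0.5),
          LocalApproximator(center=0.2, width=0.1)]
for _ in range(20):
    update_all(models, 0.2, 1.0)  # region where the target is 1
    update_all(models, 0.8, 0.0)  # region where the target is 0
```

In this toy run the broad approximator receives conflicting targets and keeps a high error estimate, so queries near 0.2 are answered by the narrow one, while queries near 0.8 can only be answered by the broad one.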

    Black-Box Data-efficient Policy Search for Robotics

    The most data-efficient algorithms for reinforcement learning (RL) in robotics are based on uncertain dynamical models: after each episode, they first learn a dynamical model of the robot, then they use an optimization algorithm to find a policy that maximizes the expected return given the model and its uncertainties. It is often believed that this optimization is tractable only if analytical, gradient-based algorithms are used; however, these algorithms require specific families of reward functions and policies, which greatly limits the flexibility of the overall approach. In this paper, we introduce a novel model-based RL algorithm, called Black-DROPS (Black-box Data-efficient RObot Policy Search), that: (1) does not impose any constraint on the reward function or the policy (they are treated as black boxes), (2) is as data-efficient as the state-of-the-art algorithm for data-efficient RL in robotics, and (3) is as fast as (or faster than) analytical approaches when several cores are available. The key idea is to replace the gradient-based optimization algorithm with a parallel, black-box algorithm that takes the model uncertainties into account. We demonstrate the performance of our new algorithm on two standard control benchmark problems (in simulation) and a low-cost robotic manipulator (with a real robot).
    Comment: Accepted at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2017; Code at http://github.com/resibots/blackdrops; Video at http://youtu.be/kTEyYiIFGP
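The key idea of this abstract, searching over policy parameters with a black-box optimizer that only sees returns sampled from an uncertain model, can be illustrated with a toy sketch. This is not Black-DROPS: the optimizer is a simple hill-climbing random search swapped in for illustration, the "model" is a hand-written 1-D system with sampled noise rather than a learned one, and all names (`rollout_return`, `expected_return`, `black_box_search`) are hypothetical.

```python
import random
from typing import List, Tuple


def rollout_return(theta: float, rng: random.Random, horizon: int = 20) -> float:
    """Return of the linear policy u = -theta * x on a noisy toy 1-D model."""
    x, total = 1.0, 0.0
    for _ in range(horizon):
        u = -theta * x
        # Model uncertainty is represented by sampling noise on the next state.
        x = x + 0.5 * u + rng.gauss(0.0, 0.01)
        total += -(x * x)  # reward is only ever evaluated, never differentiated
    return total


def expected_return(theta: float, rng: random.Random, n: int = 10) -> float:
    """Monte Carlo estimate of the expected return under the model's noise."""
    return sum(rollout_return(theta, rng) for _ in range(n)) / n


def black_box_search(iterations: int = 200, seed: int = 0) -> Tuple[float, float, List[float]]:
    """Hill-climbing random search over the single policy parameter theta."""
    rng = random.Random(seed)
    best_theta = 0.0
    best_score = expected_return(best_theta, rng)
    history = [best_score]
    for _ in range(iterations):
        candidate = best_theta + rng.gauss(0.0, 0.3)
        score = expected_return(candidate, rng)
        if score > best_score:  # keep the candidate only if it scores better
            best_theta, best_score = candidate, score
        history.append(best_score)
    return best_theta, best_score, history


best_theta, best_score, history = black_box_search()
```

Because the search uses only sampled returns, the reward and policy could be swapped for arbitrary functions without changing the optimizer, and the candidate evaluations are independent of each other, which is what makes this kind of search easy to parallelize.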

    SSA-ME Detection of cancer driver genes using mutual exclusivity by small subnetwork analysis

    Because of its clonal evolution, a tumor rarely contains multiple genomic alterations in the same pathway, as disrupting the pathway through one gene is often sufficient to confer the complete fitness advantage. As a result, many cancer driver genes display mutual exclusivity across tumors. However, searching for mutually exclusive gene sets requires analyzing all possible combinations of genes, leading to a problem that is typically too computationally complex to be solved without stringent a priori filtering that restricts the mutations included in the analysis. To overcome this problem, we present SSA-ME, a network-based method to detect cancer driver genes based on independently scoring small subnetworks for mutual exclusivity using a reinforced learning approach. Because of the algorithmic efficiency, no stringent upfront filtering is required. Analysis of TCGA cancer datasets illustrates the added value of SSA-ME: both well-known recurrently mutated drivers and rarely mutated drivers are prioritized. We show that using mutual exclusivity to detect cancer driver genes is complementary to state-of-the-art approaches. This framework, in which a large number of small subnetworks are analyzed in order to solve a computationally complex problem (SSA), can be generically applied to any problem in which local neighborhoods in a network hold useful information.
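To make the notion of mutual exclusivity concrete, the sketch below scores a small gene set as the fraction of affected tumors that carry a mutation in exactly one gene of the set. This simple ratio is an assumption chosen for illustration, not the scoring function used by SSA-ME, and the gene and tumor identifiers are hypothetical.

```python
from typing import Dict, Iterable, Set


def mutual_exclusivity(mutations: Dict[str, Set[str]], genes: Iterable[str]) -> float:
    """Fraction of affected tumors mutated in exactly one gene of the set.

    `mutations` maps a gene name to the set of tumor IDs in which it is mutated.
    A score of 1.0 means no tumor carries mutations in two genes of the set.
    """
    hits_per_tumor: Dict[str, int] = {}
    for gene in genes:
        for tumor in mutations.get(gene, set()):
            hits_per_tumor[tumor] = hits_per_tumor.get(tumor, 0) + 1
    if not hits_per_tumor:
        return 0.0  # no tumor is affected by this gene set
    exclusive = sum(1 for hits in hits_per_tumor.values() if hits == 1)
    return exclusive / len(hits_per_tumor)


# Hypothetical toy data: three genes observed across three tumors.
toy_mutations = {
    "GENE_A": {"t1", "t2"},
    "GENE_B": {"t3"},
    "GENE_C": {"t1", "t3"},
}
```

On the toy data, `GENE_A` and `GENE_B` never co-occur and score 1.0, whereas `GENE_A` and `GENE_C` share tumor `t1` and score lower. Scoring small subnetworks one at a time in this way avoids enumerating all gene combinations, which is the computational point the abstract makes.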

    Factors that affect motivation towards English language acquisition in seventh grade students of a public elementary school in Parral

    Thesis (Master's in Teaching English as a Foreign Language). This research presents the identification and analysis of the factors that characterize motivation towards the acquisition of English as a foreign language among seventh-grade students at a public elementary school in Parral, in the seventh region (Maule) of Chile. To investigate the factors that influence students' motivation, a mixed-methods study was carried out: the data were collected and analysed qualitatively, then organized and presented quantitatively in graphs. The information was gathered with two previously validated instruments, which consisted of a questionnaire for the teachers of the class's different subjects and for the psychosocial team that works with the students. A personal interview was conducted with each student. Two major conclusions emerged from the analysis of the collected data: first, students lack motivation towards English as a Foreign Language as a result of the sociocultural environment in which they are immersed; second, learners are exposed to language learning only from puberty rather than from their earliest learning stages, as the Critical Period Hypothesis (CPH) postulates.