
    Multiple landmark detection using multi-agent reinforcement learning

    The detection of anatomical landmarks is a vital step in medical image analysis and its applications to diagnosis, interpretation, and guidance. Manual annotation of landmarks is a tedious process that requires domain-specific expertise and introduces inter-observer variability. This paper proposes a new detection approach for multiple landmarks based on multi-agent reinforcement learning. Our hypothesis is that the positions of anatomical landmarks are interdependent and non-random within the human anatomy, so finding one landmark can help to deduce the locations of others. Using a Deep Q-Network (DQN) architecture, we construct an environment and agents with implicit inter-communication that accommodate K agents acting and learning simultaneously while they attempt to detect K different landmarks. During training, the agents collaborate by sharing their accumulated knowledge for a collective gain. We compare our approach with state-of-the-art architectures and achieve significantly better accuracy, reducing the detection error by 50% while requiring fewer computational resources and less training time than the naïve approach of training K agents separately. Code and visualizations are available at https://github.com/thanosvlo/MARL-for-Anatomical-Landmark-Detectio
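    Below is a minimal PyTorch sketch of the weight-sharing idea this abstract describes: K agents share one convolutional encoder while each keeps its own Q-value head, so gradients from every agent update the same features. It is illustrative only; the layer sizes, 3D patch size, and six-action move set are assumptions, not the authors' exact design.

import torch
import torch.nn as nn

class SharedMultiAgentDQN(nn.Module):
    """K landmark-seeking agents with a shared conv encoder and
    per-agent Q-heads. Weight sharing in the encoder acts as the
    implicit communication channel. (Hypothetical sizes throughout.)"""

    def __init__(self, k_agents: int, n_actions: int = 6, patch: int = 25):
        super().__init__()
        # Shared 3D feature extractor over each agent's local image patch.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        with torch.no_grad():  # infer the flattened feature size
            feat_dim = self.encoder(torch.zeros(1, 1, patch, patch, patch)).shape[1]
        # One independent Q-head per agent/landmark.
        self.q_heads = nn.ModuleList(
            nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                          nn.Linear(256, n_actions))
            for _ in range(k_agents)
        )

    def forward(self, patches: torch.Tensor) -> torch.Tensor:
        # patches: (K, 1, D, H, W), one local observation per agent;
        # actions are e.g. +/- unit steps along each image axis.
        feats = self.encoder(patches)                          # (K, F)
        q = [head(f.unsqueeze(0)) for head, f in zip(self.q_heads, feats)]
        return torch.cat(q, dim=0)                             # (K, n_actions)

    Sharing the encoder is also a plausible reason the joint setup is cheaper than K fully separate DQNs: the bulk of the parameters is learned once.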

    Collaborative Deep Reinforcement Learning for Joint Object Search

    We examine the problem of joint top-down active search of multiple objects under interaction, e.g., a person riding a bicycle or cups held by the table. Such interacting objects can often provide contextual cues to each other that make the search more efficient. By treating each detector as an agent, we present the first collaborative multi-agent deep reinforcement learning algorithm to learn the optimal policy for joint active object localization, which effectively exploits this beneficial contextual information. We learn inter-agent communication through gated cross connections between the Q-networks, facilitated by a novel multi-agent deep Q-learning algorithm with joint exploitation sampling. We verify the proposed method on multiple object detection benchmarks. Not only does our model improve the performance of state-of-the-art active localization models, it also reveals interesting co-detection patterns that are intuitively interpretable.
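    A hedged sketch of the gated cross-connection idea: each agent's hidden features are augmented with a sigmoid-gated projection of its partner's features before Q-values are computed. The layer sizes, gating form, and names are illustrative assumptions; the paper's exact architecture and its joint exploitation sampling are not reproduced here.

import torch
import torch.nn as nn

class GatedCrossQNetworks(nn.Module):
    """Two Q-networks whose hidden layers exchange sigmoid-gated
    messages (illustrative; not the paper's exact code)."""

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # Cross connections: project the partner's features...
        self.msg_2to1 = nn.Linear(hidden, hidden)
        self.msg_1to2 = nn.Linear(hidden, hidden)
        # ...and learn gates that decide how much of them to let through.
        self.gate1 = nn.Linear(2 * hidden, hidden)
        self.gate2 = nn.Linear(2 * hidden, hidden)
        self.q1 = nn.Linear(hidden, n_actions)
        self.q2 = nn.Linear(hidden, n_actions)

    def forward(self, obs1: torch.Tensor, obs2: torch.Tensor):
        h1, h2 = self.enc1(obs1), self.enc2(obs2)
        g1 = torch.sigmoid(self.gate1(torch.cat([h1, h2], dim=-1)))
        g2 = torch.sigmoid(self.gate2(torch.cat([h2, h1], dim=-1)))
        # Each agent fuses its own features with a gated message from
        # the other agent -- the contextual cue described in the abstract.
        m_to1 = g1 * self.msg_2to1(h2)
        m_to2 = g2 * self.msg_1to2(h1)
        return self.q1(torch.relu(h1 + m_to1)), self.q2(torch.relu(h2 + m_to2))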

    Multi-environment lifelong deep reinforcement learning for medical imaging

    Deep reinforcement learning (DRL) is increasingly being explored in medical imaging. However, the environments for medical imaging tasks are constantly evolving in terms of imaging orientations, imaging sequences, and pathologies. To that end, we developed SERIL, a lifelong DRL framework that continually learns new tasks in changing imaging environments without catastrophic forgetting. SERIL uses a selective-experience-replay-based lifelong learning technique to localize five anatomical landmarks in brain MRI across a sequence of twenty-four different imaging environments. Compared to two baseline setups, MERT (multi-environment, best case) and SERT (single-environment, worst case), SERIL performed excellently, with an average distance of 9.90 ± 7.35 pixels from the desired landmark across all 120 tasks, versus 10.29 ± 9.07 for MERT and 36.37 ± 22.41 for SERT (p < 0.05), demonstrating strong potential for continuously learning multiple tasks across dynamically changing imaging environments.
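    The sketch below illustrates the general shape of selective experience replay for lifelong learning: keep a small, permanent reservoir of transitions from each past environment and mix them into every training batch. This is not SERIL itself; the uniform selection rule, buffer sizes, and replay fraction are all assumptions.

import random
from collections import deque

class SelectiveReplay:
    """Lifelong-learning replay buffer (illustrative, not SERIL itself):
    a large buffer for the current environment plus a small, permanent
    reservoir of transitions preserved from every past environment."""

    def __init__(self, current_cap: int = 50_000, keep_per_env: int = 1_000):
        self.current = deque(maxlen=current_cap)  # current-task transitions
        self.reservoir = []                       # preserved past experience
        self.keep_per_env = keep_per_env

    def add(self, transition) -> None:
        self.current.append(transition)

    def end_environment(self) -> None:
        # Selection rule is an assumption: uniform sampling keeps a
        # representative subset; SERIL's actual criterion may differ.
        k = min(self.keep_per_env, len(self.current))
        self.reservoir.extend(random.sample(list(self.current), k))
        self.current.clear()

    def sample(self, batch_size: int, replay_fraction: float = 0.5) -> list:
        # Mix current-task and preserved transitions in every batch to
        # counter catastrophic forgetting.
        n_old = min(int(batch_size * replay_fraction), len(self.reservoir))
        batch = random.sample(self.reservoir, n_old)
        n_new = min(batch_size - n_old, len(self.current))
        return batch + random.sample(list(self.current), n_new)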

    Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction

    This paper introduces a novel neural network-based reinforcement learning approach for robot gaze control. Our approach enables a robot to learn and adapt its gaze control strategy for human-robot interaction without external sensors or human supervision. The robot learns to focus its attention on groups of people from its own audio-visual experiences, independently of the number of people, their positions, and their physical appearances. In particular, we use a recurrent neural network architecture in combination with Q-learning to find an optimal action-selection policy; we pre-train the network using a simulated environment that mimics realistic scenarios involving speaking and silent participants, thus avoiding the need for tedious sessions of a robot interacting with people. Our experimental evaluation suggests that the proposed method is robust with respect to parameter estimation, i.e., the parameter values yielded by the method do not have a decisive impact on performance. The best results are obtained when audio and visual information are used jointly. Experiments with the Nao robot indicate that our framework is a step towards the autonomous learning of socially acceptable gaze behavior.
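    A minimal sketch of a recurrent Q-network of the kind this abstract describes: per-time-step audio and visual features are fused and passed through an LSTM, and a linear head outputs Q-values over discrete gaze actions. The feature dimensions, fusion scheme, and action set are assumptions, not the paper's specification.

import torch
import torch.nn as nn

class RecurrentGazeQNet(nn.Module):
    """LSTM-based Q-network over fused audio-visual features
    (a sketch; dimensions and action set are assumptions)."""

    def __init__(self, audio_dim: int = 20, visual_dim: int = 40,
                 hidden: int = 64, n_actions: int = 5):
        super().__init__()
        # Per-time-step audio (e.g., speaker-direction cues) and visual
        # (e.g., detected faces) features are concatenated and fused.
        self.fuse = nn.Linear(audio_dim + visual_dim, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.q = nn.Linear(hidden, n_actions)  # e.g., pan/tilt moves + stay

    def forward(self, audio, visual, state=None):
        # audio: (B, T, audio_dim), visual: (B, T, visual_dim)
        x = torch.relu(self.fuse(torch.cat([audio, visual], dim=-1)))
        out, state = self.lstm(x, state)       # recurrence carries context
        return self.q(out), state              # Q-values per time step

    The recurrence is what lets the policy depend on who has been speaking recently rather than on a single frame, which is why a plain feed-forward DQN would be a poor fit here.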

    Curriculum deep reinforcement learning with different exploration strategies: a feasibility study on cardiac landmark detection

    Transcatheter aortic valve implantation (TAVI) is associated with conduction abnormalities; the mechanical interaction between the prosthesis and the atrioventricular (AV) conduction path causes these life-threatening arrhythmias. Pre-operative assessment of the location of the AV conduction path can help in understanding the risk of post-TAVI conduction abnormalities. As the AV conduction path is not visible on cardiac CT, the inferior border of the membranous septum can be used as an anatomical landmark. Detecting this border automatically, accurately, and efficiently would save operator time and thus benefit pre-operative planning. This preliminary study was performed to assess the feasibility of 3D landmark detection in cardiac CT images with curriculum deep Q-learning. Curriculum learning was used to gradually teach an artificial agent to detect this anatomical landmark in cardiac CT. The agent was equipped with a small field of view and burdened with a large action space. Moreover, we introduced two novel action-selection strategies, α-decay and action-dropout, compared them to the established ε-decay strategy, and observed that α-decay yielded the most accurate results. Limited computational resources were used to ensure reproducibility. To maximize the amount of patient data, the method was cross-validated with k-folding for all three action-selection strategies. An inter-operator variability study was conducted to assess the accuracy of the method.
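    The following sketch contrasts classic ε-decay with one plausible reading of action-dropout: randomly mask a subset of actions before the greedy argmax so the agent occasionally commits to a lower-ranked action. The action-dropout definition is an assumption (the abstract does not define it), and α-decay is omitted because its form is not given here.

import numpy as np

def epsilon_decay_action(q_values, step, eps_start=1.0, eps_end=0.05, decay=1e-4):
    """Classic epsilon-decay: explore uniformly with probability eps,
    which shrinks exponentially over training steps."""
    eps = eps_end + (eps_start - eps_end) * np.exp(-decay * step)
    if np.random.rand() < eps:
        return int(np.random.randint(len(q_values)))
    return int(np.argmax(q_values))

def action_dropout_action(q_values, drop_prob=0.3):
    """One plausible reading of action-dropout (an assumption, not the
    paper's definition): randomly mask actions before the greedy argmax,
    so the agent sometimes commits to a lower-ranked action."""
    q = np.asarray(q_values, dtype=float).copy()
    mask = np.random.rand(q.size) < drop_prob
    if mask.all():                  # never mask every action
        mask[np.random.randint(q.size)] = False
    q[mask] = -np.inf
    return int(np.argmax(q))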