417 research outputs found

    Optimization of Visual Information Presentation for Visual Prosthesis

    Machine Learning Methods for Image Analysis in Medical Applications, from Alzheimer's Disease, Brain Tumors, to Assisted Living

    Healthcare has progressed greatly owing to technological advances, and machine learning plays an important role in processing and analyzing large amounts of medical data. This thesis investigates four healthcare-related problems (Alzheimer's disease detection, glioma classification, human fall detection, and obstacle avoidance in prosthetic vision) whose underlying methodologies draw on machine learning and computer vision. For Alzheimer's disease (AD) diagnosis, Magnetic Resonance Images (MRIs) play an important role alongside patient symptoms. Inspired by the success of deep learning, a new multi-stream multi-scale Convolutional Neural Network (CNN) architecture is proposed for AD detection from MRIs, where AD features are characterized at both the tissue level and the scale level for improved feature learning. AD/NC (normal control) classification reaches a test accuracy of 94.74%. In glioma subtype classification, biopsies are usually needed to determine the molecular-based glioma subtype. We investigate non-invasive glioma subtype prediction from MRIs using deep learning. A 2D multi-stream CNN architecture learns glioma features from multi-modal MRIs, and the training dataset is enlarged with synthetic brain MRIs generated by pairwise Generative Adversarial Networks (GANs). A test accuracy of 88.82% is achieved for prediction of IDH mutation (a molecular-based subtype). A new deep semi-supervised learning method is also proposed to handle missing molecular-related labels in training datasets and so improve glioma classification performance. In the two other applications, we address video-based human fall detection using co-saliency-enhanced Recurrent Convolutional Networks (RCNs), and obstacle avoidance in prosthetic vision by characterizing obstacle-related video features with a Spiking Neural Network (SNN). These investigations can benefit future research, where artificial intelligence and deep learning may open new avenues for real medical applications.

    Egocentric Computer Vision and Machine Learning for Simulated Prosthetic Vision

    Current visual prostheses can provide visual perception to people with certain forms of blindness. Bypassing the damaged part of the visual pathway, electrical stimulation of the retina or the nervous system elicits point-like percepts known as "phosphenes". Owing to physiological and technological limitations, the information patients receive has very low resolution and a reduced field of view and dynamic range, which seriously impairs their ability to recognize and navigate unfamiliar environments. In this context, incorporating new computer vision techniques is a key, active, and open topic. In this thesis we focus on developing techniques to enhance the visual information received by the implanted patient, and we propose several simulated prosthetic vision systems for experimentation. First, we combine the output of two convolutional neural networks to detect informative structural edges and object silhouettes. We show how different scenes and objects can be recognized quickly even under the restricted conditions of prosthetic vision. Our method is well suited to indoor scene understanding compared with the traditional image processing methods used in visual prostheses. Second, we present a new virtual reality system for more realistic simulated prosthetic vision environments using panoramic scenes, which lets us systematically study object search and recognition performance. Panoramic scenes allow subjects to feel immersed by perceiving the complete (360-degree) scene. In the third contribution, we show how an augmented reality navigation system for prosthetic vision improves navigation performance by reducing the time and distance needed to reach targets, and even significantly reducing the number of collisions with obstacles. Using a path-planning algorithm, the system guides the subject along a shorter, obstacle-free route. This work is currently under review. In the fourth contribution, we evaluate visual acuity by measuring the influence of field of view relative to spatial resolution in visual prostheses using a head-mounted display. To this end, we use simulated prosthetic vision in a virtual reality environment to simulate the real-life experience of using a retinal prosthesis. This work is currently under review. Finally, we propose a Spiking Neural Network (SNN) model that relies on biologically plausible mechanisms and uses an unsupervised learning scheme to obtain better computational algorithms and improve the performance of current visual prostheses. The proposed SNN model can use the downsampled signal from the information processing unit of retinal prostheses without going through retinal image analysis, providing useful information to blind users. This work is currently in preparation.
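
The low resolution and reduced dynamic range described in this abstract can be illustrated with a toy simulation. The sketch below is not the thesis's simulator: the function name `simulate_phosphenes`, the block-averaging scheme, and the uniform gray-level quantization are all assumptions chosen for illustration. It maps a grayscale image onto a coarse grid of "phosphenes" with only a few brightness levels.

```python
def simulate_phosphenes(image, grid_rows, grid_cols, levels):
    """Reduce a grayscale image (rows of 0..255 values) to a coarse phosphene grid.

    Hypothetical illustration: each phosphene takes the mean brightness of its
    image block, then is quantized to `levels` evenly spaced gray levels.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    height, width = len(image), len(image[0])
    block_h, block_w = height // grid_rows, width // grid_cols
    step = 255 / (levels - 1)  # spacing between allowed gray levels

    grid = []
    for r in range(grid_rows):
        row = []
        for c in range(grid_cols):
            # Collect every pixel that falls inside this phosphene's block.
            block = [
                image[y][x]
                for y in range(r * block_h, (r + 1) * block_h)
                for x in range(c * block_w, (c + 1) * block_w)
            ]
            mean = sum(block) / len(block)
            # Snap the mean brightness to the nearest allowed gray level.
            row.append(round(round(mean / step) * step))
        grid.append(row)
    return grid
```

Calling `simulate_phosphenes(image, 2, 2, 2)` on a 4x4 image, for example, yields a 2x2 grid in which each phosphene is either fully off or fully on, which gives a feel for how much scene detail is lost.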

    Road Triangle Detection for Non-Road Area Elimination Using Lane Detection and Image Multiplication

    Background clutter is a key obstacle to accurate object detection in image processing algorithms. This paper focuses on intelligent transport systems (ITS), in which background elements such as trees, road dividers, and buildings interfere with the detection algorithm. It presents an algorithm that removes the unwanted background outside the road area boundaries in dynamic video footage. Because an onboard camera captures the road traffic, the background moves together with the foreground; therefore, a region of interest that covers only the road region needs to be established. The algorithm consists of three main components: lane detection, vanishing point detection, and image multiplication. These components draw on the Hough transform, line intersection, image masking, and image multiplication, which are combined to create the background subtraction system. Test results under various road conditions show a good detection rate and background removal.
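
The pipeline in this abstract (line intersection for the vanishing point, a triangular road mask, then image multiplication) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: the two lane lines are assumed to be already detected and given as point pairs, and the helper names `line_intersection`, `triangle_mask`, and `apply_mask` are hypothetical.

```python
def line_intersection(p1, p2, p3, p4):
    """Intersect the line through p1, p2 with the line through p3, p4."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, p3, p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if denom == 0:
        return None  # parallel lines: no vanishing point
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / denom,
            (a * (y3 - y4) - (y1 - y2) * b) / denom)


def triangle_mask(width, height, a, b, c):
    """Binary mask: 1 inside (or on the border of) triangle a-b-c, else 0."""
    def side(p, q, r):
        # Sign tells which side of the directed edge p->q the point r lies on.
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            d = [side(a, b, (x, y)), side(b, c, (x, y)), side(c, a, (x, y))]
            mixed = any(v < 0 for v in d) and any(v > 0 for v in d)
            row.append(0 if mixed else 1)
        mask.append(row)
    return mask


def apply_mask(image, mask):
    """Image multiplication: zero out every pixel outside the mask."""
    return [[pix * m for pix, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]
```

With a left lane through (0, 8) and (2, 4) and a right lane through (8, 8) and (6, 4), the intersection falls at (4, 0); the triangle formed by that point and the two lane bases then keeps only the road region when multiplied into the frame.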

    A new approach to highway lane detection by using hough transform technique

    This paper presents the development of a road lane detection algorithm using image processing techniques. The algorithm is developed for dynamic videos recorded with on-board cameras installed in vehicles under Malaysian highway conditions. The recorded videos are dynamic scenes in which both background and foreground move, so detecting objects on the road area, such as vehicles and road signs, is made more challenging by interference from background elements such as buildings, trees, and road dividers. The algorithm therefore detects the road lanes to support three operations: vanishing point detection, road width measurement, and definition of the Region of Interest (ROI) of the road area. The techniques used are image enhancement and edge extraction with a Sobel filter, with the Hough transform as the main technique for lane detection. The algorithm is tested and validated on three videos of highway scenes in Malaysia (normal weather, rain, and night-time), plus an additional scene of a sunny rural road. The videos run at 30 fps at 720p HD resolution (1280x720 pixels). The test results show an average true positive (TP) lane detection rate of 0.925, and the authors report that the algorithm is capable of being used in the final application.
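
The two core steps named in this abstract, Sobel edge extraction followed by Hough-transform voting, can be sketched in pure Python. This is a toy version under stated assumptions, not the paper's code: the function names `sobel_edges`, `hough_votes`, and `hough_peak`, the threshold value, and the one-degree angular resolution are all illustrative choices.

```python
import math


def sobel_edges(img, threshold):
    """Return (x, y) pixels whose Sobel gradient magnitude exceeds threshold."""
    height, width = len(img), len(img[0])
    points = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if math.hypot(gx, gy) > threshold:
                points.append((x, y))
    return points


def hough_votes(points, n_theta=180):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta).

    Keys are (rho, theta_index); theta is the angle of the line's normal,
    so theta_index 0 corresponds to a vertical line x = rho.
    """
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    return votes


def hough_peak(votes):
    """Return ((rho, theta_index), count) for the most-voted line."""
    return max(votes.items(), key=lambda kv: kv[1])
```

On a synthetic frame with a vertical brightness step, the edge pixels line up in columns next to the step, and the accumulator cells for those vertical lines collect one vote per edge pixel. A real lane detector would additionally restrict the angle range and keep several peaks, one per lane marking.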

    Cross-layer Optimized Wireless Video Surveillance

    A wireless video surveillance system contains three major components, the video capture and preprocessing, the video compression and transmission over wireless sensor networks (WSNs), and the video analysis at the receiving end. The coordination of different components is important for improving the end-to-end video quality, especially under the communication resource constraint. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters, according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion. In the second project, the binocular PTU camera is adopted for video object tracking. The presented work studied the fast disparity estimation algorithm and the 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project is applied for multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. 
The presented cross-layer design incorporates delay-sensitive, content-aware video coding and transmission, together with adaptive video coding and transmission, to ensure optimal and balanced quality across the multi-view videos. In these projects, an interdisciplinary study is conducted to integrate the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to inform future work. Adviser: Song C

    Electronic systems for the restoration of the sense of touch in upper limb prosthetics

    In the last few years, research on active upper-limb prosthetics has focused on improving functionality and control. New methods have been proposed for measuring the user's muscle activity and translating it into prosthesis control commands. Developing the feed-forward interface so that the prosthesis better follows the user's intention is an important step towards improving the quality of life of people with limb amputation. However, prosthesis users can neither feel whether something or someone is touching the prosthesis nor perceive the temperature or roughness of objects. Looking at an object helps, but without sight they cannot detect anything; vision gives them most of their information. Therefore, to foster prosthesis embodiment and utility, a prosthetic system must not only respond to the control signals provided by the user but also transmit information about the current state of the prosthesis back to the user. This thesis presents an electronic skin system that closes the loop in prostheses, towards restoring the sense of touch in prosthesis users. The proposed system includes advanced distributed sensing (the electronic skin); a system for (i) signal conditioning, (ii) data acquisition, and (iii) data processing; and a stimulation system. The idea is to integrate all these components into a myoelectric prosthesis. Embedding the electronic system and the sensing materials is a critical issue in the development of new prostheses. In particular, processing the data originating from the electronic skin into low- or high-level information is the key issue to be addressed by the embedded electronic system. Recently, Machine Learning has proved to be a promising approach to processing tactile sensor information. Many studies have shown the effectiveness of Machine Learning in classifying input touch modalities. More specifically, this thesis focuses on the stimulation system, which communicates mechanical interactions from the electronic skin to prosthesis users, and on the dedicated implementation of algorithms for processing tactile data originating from the electronic skin. At the system level, the thesis provides the design of the experimental setup, the experimental protocol, and the algorithms to process tactile data. At the architectural level, the thesis proposes a design flow for the implementation of digital circuits for both FPGAs and integrated circuits, and techniques for the power management of embedded systems running Machine Learning algorithms.