
    Region homogeneity in the Logarithmic Image Processing framework: application to region growing algorithms

    In order to create an image segmentation method robust to lighting changes, two novel homogeneity criteria for an image region were studied. Both are defined within the Logarithmic Image Processing (LIP) framework, whose laws model lighting changes. The first criterion estimates the LIP-additive homogeneity and is based on the LIP-additive law; it is theoretically insensitive to lighting changes caused by variations of the camera exposure time or source intensity. The second, the LIP-multiplicative homogeneity criterion, is based on the LIP-multiplicative law and is insensitive to changes due to variations of the object thickness or opacity. Each criterion is then applied in Revol and Jourlin's (1997) region growing method, which is based on the homogeneity of an image region; the region growing method therefore becomes robust to the lighting changes specific to each criterion. Experiments on simulated and real images presenting lighting variations confirm the robustness of the criteria to these variations. Compared to a state-of-the-art method based on the image component-tree, ours is more robust. These results open the way to numerous applications where the lighting is uncontrolled or partially controlled.
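    As a point of reference, the two LIP laws on which the criteria rest can be sketched in a few lines of Python. The formulas below are the standard LIP addition and scalar multiplication over a gray-tone range bounded by M = 256; the example values are hypothetical and the sketch is not the authors' implementation.

```python
import numpy as np

M = 256.0  # upper bound of the gray-tone range in the LIP framework

def lip_add(f, g):
    """LIP-additive law: models superimposing two semi-transparent layers,
    or equivalently a change of camera exposure time / source intensity."""
    return f + g - (f * g) / M

def lip_mul(lam, f):
    """LIP-multiplicative law: models a change of object thickness or
    opacity by a factor lam."""
    return M - M * (1.0 - f / M) ** lam

# Hypothetical example: shifting and thickening a small image region
region = np.array([[10.0, 50.0], [120.0, 200.0]])
shifted = lip_add(region, 30.0)     # LIP-additive change (exposure-like)
thickened = lip_mul(1.5, region)    # LIP-multiplicative change (opacity-like)
```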

    Graph Convolutional Neural Networks based on Quantum Vertex Saliency

    This paper proposes a new Quantum Spatial Graph Convolutional Neural Network (QSGCNN) model that can directly learn a classification function for graphs of arbitrary sizes. Unlike state-of-the-art Graph Convolutional Neural Network (GCNN) models, the proposed QSGCNN model incorporates the process of identifying transitively aligned vertices between graphs and transforms arbitrarily sized graphs into fixed-sized aligned vertex grid structures. To learn representative graph characteristics, a new quantum spatial graph convolution is proposed and employed to extract multi-scale vertex features in terms of quantum information propagation between the grid vertices of each graph. Since the quantum spatial convolution preserves the grid structure of the input vertices (i.e., the convolution layer does not change the original spatial sequence of vertices), the proposed QSGCNN model allows the traditional convolutional neural network architecture to be employed directly to further learn from the global graph topology, providing an end-to-end deep learning architecture that integrates graph representation and learning in the quantum spatial graph convolution layer and the traditional convolutional layer for graph classification. The proposed QSGCNN model addresses the shortcomings of information loss and imprecise information representation that arise in existing GCNN models from the use of SortPooling or SumPooling layers. Experiments on benchmark graph classification datasets demonstrate the effectiveness of the proposed QSGCNN model in relation to existing state-of-the-art methods.
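    The overall pattern (graph convolution over a fixed vertex grid followed by a traditional CNN over the preserved vertex ordering) can be illustrated with the PyTorch sketch below. It assumes vertices have already been aligned to a fixed-size grid; the class, the single propagation step through a normalized adjacency, and all parameter values are hypothetical illustrations, not the QSGCNN architecture itself.

```python
import torch
import torch.nn as nn

class GridGraphConvSketch(nn.Module):
    """Illustrative only: one spatial graph convolution over vertices already
    aligned to a fixed-size grid, followed by a standard 1-D convolution over
    the (preserved) vertex ordering and a global pooling classifier."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.gc = nn.Linear(in_dim, hid_dim)                # graph-conv weights
        self.conv = nn.Conv1d(hid_dim, hid_dim, kernel_size=5, padding=2)
        self.fc = nn.Linear(hid_dim, n_classes)

    def forward(self, adj, x):
        # adj: (B, N, N) normalized adjacency of the aligned grid graph
        # x:   (B, N, F) vertex features in fixed grid order
        h = torch.relu(self.gc(adj @ x))                    # propagate + transform
        h = torch.relu(self.conv(h.transpose(1, 2)))        # CNN over vertex axis
        return self.fc(h.mean(dim=2))                       # pool + classify

# Example: batch of 4 graphs, each aligned to a 32-vertex grid with 16 features
model = GridGraphConvSketch(in_dim=16, hid_dim=64, n_classes=3)
logits = model(torch.eye(32).repeat(4, 1, 1), torch.randn(4, 32, 16))
```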

    Detection of pulmonary tuberculosis using deep learning convolutional neural networks

    The earlier Pulmonary Tuberculosis (PTB) is detected in a patient, the greater the chances of treating and curing the disease, and early detection of PTB could result in an overall lower mortality rate. PTB can be detected in many ways, for instance with tests like the sputum culture test, but conducting such tests is a lengthy process that takes up precious time. The quickest PTB detection method is examining the patient's chest X-ray (CXR) image, although an accurate diagnosis requires a qualified radiologist. Neural networks have been around for several years but are only now making ground-breaking advances in speech and image processing because of the increased processing power at our disposal. Artificial intelligence, especially Deep Learning Convolutional Neural Networks (DLCNN), has the potential to diagnose and detect the disease immediately; if DLCNN can be used in conjunction with professional medical institutions, crucial time and effort can be saved. This project aims to determine and investigate proper methods to identify and detect Pulmonary Tuberculosis in patient chest X-ray images using DLCNN, with detection accuracy and success forming a crucial part of the research. Simulations on an input dataset of infected and healthy patients are carried out. My research consists of, firstly, evaluating the colour depth and image resolution of the input images: the best resolution is found to be 64x64, and a colour depth of 8 bits is found to be optimal for CXR images. Secondly, building on the optimal resolution and colour depth, various image pre-processing techniques are evaluated, and the pre-processed images with the best outcome are used in further simulations. Thirdly, transfer learning, hyperparameter adjustment and data augmentation are evaluated; of these, the best results are obtained from data augmentation. Fourthly, a hybrid approach is proposed: a mixture of computer-aided detection (CAD) and DLCNN using only the lung region-of-interest (ROI) images as training data. Finally, a combination of the proposed hybrid method, augmented data and specific hyperparameter adjustment is evaluated. Overall, the best result is obtained from the proposed hybrid method combined with synthetic augmented data and specific hyperparameter adjustment.
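    As an illustration of the data-augmentation step that gave the best results, the sketch below applies Keras' ImageDataGenerator to placeholder 64x64, 8-bit CXR arrays; the augmentation parameters are assumptions for illustration, not the settings used in the thesis.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Hypothetical augmentation settings for 64x64 grayscale CXR images.
augmenter = ImageDataGenerator(
    rotation_range=10,        # small rotations: patient positioning varies
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=False,    # anatomy is not left-right symmetric in a CXR
    rescale=1.0 / 255.0,      # map 8-bit intensities into [0, 1]
)

# Placeholder data standing in for infected/healthy CXR images and labels.
x_train = np.random.randint(0, 256, (32, 64, 64, 1)).astype("float32")
y_train = np.random.randint(0, 2, (32,))
batches = augmenter.flow(x_train, y_train, batch_size=8)
x_batch, y_batch = next(batches)  # one batch of augmented training images
```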

    Convolutional neural network architecture for skin cancer diagnosis

    Machine learning has lately been the most widely used technique in many applications, and Deep Learning, a branch of machine learning, is among the most applied to medical image analysis, facilitating the diagnosis of diseases and thus supporting better decisions about patients' health. This work addresses the problem of detecting skin cancer from images already classified as malignant or benign melanoma using a deep learning model. To solve this problem, different convolutional neural networks were evaluated in order to obtain the best accuracy on the acquired images. The model established for the problem is based on binary classification, using the value 1 for malignant and 0 for benign cases, so that melanoma can be detected early, which is of great utility. The proposed solution is a new architecture for the training and validation of the images. Finally, the project compares its results with those obtained in another investigation, and the metrics of our project improve considerably with the 3-layer architecture. These results were evaluated using image repositories validated by specialized skin cancer health centers.
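    A minimal sketch of such a 3-convolutional-layer binary classifier (1 = malignant, 0 = benign) in Keras could look as follows; the input size and layer widths are illustrative assumptions, not the architecture proposed in the thesis.

```python
from tensorflow.keras import layers, models

# Illustrative 3-convolutional-layer binary melanoma classifier.
model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # 1 = malignant, 0 = benign
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```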

    Deep Learning-based processing of retinal fundus images to aid in the diagnosis of Diabetic Retinopathy

    Diabetic Retinopathy (DR) is a complication of diabetes and the most frequent cause of blindness in the working-age population of developed countries. However, when treated early, more than 90% of vision loss can be prevented. Retinal fundus photographs captured during regular eye examinations are the standard method for detecting DR. Nevertheless, the worldwide growth in diabetes cases and the shortage of specialists make diagnosis difficult. Fundus images are generally acquired with fundus cameras under varied lighting conditions and angles; they are therefore prone to non-uniform illumination, poor contrast, low brightness and lack of sharpness, resulting in blurry images. Such blurry or poorly illuminated images can affect clinical diagnosis, so enhancing these low-quality images can be very helpful in avoiding misdiagnosis in automatic or manual screening systems. Recently, machine learning, and especially Deep Learning techniques, have revolutionized the field of image reconstruction. For this reason, this work proposes a retinal fundus image enhancement method based on Generative Adversarial Networks (GAN). The model consists of two convolutional neural networks: a generator of synthetic images that aims to fool a discriminator network trained to distinguish the generated high-quality images from real ones. The model can operate on high-resolution images, which makes it widely beneficial for clinical images. The enhancement method comprises a sharpness-correction stage and a second illumination-correction stage.
    For the development and validation of the proposed method, a proprietary database of 1000 images was used, divided into a training set of 800 images and a test set of 200 images, half of which were of insufficient quality for analysis. A multi-stage method was applied to them: first, blurry images were enhanced using a GAN; second, poorly illuminated images were enhanced, also with a GAN. Qualitatively, the results are satisfactory. The results were also evaluated quantitatively from two perspectives: full-reference and no-reference assessment. For no-reference assessment, the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE), the Natural Image Quality Evaluator (NIQE) and entropy were used. For full-reference assessment, the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index Measure (SSIM) were used. Full-reference evaluation serves as a guide for comparing good-quality images that were intentionally degraded, while no-reference evaluation is needed to assess the improvement on genuinely poor-quality images, for which no good-quality version is available.
    In the sharpness-enhancement stage, on the good-quality test images, the results show improvements of 6.22%, 3.33% and 3.26% in terms of PSNR, SSIM and entropy, respectively, while BRISQUE and NIQE do not improve. In the same stage, on the poor-quality test images, the results show improvements of 31.80%, 4.27% and 3.89% in terms of BRISQUE, NIQE and entropy over the original image. Likewise, in the illumination-enhancement stage, the results on the good-quality set show improvements of 156.81%, 14.59%, 3.12% and 2.28% in terms of PSNR, SSIM, BRISQUE and NIQE, while entropy does not improve. In the same stage, on the poor-quality set, the results show improvements of 50.62% and 8.33% in terms of BRISQUE and entropy, while NIQE does not improve. Finally, a last experiment was carried out with both networks in series: the images first pass through the network that corrects illumination, and their sharpness is then corrected by the second network. On the good-quality test images this achieves improvements of 4.84%, 5.68%, 3.38% and 2.57% over the original image in terms of PSNR, SSIM, NIQE and entropy, although BRISQUE does not improve. On the poor-quality test images, improvements of 88.95%, 21.17% and 2.46% are obtained in terms of BRISQUE, NIQE and entropy. These results show that the proposed method could be used as a first stage in automatic retinal image analysis systems to aid in the diagnosis of various eye diseases.
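    For the quality measures that have off-the-shelf implementations, a minimal evaluation sketch might look like the following; the images are random placeholders standing in for a reference fundus image and its enhanced version, and BRISQUE/NIQE are omitted because they require separate implementations (e.g., third-party ports) rather than a standard scikit-image call.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity
from skimage.measure import shannon_entropy

# Placeholder 8-bit grayscale images (reference and its enhanced version).
reference = np.random.randint(0, 256, (512, 512)).astype(np.uint8)
noise = np.random.randint(-5, 6, reference.shape)
enhanced = np.clip(reference.astype(int) + noise, 0, 255).astype(np.uint8)

# Full-reference measures: compare the enhanced image against the reference.
psnr = peak_signal_noise_ratio(reference, enhanced, data_range=255)
ssim = structural_similarity(reference, enhanced, data_range=255)

# No-reference measure: entropy of the enhanced image alone.
entropy = shannon_entropy(enhanced)

print(f"PSNR={psnr:.2f} dB  SSIM={ssim:.4f}  entropy={entropy:.3f}")
```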

    Task-based Optimization of Administered Activity for Pediatric Renal SPECT Imaging

    Like any real-world problem, the design of an imaging system always requires tradeoffs. For medical imaging modalities using ionizing radiation, a major tradeoff is between diagnostic image quality (IQ) and the risk to the patient from absorbed dose (AD). In nuclear medicine, reducing the AD requires reducing the administered activity (AA). Lower AA can reduce risk and adverse effects, but can also result in reduced diagnostic image quality. Thus, it is ultimately desirable to use the lowest AA that gives sufficient image quality for an accurate clinical diagnosis. In this dissertation, we proposed and developed tools for a general framework for optimizing AA with task-based assessment of IQ, where IQ is defined as an objective measure of observer performance on the diagnostic task that the images were acquired to answer. To investigate IQ in terms of renal defect detectability, we developed a projection image database modeling imaging of 99mTc-DMSA, a renal function agent. The database uses a highly realistic population of pediatric phantoms with anatomical and body-morphological variations. Using this projection image database, we have explored patient factors that affect IQ and are currently determining the relationships between IQ and AA in terms of those factors. Our data show that factors more local to the target organ may be more robust than weight for estimating the AA needed to provide a constant IQ across a population of patients; in the case of renal imaging, we have found that girth is more robust than weight (currently used in clinical practice) in predicting the AA needed to provide a desired IQ. In addition to exploring patient factors, we also worked on improving the task-simulating capability of anthropomorphic model observers. We proposed a deep learning-based anthropomorphic model observer to fully and efficiently (in terms of both training data and computational cost) model the clinical 3D detection task using multi-slice, multi-orientation image sets. The proposed model observer could be readily adapted to model human observer performance on detection tasks for other imaging modalities such as PET, CT or MRI.
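    While the dissertation proposes a deep learning-based anthropomorphic observer, the underlying idea of a task-based, objective IQ measure can be illustrated with a classical linear Hotelling observer on synthetic data; everything below (patch size, defect profile, noise model) is a hypothetical stand-in, not the dissertation's observer or data.

```python
import numpy as np

# Linear Hotelling observer sketch for a signal-known-exactly detection task.
rng = np.random.default_rng(0)
n_pix, n_imgs = 64, 500                          # flattened 8x8 image patches

signal = np.zeros(n_pix)
signal[27:29] = 2.0                              # hypothetical defect profile
absent = rng.normal(10.0, 1.0, (n_imgs, n_pix))  # defect-absent images
present = absent + signal                        # defect-present images

delta_s = present.mean(0) - absent.mean(0)       # mean signal difference
K = np.cov(np.vstack([absent, present]).T)       # pooled pixel covariance
w = np.linalg.solve(K + 1e-6 * np.eye(n_pix), delta_s)  # observer template

t_p, t_a = present @ w, absent @ w               # test statistics per image
snr = (t_p.mean() - t_a.mean()) / np.sqrt(0.5 * (t_p.var() + t_a.var()))
print(f"observer detectability SNR = {snr:.2f}")  # task-based IQ figure of merit
```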

    Multimodal machine learning for intelligent mobility

    Scientific problems are solved by finding the optimal solution for a specific task. Some problems can be solved analytically, while others are solved using data-driven methods. The use of digital technologies to improve the transportation of people and goods, referred to as intelligent mobility, is one of the principal beneficiaries of data-driven solutions, and autonomous vehicles are at the heart of the developments that propel it. Due to the high dimensionality and complexity of real-world environments, data-driven solutions need to become commonplace in intelligent mobility, as it is near impossible to manually program decision-making logic for every eventuality. While recent data-driven developments such as deep learning enable machines to learn effectively from large datasets, applications of these techniques within safety-critical systems such as driverless cars remain scarce.
    Autonomous vehicles need to make context-driven decisions autonomously in the different environments in which they operate. The recent literature on driverless-vehicle research is heavily focused on road or highway environments and has discounted pedestrianized areas and indoor environments. These unstructured environments tend to contain more clutter and change rapidly over time. Therefore, for intelligent mobility to make a significant impact on human life, it is vital to extend its application beyond structured environments. To further advance intelligent mobility, researchers need to take cues from multiple sensor streams and multiple machine learning algorithms so that decisions are robust and reliable; only then will machines be able to operate safely in unstructured and dynamic environments. Towards addressing these limitations, this thesis investigates data-driven solutions for crucial building blocks of intelligent mobility: multimodal sensor data fusion, machine learning, multimodal deep representation learning, and their application to intelligent mobility. This work demonstrates that mobile robots can use multimodal machine learning to derive driving policy and therefore make autonomous decisions.
    To facilitate the autonomous decisions necessary to derive safe driving algorithms, we present algorithms for free-space detection and human activity recognition. These decision-making algorithms are driven by datasets collected throughout this study: the Loughborough London Autonomous Vehicle dataset and the Loughborough London Human Activity Recognition dataset, collected using an autonomous platform designed and developed in-house as part of this research. The proposed framework for free-space detection is based on an active learning paradigm that leverages the relative uncertainty of multimodal sensor data streams (ultrasound and camera), and it uses an online learning methodology to continuously update the learnt model whenever the vehicle experiences a new environment. The proposed free-space detection algorithm enables an autonomous vehicle to self-learn, evolve and adapt to environments never encountered before. The results illustrate that the online learning mechanism is superior to one-off training of deep neural networks, which require large datasets to generalize to unfamiliar surroundings. The thesis takes the view that humans should be at the centre of any technological development related to artificial intelligence.
    It is imperative within the spectrum of intelligent mobility that an autonomous vehicle be aware of what humans are doing in its vicinity. Towards improving the robustness of human activity recognition, this thesis proposes a novel algorithm that classifies point-cloud data originating from Light Detection and Ranging (LiDAR) sensors. The proposed algorithm leverages multimodality by using camera data to identify humans and segment the region of interest in the point-cloud data. The corresponding 3-dimensional data are converted to a Fisher Vector representation before being classified by a deep Convolutional Neural Network, as sketched below. The proposed algorithm classifies the indoor activities performed by a human subject with an average precision of 90.3%; compared to an alternative point-cloud classifier, PointNet [1], [2], the proposed framework outperformed it on all classes. The developed autonomous testbed for data collection and algorithm validation, together with the multimodal data-driven solutions for driverless cars, are the major contributions of this thesis. It is anticipated that these results and the testbed will have significant implications for the future of intelligent mobility by amplifying the development of intelligent driverless vehicles.
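    As an illustration of the Fisher Vector encoding step, the sketch below encodes a placeholder point cloud using a simplified Fisher Vector (gradients with respect to the GMM means only; the full encoding also includes weight and variance terms). The data, GMM size and function name are assumptions for illustration, not the thesis pipeline.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector_means(points, gmm):
    """Simplified Fisher Vector (mean-gradient part only) of a set of 3-D
    points, e.g. a segmented human cluster from a LiDAR point cloud."""
    gamma = gmm.predict_proba(points)                # (N, K) soft assignments
    sigma = np.sqrt(gmm.covariances_)                # diag model: (K, D)
    diff = points[:, None, :] - gmm.means_[None]     # (N, K, D) residuals
    fv = (gamma[..., None] * diff / sigma[None]).sum(axis=0)
    fv /= points.shape[0] * np.sqrt(gmm.weights_)[:, None]
    return fv.ravel()                                # fixed-length descriptor

rng = np.random.default_rng(1)
cloud = rng.normal(size=(2000, 3))                   # placeholder LiDAR points
gmm = GaussianMixture(n_components=8, covariance_type="diag").fit(cloud)
descriptor = fisher_vector_means(cloud, gmm)         # input to the CNN classifier
print(descriptor.shape)                              # (8 components * 3 dims,)
```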