25 research outputs found

    Measuring (in)variances in Convolutional Networks

    Convolutional neural networks (CNNs) offer state-of-the-art performance in computer vision tasks such as activity recognition, face detection, and medical image analysis. Many of these tasks require invariance to image transformations (e.g., rotations, translations, or scaling). This work proposes a versatile, straightforward, and interpretable measure to quantify the (in)variance of CNN activations with respect to transformations of the input. Intermediate outputs of feature maps and fully connected layers are also analyzed under different input transformations. The technique is applicable to any type of neural network and/or transformation. It is validated on rotation transformations and used to compare the relative (in)variance of several networks. Specifically, ResNet, AllConvolutional, and VGG architectures were trained on the CIFAR10 and MNIST databases with and without rotational data augmentation. Experiments reveal that the rotational (in)variance of CNN outputs is class conditional. A distribution analysis also shows that the lower layers are the most invariant, which appears to contradict previous guidelines that recommend placing invariances near the network output and equivariances near the input.
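The variance-based measure described above can be sketched in a few lines. This is an illustrative reading of the idea, not the paper's actual formulation: for each unit, the variance of its activation across transformed versions of an input is compared against the variance of its mean response across inputs. All activation values are invented toy data.

```python
from statistics import pvariance, mean

def invariance_score(activations_per_sample):
    """activations_per_sample: list of lists; each inner list holds one
    unit's activation for every transformed version of one input sample.
    Returns the ratio of mean within-transformation variance to the
    variance across samples; lower values mean more invariance."""
    # Variance of the unit across transformations, averaged over samples.
    transform_var = mean(pvariance(a) for a in activations_per_sample)
    # Variance of the unit's mean response across different samples.
    sample_var = pvariance([mean(a) for a in activations_per_sample])
    return transform_var / sample_var if sample_var else float("inf")

# A unit whose output barely changes under rotation scores near 0;
# a rotation-sensitive unit scores high.
invariant_unit = [[1.0, 1.01, 0.99], [3.0, 3.02, 2.98], [5.0, 4.99, 5.01]]
variant_unit = [[1.0, 4.0, 2.0], [3.0, 0.5, 5.0], [2.0, 6.0, 1.0]]
```

A score near zero indicates a rotation-invariant unit; large scores indicate a transformation-sensitive one.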

    Invariance Measures for Neural Networks

    Invariances in neural networks are useful and necessary for many tasks. However, the invariance of most neural network models' internal representations has not been characterized. We propose measures that quantify the invariance of neural networks in terms of their internal representations. The measures are efficient and interpretable, can be applied to any neural network model, and are more sensitive to invariance than previously defined measures. We validate the measures and their properties, including stability and interpretability, in the domain of affine transformations on the CIFAR10 and MNIST datasets. Using the measures, we perform a first analysis of CNN models and show that their internal invariance is remarkably stable across random weight initializations, but not across changes in dataset or transformation. We believe these measures will enable new avenues of research into invariance representation.
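One way to check the stability claim above is to correlate per-unit invariance scores obtained from two independently initialized training runs: if internal invariance is stable, the scores should agree. The scores below are invented for illustration, and Pearson correlation is an assumed choice of agreement metric, not necessarily the paper's.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-unit invariance scores from two training seeds.
run_a = [0.05, 0.10, 0.40, 0.80, 0.20]  # seed 1
run_b = [0.06, 0.12, 0.35, 0.85, 0.18]  # seed 2
```

A correlation close to 1 across seeds would support the stability claim; a low correlation across datasets or transformations would match the reported instability there.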

    A color fusion model based on Markowitz portfolio optimization for optic disc segmentation in retinal images

    Retinal disorders are a severe health threat for older adults because they may lead to vision loss and blindness. Diabetic patients are particularly prone to diabetic retinopathy. Identifying relevant structural components in color fundus images, such as the optic disc (OD), is crucial for diagnosing retinal diseases. Automatic OD detection is complex because the OD lies in an area where blood vessels converge and its color distribution is uneven. Several image processing techniques have been developed for OD detection, but some require vessel segmentation, increasing computational complexity and time. Moreover, precise OD segmentation methods rely on complex algorithms that need special hardware or extensively labeled datasets. We propose an OD detection approach based on Markowitz's Modern Portfolio Theory to generate an innovative color fusion model. Specifically, the training phase calculates the optimal weights for each color channel, and a fusion of the weighted color channels is then applied in the testing phase. This approach acts as a powerful, real-time preprocessing stage. We use four heterogeneous datasets to validate the presented methodology; three are publicly available (DRIVE, Messidor, and HRF), and the fourth is an in-house dataset acquired from Hospital Universitari Sant Joan de Reus (Spain). Two segmentation methods are presented and compared with state-of-the-art computer vision techniques to analyze the model's performance. Outstanding accuracy and overlap, above 0.9 and 80% respectively, are reached with a minimal execution time of 0.05 s. Therefore, our model could be integrated into daily clinical practice to accelerate the diagnosis of diabetic retinopathy thanks to its simplicity, performance, and speed.
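The weighted color-channel fusion described above can be illustrated with a toy sketch. The mean-variance weighting below (mean intensity as "return", variance as "risk") is a deliberately simplified stand-in for the paper's Markowitz optimization, and all pixel intensities are invented.

```python
from statistics import mean, pvariance

def mean_variance_weights(channels):
    """Weight each channel by mean/variance (return over risk), normalized
    so the weights sum to 1, loosely echoing a portfolio allocation."""
    raw = [mean(c) / (pvariance(c) + 1e-9) for c in channels]
    total = sum(raw)
    return [r / total for r in raw]

def fuse_channels(channels, weights):
    """Per-pixel weighted sum of channel intensities."""
    return [sum(w * c[i] for w, c in zip(weights, channels))
            for i in range(len(channels[0]))]

# Toy 4-pixel image: red is bright and steady, blue is dim and noisy.
red   = [0.9, 0.88, 0.91, 0.89]
green = [0.5, 0.55, 0.45, 0.50]
blue  = [0.1, 0.80, 0.05, 0.70]
w = mean_variance_weights([red, green, blue])
fused = fuse_channels([red, green, blue], w)
```

As in a portfolio, the stable high-mean channel ends up dominating the fused image, which is the behavior the preprocessing stage relies on.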

    Convexity shape constraints for retinal blood vessel segmentation and foveal avascular zone detection

    Diabetic retinopathy (DR) has become a major worldwide health problem due to the increase in blindness among diabetics at early ages. The detection of DR pathologies such as microaneurysms, hemorrhages, and exudates through advanced computational techniques is of utmost importance in patient health care. New computer vision techniques are needed to improve upon traditional screening of color fundus images. Segmenting the entire anatomical structure of the retina is a crucial phase in detecting these pathologies. This work proposes a novel framework for fast and fully automatic blood vessel segmentation and fovea detection. The preprocessing stage combines the contrast-limited adaptive histogram equalization and brightness-preserving dynamic fuzzy histogram equalization algorithms to enhance image contrast and eliminate noise artifacts. Afterwards, the color spaces and their intrinsic components are examined to identify the color model that best reveals the foreground pixels against the background. Several samples are then collected and used by the well-known convexity shape prior segmentation algorithm. The proposed methodology achieves average vasculature segmentation accuracies exceeding 96%, 95%, 98%, and 94% on the publicly available DRIVE, STARE, HRF, and Messidor datasets, respectively. An additional validation step reaches an average accuracy of 94.30% on an in-house dataset provided by the Hospital Sant Joan of Reus (Spain). Moreover, an outstanding detection accuracy of over 98% is achieved for the foveal avascular zone. An extensive state-of-the-art comparison was also conducted. The proposed approach can thus be integrated into daily clinical practice to assist medical experts in the diagnosis of DR.
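The color-space examination step above can be sketched as picking the channel that best separates vessel pixels from background. The Fisher-style separability score and all sample intensities below are assumptions for illustration, not the paper's actual criterion.

```python
from statistics import mean, pvariance

def separability(fg, bg):
    """Fisher-like ratio: squared gap between class means over pooled
    variance; higher means the two pixel populations separate better."""
    return (mean(fg) - mean(bg)) ** 2 / (pvariance(fg) + pvariance(bg) + 1e-9)

def best_channel(samples):
    """samples: {channel_name: (fg_pixels, bg_pixels)} -> name of the
    channel with the highest foreground/background separability."""
    return max(samples, key=lambda ch: separability(*samples[ch]))

# Invented intensity samples; in fundus images the green component is
# often the one where dark vessels contrast most with the background.
samples = {
    "red":   ([0.80, 0.82, 0.78], [0.75, 0.77, 0.74]),  # low contrast
    "green": ([0.20, 0.22, 0.18], [0.60, 0.62, 0.58]),  # clear gap
    "blue":  ([0.10, 0.30, 0.20], [0.15, 0.35, 0.25]),  # noisy overlap
}
```

The winning channel would then feed the convexity shape prior segmentation stage.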

    Validation of an autonomous artificial intelligence-based diagnostic system for holistic maculopathy screening in a routine occupational health checkup context

    Acord transformatiu CRUE-CSIC
    Purpose: This study aims to evaluate the ability of an autonomous artificial intelligence (AI) system to detect the most common central retinal pathologies in fundus photography. Methods: Retrospective diagnostic test evaluation on a raw dataset of 5918 images (2839 individuals) acquired with non-mydriatic cameras during routine occupational health checkups. Three camera models were employed: Optomed Aurora (field of view, FOV, 50°; 88% of the dataset), ZEISS VISUSCOUT 100 (FOV 40°; 9%), and Optomed SmartScope M5 (FOV 40°; 3%). Image acquisition took 2 min per patient. The ground truth for each image was determined by two masked retina specialists, and disagreements were resolved by a third retina specialist. The specific pathologies considered for evaluation were diabetic retinopathy (DR), age-related macular degeneration (AMD), glaucomatous optic neuropathy (GON), and nevus. Images with maculopathy signs that did not match the described taxonomy were classified as "Other." Results: The combination of algorithms for detecting any abnormality had an area under the curve (AUC) of 0.963, with a sensitivity of 92.9% and a specificity of 86.8%. The individual algorithms obtained the following results: AMD, AUC 0.980 (sensitivity 93.8%; specificity 95.7%); DR, AUC 0.950 (sensitivity 81.1%; specificity 94.8%); GON, AUC 0.889 (sensitivity 53.6%; specificity 95.7%); nevus, AUC 0.931 (sensitivity 86.7%; specificity 90.7%). Conclusion: Our holistic AI approach reaches high diagnostic accuracy in the simultaneous detection of DR, AMD, and nevus. The integration of pathology-specific algorithms permits higher sensitivities with minimal impact on specificity, and it reduces the risk of missing incidental findings. Deep learning may facilitate wider screening for eye diseases.
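The way pathology-specific detectors combine into an "any abnormality" decision, and the sensitivity/specificity figures quoted above, can be sketched as follows. The detector outputs here are toy labels, not study data.

```python
def sensitivity_specificity(truth, pred):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP) from two
    equal-length lists of booleans (disease present / detector fired)."""
    tp = sum(t and p for t, p in zip(truth, pred))
    tn = sum((not t) and (not p) for t, p in zip(truth, pred))
    fp = sum((not t) and p for t, p in zip(truth, pred))
    fn = sum(t and (not p) for t, p in zip(truth, pred))
    return tp / (tp + fn), tn / (tn + fp)

def any_abnormality(*detector_outputs):
    """Flag an image as abnormal if any pathology-specific detector fires."""
    return [any(flags) for flags in zip(*detector_outputs)]

# Toy per-image flags from two hypothetical detectors.
dr_flags  = [True, False, False, True, False]
amd_flags = [False, False, True, False, False]
combined = any_abnormality(dr_flags, amd_flags)
```

OR-ing the detectors raises combined sensitivity (any one firing suffices) at some cost in specificity, which matches the trade-off reported above.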

    A first glance to the quality assessment of dental photostimulable phosphor plates with deep learning

    Photostimulable phosphor (PSP) plates are commonly used in digital X-ray imaging for dentistry. With use, these plates become damaged, affecting the diagnostic performance and confidence of the dental professional. We propose a deep learning based classifier that decides whether to discard or extend the use of PSP plates based on their physical damage. The system automatically assesses, for the first time in the literature, when dentists should discard their plates. To validate our methodology, an in-house dataset was built from 25 PSP artifact masks (Carestream CS 7600) digitally superimposed over 100 complementary metal-oxide-semiconductor (CMOS) periapical images (Carestream RVG 6200) with known radiologic interpretations. From these 2500 images, unique subsets of 100 images were evaluated by 25 dentists to find periapical inflammatory lesions in the teeth. The dentists' opinions on whether the plates should be discarded were also collected. State-of-the-art deep convolutional networks were tested using fivefold cross-validation, yielding classification accuracies from 87% to almost 89%. Specifically, InceptionV3 and ResNet50 obtained the best performances, with statistical significance. Qualitative heat maps showed that such models can identify and employ artifacts to decide whether to discard a PSP plate. This work is intended as a baseline for future work on automatic PSP plate assessment.
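The fivefold cross-validation protocol mentioned above can be sketched as index partitioning; the contiguous-split convention used here is an assumption, and model training itself is omitted.

```python
def kfold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs so that every sample appears in
    exactly one validation fold; fold sizes differ by at most one."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples)
                 if i < start or i >= start + size]
        yield train, val
        start += size

# Example: 10 images split into 5 folds of 2 validation images each.
folds = list(kfold_indices(10, k=5))
```

Each network is trained k times, once per (train, val) pair, and the reported accuracy is aggregated across the folds.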

    Generalisability of deep learning models in low-resource imaging settings: A fetal ultrasound study in 5 African countries

    Most artificial intelligence (AI) research and innovation has concentrated in high-income countries, where imaging data, IT infrastructure, and clinical expertise are plentiful. However, progress has been slower in resource-limited environments where medical imaging is needed. For example, in Sub-Saharan Africa the rate of perinatal mortality is very high due to limited access to antenatal screening. In these countries, AI models could be implemented to help clinicians acquire fetal ultrasound planes for the diagnosis of fetal abnormalities. Deep learning models have been proposed to identify standard fetal planes, but there is no evidence of their ability to generalise to centres with low resources, i.e. with limited access to high-end ultrasound equipment and ultrasound data. This work investigates, for the first time, different strategies to reduce the domain-shift effect that arises when a fetal plane classification model trained at a clinical centre with high-resource settings is transferred to a new centre with low-resource settings. To that end, a classifier trained on 1,792 patients from Spain is first evaluated, under optimal conditions, on a new centre in Denmark with 1,008 patients, and is later optimised to reach the same performance in five African centres (Egypt, Algeria, Uganda, Ghana, and Malawi) with 25 patients each. The results show that a transfer learning approach for domain adaptation can integrate small African samples with existing large-scale databases from developed countries. In particular, the model can be re-aligned and optimised to boost performance on African populations, increasing recall to 0.92±0.04 while maintaining high precision across centres. This framework shows promise for building AI models that generalise across clinical centres with limited data acquired in challenging and heterogeneous conditions, and it calls for further research into solutions that make AI usable in countries with fewer resources and, consequently, greater need of clinical support.
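One simple reading of the "re-alignment" step above is decision-threshold tuning on a small target-domain sample: pick the lowest threshold (highest recall) whose precision stays above a floor. This is an illustrative proxy for the paper's transfer-learning procedure, with invented scores and labels.

```python
def precision_recall(labels, scores, thr):
    """Precision and recall of the rule `score >= thr` against labels."""
    pred = [s >= thr for s in scores]
    tp = sum(l and p for l, p in zip(labels, pred))
    fp = sum((not l) and p for l, p in zip(labels, pred))
    fn = sum(l and (not p) for l, p in zip(labels, pred))
    prec = tp / (tp + fp) if tp + fp else 1.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return prec, rec

def best_threshold(labels, scores, min_precision=0.8):
    """Lowest threshold (hence highest recall) keeping precision above
    the floor; scanned from high to low so the last qualifier wins."""
    candidates = sorted(set(scores), reverse=True)
    best = candidates[0]
    for thr in candidates:
        prec, _ = precision_recall(labels, scores, thr)
        if prec >= min_precision:
            best = thr
    return best

# Toy target-domain calibration sample.
labels = [True, True, True, False, True, False]
scores = [0.9, 0.7, 0.6, 0.5, 0.4, 0.2]
thr = best_threshold(labels, scores, min_precision=0.75)
```

Lowering the operating threshold on the small African calibration sets is one plausible way recall could be boosted while precision is held across centres.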