73 research outputs found

    PKCAM: Previous Knowledge Channel Attention Module

    Recently, attention mechanisms have been explored with ConvNets across both the spatial and channel dimensions. However, to our knowledge, all existing methods devote their attention modules to capturing local interactions at a single scale. In this paper, we propose a Previous Knowledge Channel Attention Module (PKCAM) that captures channel-wise relations across different layers to model the global context. Our proposed PKCAM is easily integrated into any feed-forward CNN architecture and trained in an end-to-end fashion, with a negligible footprint due to its lightweight design. We validate our novel architecture through extensive experiments on image classification and object detection tasks with different backbones. Our experiments show consistent performance improvements over the corresponding baseline models. Our code is published at https://github.com/eslambakr/EMCA
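    A minimal PyTorch sketch of a cross-layer channel-attention block in the spirit of the module described above: channel statistics from an earlier ("previous knowledge") feature map are fused with the current layer's statistics to re-weight its channels. The pooling choice, fusion scheme, and reduction ratio are assumptions for illustration, not the authors' exact design.

```python
# Sketch of cross-layer channel attention: an earlier feature map contributes
# "previous knowledge" to the channel gates applied to the current feature map.
import torch
import torch.nn as nn

class CrossLayerChannelAttention(nn.Module):
    def __init__(self, prev_channels: int, curr_channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)           # squeeze to per-channel statistics
        self.fc = nn.Sequential(                      # joint excitation MLP over both layers
            nn.Linear(prev_channels + curr_channels, curr_channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(curr_channels // reduction, curr_channels),
            nn.Sigmoid(),
        )

    def forward(self, prev_feat: torch.Tensor, curr_feat: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = curr_feat.shape
        # Concatenate the channel descriptors of the previous and current layers.
        ctx = torch.cat([self.pool(prev_feat).flatten(1),
                         self.pool(curr_feat).flatten(1)], dim=1)
        weights = self.fc(ctx).view(b, c, 1, 1)       # per-channel gates in [0, 1]
        return curr_feat * weights                    # re-weight current features

# Example: attend over a 256-channel map using 64-channel earlier features.
prev = torch.randn(2, 64, 56, 56)
curr = torch.randn(2, 256, 28, 28)
out = CrossLayerChannelAttention(64, 256)(prev, curr)  # shape: (2, 256, 28, 28)
```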

    Support Vector Machine (SVM) Recognition Approach adapted to Individual and Touching Moths Counting in Trap Images

    This paper aims at developing an automatic algorithm for moth recognition from trap images in real-world conditions. The method builds on our previous detection work [1] and introduces an adapted classification step. More precisely, an SVM classifier is trained with a multi-scale descriptor, the Histogram of Curviness Saliency (HCS). This descriptor is robust to illumination changes and is able to detect and describe both the external and internal contours of the target insect across multiple scales. The proposed classification method can be trained with a small set of images. Quantitative evaluations show that the proposed method classifies insects with higher accuracy (95.8%) than state-of-the-art approaches.
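    A minimal scikit-learn sketch of the classification stage: an SVM trained on precomputed multi-scale descriptors, one feature vector per detected insect region. The HCS descriptor itself is not reproduced here; the random `X` below stands in for those descriptors, and the RBF kernel and hyper-parameter grid are illustrative assumptions.

```python
# SVM classification on placeholder descriptor vectors (one per insect region).
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 128))          # placeholder HCS-like descriptors
y = rng.integers(0, 2, size=200)         # 1 = target moth, 0 = other

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Small training set, so scale the features and cross-validate the SVM hyper-parameters.
model = GridSearchCV(
    make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    param_grid={"svc__C": [1, 10, 100], "svc__gamma": ["scale", 0.01]},
    cv=5,
)
model.fit(X_tr, y_tr)
print("held-out accuracy:", model.score(X_te, y_te))
```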

    FGR-Net: interpretable fundus image gradeability classification based on deep reconstruction learning

    The performance of diagnostic Computer-Aided Design (CAD) systems for retinal diseases depends on the quality of the retinal images being screened. Thus, many studies have been developed to evaluate and assess the quality of such retinal images. However, most of them did not investigate the relationship between the accuracy of the developed models and the quality of the visualization of interpretability methods for distinguishing between gradable and non-gradable retinal images. Consequently, this paper presents a novel framework called "FGR-Net" to automatically assess and interpret underlying fundus image quality by merging an autoencoder network with a classifier network. The FGR-Net model also provides an interpretable quality assessment through visualizations. In particular, FGR-Net uses a deep autoencoder to reconstruct the input image in order to extract the visual characteristics of the input fundus images based on self-supervised learning. The features extracted by the autoencoder are then fed into a deep classifier network to distinguish between gradable and ungradable fundus images. FGR-Net is evaluated with different interpretability methods, which indicate that the autoencoder is a key factor in forcing the classifier to focus on the relevant structures of the fundus images, such as the fovea, optic disk, and prominent blood vessels. Additionally, the interpretability methods can provide visual feedback for ophthalmologists to understand how our model evaluates the quality of fundus images. The experimental results showed the superiority of FGR-Net over the state-of-the-art quality assessment methods, with an accuracy of > 89% and an F1-score of > 87%. The code is publicly available at https://github.com/saifalkh/FGR-Net.
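    A minimal PyTorch sketch of the idea described above: a convolutional autoencoder reconstructs the fundus image while its encoder features also feed a gradability classifier, and the two losses are trained jointly. Layer sizes and the loss weighting are assumptions, not the published architecture.

```python
# Shared encoder feeding both a reconstruction decoder and a gradability head.
import torch
import torch.nn as nn

class FGRSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 2),  # gradable vs. ungradable
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), self.classifier(z)

model = FGRSketch()
x = torch.rand(4, 3, 128, 128)                   # batch of fundus images scaled to [0, 1]
labels = torch.randint(0, 2, (4,))
recon, logits = model(x)
# Joint objective: reconstruction pushes the encoder toward retinal structure,
# classification separates gradable from ungradable images.
loss = nn.functional.mse_loss(recon, x) + nn.functional.cross_entropy(logits, labels)
loss.backward()
```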

    Development of A New Local Mineral Admixture for Enhancing Concrete properties

    Proceeding from the saying of our God almighty in his book, the holy Qur'an: "Then ignite for me, O Hāmān, (a fire) upon the clay (from which bricks are made) and make for me a tower....", this paper presents an investigation of using calcined ball-clay (CBC) as a mineral pozzolanic admixture for concrete production. CBC is obtained by calcining local ball-clay under specified conditions. To evaluate the ball-clay calcination process, various temperatures (600–900 °C) and burning durations (2, 3 and 4 hours) are used, and the optimum calcination temperature and burning time are assessed by the strength activity index at an age of 28 days. The development of hardened properties of concrete mixtures containing 0%, 10%, 15% and 20% CBC as partial cement replacement is analysed in terms of compressive strength at 7, 28, 90 and 180 days, water absorption, ultrasonic pulse velocity and electrical resistivity. In addition, the microstructure of cement pastes incorporating CBC was studied by XRD. The results showed that the optimum calcination process to obtain CBC is carried out at 800 °C for 4 hours. Replacing 10% of the cement with CBC is the optimal dosage for concrete mixtures, since it increased compressive strength by 28% compared with the control mix. Therefore, adding CBC can lead to a beneficial utilization of natural local resources, which reduces energy consumption and minimizes the CO2 footprint of cement concrete manufacturing; thus, concrete can become an eco-friendly and sustainable material.
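    A short sketch of how the strength activity index used above to select the calcination conditions is commonly computed (as in ASTM C311: the 28-day strength of the mix containing the admixture as a percentage of the control strength). The strength values below are placeholders, not the paper's measurements.

```python
# Strength activity index (SAI) = (test-mix strength / control-mix strength) * 100,
# evaluated here at 28 days; numbers are illustrative placeholders only.
def strength_activity_index(test_strength_mpa: float, control_strength_mpa: float) -> float:
    return 100.0 * test_strength_mpa / control_strength_mpa

control_28d = 40.0        # MPa, plain-cement control mix (placeholder)
cbc_mix_28d = 38.5        # MPa, mix with CBC replacement (placeholder)
print(f"SAI at 28 days: {strength_activity_index(cbc_mix_28d, control_28d):.1f}%")
```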

    Breast tumor segmentation in ultrasound images using contextual-information-aware deep adversarial learning framework.

    Automatic tumor segmentation in breast ultrasound (BUS) images is still a challenging task because of many sources of uncertainty, such as speckle noise, very low signal-to-noise ratio, shadows that make the anatomical boundaries of tumors ambiguous, as well as the highly variable tumor sizes and shapes. This article proposes an efficient automated method for tumor segmentation in BUS images based on a contextual information-aware conditional generative adversarial learning framework. Specifically, we exploit several enhancements on a deep adversarial learning framework to capture both texture features and contextual dependencies in the BUS images that help address the challenges mentioned above. First, we adopt atrous convolution (AC) to capture spatial and scale context (i.e., position and size of tumors) to handle very different tumor sizes and shapes. Second, we propose the use of channel attention along with channel weighting (CAW) mechanisms to promote the tumor-relevant features (without extra supervision) and mitigate the effects of artifacts. Third, we propose to integrate the structural similarity index metric (SSIM) and L1-norm into the loss function of the adversarial learning framework to capture the local context information derived from the area surrounding the tumors. We used two BUS image datasets to assess the efficiency of the proposed model. The experimental results show that the proposed model achieves competitive results compared with state-of-the-art segmentation models in terms of Dice and IoU metrics. The source code of the proposed model is publicly available at https://github.com/vivek231/Breast-US-project
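    A minimal PyTorch sketch of the generator objective described above: an adversarial term plus SSIM and L1 terms comparing the predicted tumor mask with the ground truth. The SSIM here is a simplified global (non-windowed) version, and the loss weights are illustrative assumptions, not the paper's settings.

```python
# Combined adversarial + SSIM + L1 generator loss for mask prediction.
import torch
import torch.nn.functional as F

def global_ssim(x: torch.Tensor, y: torch.Tensor,
                c1: float = 0.01 ** 2, c2: float = 0.03 ** 2) -> torch.Tensor:
    # Simplified SSIM computed over whole images instead of local windows.
    mu_x, mu_y = x.mean(dim=(1, 2, 3)), y.mean(dim=(1, 2, 3))
    var_x, var_y = x.var(dim=(1, 2, 3)), y.var(dim=(1, 2, 3))
    cov = ((x - mu_x.view(-1, 1, 1, 1)) * (y - mu_y.view(-1, 1, 1, 1))).mean(dim=(1, 2, 3))
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return ssim.mean()

def generator_loss(disc_fake_logits, pred_mask, true_mask, lam_l1=10.0, lam_ssim=5.0):
    # Adversarial term: fool the discriminator into labelling the prediction as real.
    adv = F.binary_cross_entropy_with_logits(disc_fake_logits, torch.ones_like(disc_fake_logits))
    l1 = F.l1_loss(pred_mask, true_mask)                   # pixel-wise fidelity
    ssim_term = 1.0 - global_ssim(pred_mask, true_mask)    # higher SSIM -> lower loss
    return adv + lam_l1 * l1 + lam_ssim * ssim_term

# Example with random tensors standing in for discriminator output and masks.
loss = generator_loss(torch.randn(2, 1), torch.rand(2, 1, 64, 64), torch.rand(2, 1, 64, 64))
print(loss.item())
```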

    Food places classification in egocentric images using Siamese neural networks.

    Wearable cameras have become more popular in recent years for capturing unscripted moments in the first person, which helps in analysing the user's lifestyle. In this work, we aim to identify the daily food patterns of a person through the recognition of food-related places in egocentric (first-person) images. This has the potential to support systems that assist with improving eating habits and preventing diet-related conditions. In this paper, we use Siamese Neural Networks (SNNs) to learn similarities between images for one-shot "food places" classification. We tested our proposed method on "MiniEgoFoodPlaces", which covers 15 food-related locations. The proposed SNN model with MobileNet achieved an overall classification accuracy of 76.74% and 77.53% on the validation and test sets of the "MiniEgoFoodPlaces" dataset, respectively, outperforming baseline models such as ResNet50, InceptionV3 and InceptionResNetV2.
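    A minimal PyTorch sketch of a Siamese network for one-shot "food places" matching: two images pass through a shared encoder, and the absolute difference of their embeddings drives a same-place/different-place prediction. MobileNet is used as the shared backbone, as in the abstract; the embedding size and similarity head are assumptions.

```python
# Siamese matching with a shared MobileNetV2 encoder.
import torch
import torch.nn as nn
from torchvision import models

class SiameseFoodPlaces(nn.Module):
    def __init__(self, embed_dim: int = 256):
        super().__init__()
        backbone = models.mobilenet_v2(weights=None)
        self.encoder = nn.Sequential(backbone.features, nn.AdaptiveAvgPool2d(1),
                                     nn.Flatten(), nn.Linear(1280, embed_dim))
        self.head = nn.Linear(embed_dim, 1)      # similarity score from |e_a - e_b|

    def forward(self, img_a, img_b):
        e_a, e_b = self.encoder(img_a), self.encoder(img_b)
        return self.head(torch.abs(e_a - e_b)).squeeze(1)   # logit: same place?

model = SiameseFoodPlaces()
a, b = torch.rand(2, 3, 224, 224), torch.rand(2, 3, 224, 224)
logits = model(a, b)
# One-shot classification at test time: compare a query image against one exemplar
# per food place and assign it to the place with the highest similarity score.
```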

    SLSNet: Skin lesion segmentation using a lightweight generative adversarial network

    The determination of precise skin lesion boundaries in dermoscopic images using automated methods faces many challenges, most importantly the presence of hair, inconspicuous lesion edges and low contrast in dermoscopic images, and variability in the color, texture and shapes of skin lesions. Existing deep learning-based skin lesion segmentation algorithms are expensive in terms of computational time and memory. Consequently, running such segmentation algorithms requires a powerful GPU and high bandwidth memory, which are not available in dermoscopy devices. Thus, this article aims to achieve precise skin lesion segmentation with minimum resources: a lightweight, efficient generative adversarial network (GAN) model called SLSNet, which combines 1-D kernel factorized networks, position and channel attention, and multiscale aggregation mechanisms within a GAN framework. The 1-D kernel factorized network reduces the computational cost of 2-D filtering. The position and channel attention modules enhance the discriminative ability between the lesion and non-lesion feature representations in the spatial and channel dimensions, respectively. A multiscale block is also used to aggregate the coarse-to-fine features of input skin images and reduce the effect of artifacts. SLSNet is evaluated on two publicly available datasets: ISBI 2017 and ISIC 2018. Although SLSNet has only 2.35 million parameters, the experimental results demonstrate that it achieves segmentation results on a par with the state-of-the-art skin lesion segmentation methods, with an accuracy of 97.61%, and Dice and Jaccard similarity coefficients of 90.63% and 81.98%, respectively. SLSNet can run at more than 110 frames per second (FPS) on a single GTX 1080 Ti GPU, which is faster than well-known deep learning-based image segmentation models such as FCN. Therefore, SLSNet can be used for practical dermoscopic applications.
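    A minimal PyTorch sketch of the 1-D kernel factorization mentioned above: a k x k convolution is approximated by a k x 1 followed by a 1 x k convolution, reducing the per-position cost from k^2 to 2k multiplications per channel pair when the channel count is unchanged. The channel sizes below are illustrative, not SLSNet's exact configuration.

```python
# Factorized 2-D convolution: k x 1 followed by 1 x k.
import torch
import torch.nn as nn

def factorized_conv(in_ch: int, out_ch: int, k: int = 3) -> nn.Sequential:
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=(k, 1), padding=(k // 2, 0)),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=(1, k), padding=(0, k // 2)),
        nn.ReLU(inplace=True),
    )

block = factorized_conv(64, 64, k=3)
x = torch.rand(1, 64, 128, 128)
print(block(x).shape)          # torch.Size([1, 64, 128, 128]) -- spatial size preserved

# Parameter comparison against a standard 3x3 convolution with the same channels.
std = nn.Conv2d(64, 64, 3, padding=1)
n_fact = sum(p.numel() for p in block.parameters())
n_std = sum(p.numel() for p in std.parameters())
print(f"factorized: {n_fact} params vs. standard 3x3: {n_std} params")
```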