
    Dilated FCN: Listening Longer to Hear Better

    Deep neural network solutions have emerged as a new and powerful paradigm for speech enhancement (SE). The ability to capture long context and extract multi-scale patterns is crucial to designing effective SE networks. Such capabilities, however, are often in conflict with the goal of maintaining compact networks that generalize well. In this paper, we explore dilation operations and apply them to fully convolutional networks (FCNs) to address this issue. Dilations equip the networks with greatly expanded receptive fields without increasing the number of parameters. Different strategies for fusing multi-scale dilations, as well as for installing the dilation modules, are explored in this work. Using the Noisy VCTK and AzBio sentence datasets, we demonstrate that the proposed dilation models significantly improve over the baseline FCN and outperform state-of-the-art SE solutions. Comment: 5 pages; to appear at the WASPAA conference.
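The abstract's central claim, that dilation expands the receptive field without adding parameters, follows from a simple formula: for a stack of stride-1 convolutions, the receptive field is 1 plus the sum of (kernel size − 1) × dilation over the layers. A minimal sketch (the layer configurations are illustrative, not taken from the paper):

```python
# Receptive field of a stack of stride-1 1-D convolutions:
# RF = 1 + sum((kernel_size - 1) * dilation) over all layers.

def receptive_field(layers):
    """layers: list of (kernel_size, dilation) tuples for stride-1 convs."""
    return 1 + sum((k - 1) * d for k, d in layers)

# Three 3-tap layers without dilation cover only 7 samples...
plain = receptive_field([(3, 1), (3, 1), (3, 1)])
# ...while exponentially growing dilations cover 15 samples
# with exactly the same parameter count.
dilated = receptive_field([(3, 1), (3, 2), (3, 4)])
print(plain, dilated)  # 7 15
```

Doubling the dilation at each layer makes the receptive field grow exponentially in depth, which is why dilated FCNs can "listen longer" while staying compact.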

    Medical Image Segmentation Review: The success of U-Net

    Automatic medical image segmentation is a crucial topic in the medical domain and a critical component of the computer-aided diagnosis paradigm. U-Net is the most widespread image segmentation architecture due to its flexibility, optimized modular design, and success across all medical image modalities. Over the years, the U-Net model has attracted tremendous attention from academic and industrial researchers. Several extensions of this network have been proposed to address the scale and complexity of medical tasks. Understanding the deficiencies of the naive U-Net model is the first step toward selecting the proper U-Net variant for a given application. Having a compendium of the different variants in one place makes it easier for practitioners to identify the relevant research, and it helps ML researchers understand the challenges that biological tasks pose for the model. To this end, we discuss the practical aspects of the U-Net model and suggest a taxonomy to categorize each network variant. Moreover, to measure the performance of these strategies in clinical applications, we propose fair evaluations of some unique and famous designs on well-known datasets. We provide a comprehensive implementation library with trained models for future research. In addition, for ease of future studies, we created an online list of U-Net papers with their possible official implementations. All information is gathered in the https://github.com/NITR098/Awesome-U-Net repository. Comment: Submitted to the IEEE Transactions on Pattern Analysis and Machine Intelligence journal.
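The defining U-Net feature the review is built around is the skip connection: encoder activations are stored and concatenated onto decoder activations at the matching resolution. A toy numpy sketch of that data flow (shapes and pooling choices are illustrative only, not from any specific variant):

```python
import numpy as np

# Toy U-Net data flow: save an encoder feature map, pool down,
# upsample back, then concatenate along channels (the "skip").

def down(x):                        # 2x2 max pooling (encoder step)
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def up(x):                          # nearest-neighbour upsampling (decoder step)
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.random.rand(64, 64, 16)      # input feature map (H, W, C)
skip = x                            # stored encoder activation
bottleneck = down(x)                # (32, 32, 16)
decoded = up(bottleneck)            # back to (64, 64, 16)
fused = np.concatenate([decoded, skip], axis=-1)
print(fused.shape)                  # (64, 64, 32): channels doubled by the skip
```

The concatenation is what lets the decoder recover fine spatial detail lost in pooling, and most variants in the review's taxonomy alter exactly this pathway.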

    Fuzzy Logic with Deep Learning for Detection of Skin Cancer

    Melanoma is the deadliest type of skin cancer; it develops when melanocytes, the melanin-producing cells, begin to grow uncontrollably. If not detected and treated in situ, it reduces the patient's chances of survival. Diagnosing a melanoma lesion remains a challenging task due to its visual similarity to benign lesions. In this paper, fuzzy logic-based image segmentation along with a modified deep learning model is proposed for skin cancer detection. The highlight of the paper is its dermoscopic image enhancement using pre-processing techniques, the infusion of mathematical logic, standard deviation methods, and the L-R fuzzy defuzzification method to improve segmentation results. These pre-processing steps are designed to improve the visibility of the lesion by removing artefacts such as hair follicles and dermoscopic scales. Thereafter, the image is enhanced by histogram equalization and segmented by the proposed method prior to the detection phase. The modified model employs the You Only Look Once (YOLO) deep neural network algorithm, built on a deep convolutional neural network (DCNN), to detect melanoma lesions in digital and dermoscopic images. The YOLO model is composed of a series of DCNN layers; we add depth through extra convolutional layers and residual connections, and we introduce feature concatenation at different layers to combine multi-scale features. Our experimental results confirm that YOLO provides a better accuracy score and is faster than most pre-existing classifiers. The classifier is trained with 2000 and 8695 dermoscopic images from the ISIC 2017 and ISIC 2018 datasets, respectively, while the PH2 dataset, along with both of the aforementioned datasets, is used for testing the proposed algorithm.

    Diagnosis of skin cancer using novel computer vision and deep learning techniques

    Recent years have seen an increase in the total number of skin cancer cases, and it is projected to grow exponentially; however, the mortality rate of malignant melanoma can be decreased if it is diagnosed and treated at an early stage. The visual similarity between benign and malignant lesions makes diagnosis difficult even for an expert dermatologist, increasing the chances of false prediction. This dissertation proposes two novel computer-aided diagnosis methods for the classification of malignant lesions. The first method pre-processes the acquired image with the Dull Razor method (for digital hair removal) and histogram equalisation. The image is then segmented by the proposed LR-fuzzy-logic method, which achieves an accuracy, sensitivity and specificity of 96.50%, 97.50% and 96.25% on the PH2 dataset; 96.16%, 91.88% and 98.26% on the ISIC 2017 dataset; and 95.91%, 91.62% and 97.37% on the ISIC 2018 dataset, respectively. Furthermore, the image is classified by the modified You Only Look Once (YOLO v3) classifier, which yields an accuracy, sensitivity and specificity of 98.16%, 95.43% and 99.50%, respectively. The second method enhances the images by removing digital artefacts and applying histogram equalisation. Thereafter, the triangular neutrosophic number (TNN) is used for segmentation of the lesion, achieving an accuracy, sensitivity and specificity of 99.00%, 97.50% and 99.38% for PH2; 98.83%, 98.48% and 99.01% for ISIC 2017; 98.56%, 98.50% and 98.58% for ISIC 2018; and 97.86%, 97.56% and 97.97% for the ISIC 2019 dataset, respectively. Furthermore, data augmentation is performed by adding artefacts and noise to the training dataset and rotating the images at angles of 65°, 135°, and 215°, increasing the training dataset from 30,946 to 92,838 images.
Additionally, a novel classifier based on inception and residual modules is trained over the augmented dataset and achieves an accuracy, sensitivity and specificity of 99.50%, 100% and 99.38% for PH2; 99.33%, 98.48% and 99.75% for ISIC 2017; 98.56%, 97.61% and 98.88% for ISIC 2018; and 98.04%, 96.67% and 98.52% for the ISIC 2019 dataset, respectively. Later in the dissertation, the proposed methods are deployed as real-time mobile applications, enabling users to diagnose a suspected lesion with ease and accuracy.
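The augmentation figures reported above can be checked with one line of arithmetic: assuming the augmented set consists of three rotated copies per original image (one per rotation angle), the totals match exactly.

```python
# Sanity check on the reported augmentation counts: three rotation
# angles applied to every original training image.
ORIGINAL = 30946
ANGLES = (65, 135, 215)   # rotation angles in degrees

augmented = ORIGINAL * len(ANGLES)
print(augmented)  # 92838, matching the figure reported in the text
```

The exact 3x factor suggests the rotated copies alone form the stated 92,838-image training set; the added-noise/artefact variants mentioned alongside them do not enter this particular count.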

    Skin Cancer Diagnosis Based on Neutrosophic Features with a Deep Neural Network.

    Recent years have seen an increase in the total number of skin cancer cases, and it is projected to grow exponentially. This paper proposes a computer-aided diagnosis system for the classification of malignant lesions, in which the acquired image is first pre-processed using novel methods. Digital artifacts such as hair follicles and blood vessels are removed, and the image is then enhanced using a novel histogram equalization method. The pre-processed image then undergoes the segmentation phase, where the suspected lesion is segmented using the neutrosophic technique. The segmentation method employs a thresholding-based approach along with a pentagonal neutrosophic structure to form a segmentation mask of the suspected skin lesion. The paper proposes a deep neural network based on Inception and residual blocks, with a softmax block after each residual block, which makes the layers wider and helps the network learn key features more quickly. The proposed classifier was trained, tested, and validated on the PH2, ISIC 2017, ISIC 2018, and ISIC 2019 datasets. The proposed segmentation model yields accuracy marks of 99.50%, 99.33%, 98.56% and 98.04% for these datasets, respectively. These datasets are augmented to form a total of 103,554 training images, which helps the classifier produce enhanced classification results. Our experimental results confirm that the proposed classifier yields accuracy scores of 99.50%, 99.33%, 98.56%, and 98.04% for PH2, ISIC 2017, 2018, and 2019, respectively, better than most pre-existing classifiers.
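The segmentation stage described above ultimately reduces to thresholding an intensity map into a binary lesion mask. The sketch below shows only that final thresholding step with a plain global-mean threshold as a stand-in; the paper's pentagonal neutrosophic structure, which supplies the actual decision values, is not reproduced here:

```python
import numpy as np

def threshold_mask(img):
    """Binary lesion mask via a simple global-mean threshold
    (an illustrative stand-in for the paper's neutrosophic thresholding)."""
    return (img > img.mean()).astype(np.uint8)

lesion = np.zeros((8, 8))
lesion[2:6, 2:6] = 1.0             # bright square standing in for a lesion
mask = threshold_mask(lesion)
print(mask.sum())                  # 16 pixels flagged as lesion
```

In the paper's pipeline the thresholded mask is what gets handed to the Inception/residual classifier, so segmentation quality directly bounds classification accuracy.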