254 research outputs found

    The Effectiveness of Data Augmentation for Detection of Gastrointestinal Diseases from Endoscopical Images

    Full text link
    The lack, due to privacy concerns, of large public databases of medical pathologies is a well-known and major problem, substantially hindering the application of deep learning techniques in this field. In this article, we investigate the possibility of compensating for the scarcity of data by means of data augmentation techniques, working on the recent Kvasir dataset of endoscopic images of gastrointestinal diseases. The dataset comprises 4,000 colour images labelled and verified by medical endoscopists, covering a few common pathologies at different anatomical landmarks: Z-line, pylorus, and cecum. We show that applying data augmentation techniques yields appreciable improvements in classification over previous approaches, in terms of both precision and recall.
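    Per-class precision and recall for a multi-class classifier of this kind can be reported in a few lines. The snippet below is a generic illustration only (the labels and predictions are made up, not taken from the paper), using scikit-learn's classification_report:

        from sklearn.metrics import classification_report

        # Hypothetical ground-truth and predicted landmark labels,
        # purely for illustration.
        y_true = ["z-line", "pylorus", "cecum", "pylorus", "z-line", "cecum"]
        y_pred = ["z-line", "pylorus", "cecum", "z-line", "z-line", "cecum"]

        # Prints per-class precision, recall, and F1, plus averages.
        print(classification_report(y_true, y_pred, digits=3))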

    Multiclassification of license plate based on deep convolution neural networks

    Get PDF
    The classification of license plates poses challenges such as the varying sizes of plate numbers, the plates' backgrounds, and the limited size of plate datasets. In this paper, a multiclass classification model is established using a deep convolutional neural network (CNN) to classify license plates from three countries (Armenia, Belarus, Hungary), with a dataset of 600 images, 200 per class (160 for training and 40 for validation). Because the dataset is small, it is preprocessed using pixel normalization and image data augmentation techniques (rotation, horizontal flip, zoom range) to increase the number of samples. The augmented images are then fed into the convolutional model, which consists of four blocks of convolutional layers. To calculate and optimize the efficiency of the classification model, categorical cross-entropy and the Adam optimizer are used with a learning rate of 0.0001. The model achieved accuracies of 99.17% and 97.50% on the training and validation sets, respectively, with an overall classification accuracy of 96.66%. Training lasted 12 minutes. Anaconda Python 3.7 and Keras with a TensorFlow backend were used.
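    A minimal sketch of the augmentation setup described above, assuming the Keras ImageDataGenerator API (the specific rotation and zoom values, directory path, and image size are illustrative guesses, not taken from the paper):

        import tensorflow as tf
        from tensorflow.keras.preprocessing.image import ImageDataGenerator

        # Pixel normalization plus the three augmentations named in the abstract.
        # The exact ranges here are assumptions for illustration.
        datagen = ImageDataGenerator(
            rescale=1.0 / 255,      # pixel normalization to [0, 1]
            rotation_range=15,      # random rotation (degrees)
            horizontal_flip=True,   # random horizontal flip
            zoom_range=0.2,         # random zoom
        )

        # Stream augmented batches from a directory with one subfolder per class.
        train_gen = datagen.flow_from_directory(
            "plates/train", target_size=(128, 128),
            batch_size=32, class_mode="categorical",
        )

        # The training configuration from the abstract would then be:
        # model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        #               loss="categorical_crossentropy", metrics=["accuracy"])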

    Towards real-world clinical colonoscopy deep learning models for video-based bowel preparation and generalisable polyp segmentation

    Get PDF
    Colorectal cancer is the most prevalent cancer of the digestive system. Early screening and removal of precancerous growths in the colon decrease the mortality rate. The gold-standard screening method for the colon is colonoscopy, which is conducted by a medical expert (i.e., a colonoscopist). Nevertheless, due to human biases, fatigue, and the colonoscopist's level of experience, the colorectal cancer miss rate is negatively affected. Artificial intelligence (AI) methods hold immense promise not just for automating colonoscopy tasks but also for enhancing the performance of colonoscopy screening in general. The recent development of computationally powerful GPUs has enabled a computationally demanding AI method (i.e., deep learning) to be utilised in various medical applications. However, given the gap between clinical practice and the deep learning models proposed in the literature, the actual effectiveness of such methods is questionable. Hence, this thesis highlights the gaps that arise from the separation between the theoretical and practical aspects of deep learning methods applied to colonoscopy. The aim is to evaluate the current state of deep learning models applied in colonoscopy from a clinical angle, and accordingly to propose better evaluation strategies and deep learning models. This aim is translated into three distinct objectives. The first objective is to develop a systematic evaluation method to assess deep learning models from a clinical perspective. The second objective is to develop a novel deep learning architecture that leverages spatial information within colonoscopy videos to enhance the effectiveness of deep learning models in real clinical environments. The third objective is to enhance the generalisability of deep learning models on unseen test images by developing a novel deep learning framework. To translate these objectives into practice, two critical colonoscopy tasks, namely automatic bowel preparation assessment and polyp segmentation, are addressed. In both tasks, subtle overestimations are found in the literature; these are discussed theoretically in the thesis and demonstrated empirically. The overestimations are induced by improper validation sets that do not represent the real-world clinical environment: arbitrarily dividing colonoscopy datasets for deep learning evaluation can produce similar distributions across splits and hence unrealistically optimistic results. Accordingly, these factors are considered in the thesis to avoid such subtle overestimation. For the automatic bowel preparation task, colonoscopy videos that closely resemble clinical settings are taken as input, which shapes both the design of the proposed model and the evaluation experiments. The proposed architecture utilises the temporal and spatial information within colonoscopy videos using a Gated Recurrent Unit (GRU) and a proposed Multiplexer unit, respectively. For the polyp segmentation task, the efficiency of current deep learning models is tested in terms of their generalisation capabilities using unseen test sets from different medical centres. The proposed framework consists of two connected models: the first gradually transforms the textures of input images and arbitrarily changes their colours, while the second is a segmentation model that outlines polyp regions. Exposing the segmentation model to such transformed images endows it with texture/colour-invariant properties and hence enhances its generalisability. In this thesis, rigorous experiments are conducted to evaluate the proposed models against state-of-the-art models. The results indicate that the proposed models outperform the state-of-the-art under different settings.
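    The abstract does not spell out the architecture, but the general pattern of a per-frame CNN encoder feeding a GRU over time can be sketched as below (PyTorch; the encoder, dimensions, and four-class output are placeholders of our own, and the thesis's Multiplexer unit is not reproduced):

        import torch
        import torch.nn as nn

        class VideoGRUClassifier(nn.Module):
            """Per-frame CNN features aggregated over time by a GRU.
            A generic spatio-temporal sketch, not the thesis's model."""

            def __init__(self, feat_dim=256, hidden_dim=128, num_classes=4):
                super().__init__()
                # Small stand-in encoder; a pretrained backbone would be typical.
                self.encoder = nn.Sequential(
                    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(64, feat_dim),
                )
                self.gru = nn.GRU(feat_dim, hidden_dim, batch_first=True)
                self.head = nn.Linear(hidden_dim, num_classes)

            def forward(self, video):
                # video: (batch, time, channels, height, width)
                b, t, c, h, w = video.shape
                feats = self.encoder(video.reshape(b * t, c, h, w)).reshape(b, t, -1)
                _, last_hidden = self.gru(feats)          # (1, b, hidden_dim)
                return self.head(last_hidden.squeeze(0))  # class logits per video

        logits = VideoGRUClassifier()(torch.randn(2, 8, 3, 64, 64))  # shape (2, 4)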

    Surface loss for medical image segmentation

    Get PDF
    Recent decades have witnessed an unprecedented expansion of medical data in various large-scale and complex systems. While much success has been achieved on many complex medical problems, some challenges remain. Class imbalance is one of the common problems in medical image segmentation. It occurs mostly when there is a severely unequal class distribution, for instance, when the target foreground region is several orders of magnitude smaller than the background region. In such problems, the loss functions typically used for convolutional neural network (CNN) segmentation fail to deliver good performance. Widely used losses, e.g., Dice or cross-entropy, are based on regional terms. They assume that all classes are equally distributed, and thus tend to favor the majority class and misclassify the target class. To address this issue, the main objective of this work is to build a boundary loss, a distance-based measure on the space of contours rather than regions. We argue that a boundary loss can mitigate the problems of regional losses by introducing complementary distance-based information. Our loss is inspired by discrete (graph-based) optimization techniques for computing gradient flows of curve evolution. Following an integral approach for computing boundary variations, we express a non-symmetric L2 distance on the space of shapes as a regional integral, which completely avoids local differential computations. Our boundary loss is a sum of linear functions of the regional softmax probability outputs of the network. Therefore, it can easily be combined with standard regional losses and implemented with any existing deep network architecture for N-dimensional (N-D) segmentation. Experiments were carried out on three benchmark datasets corresponding to increasingly unbalanced segmentation problems: multimodal brain tumor segmentation (BRATS17), ischemic stroke lesion segmentation (ISLES), and white matter hyperintensities (WMH). Used in conjunction with the region-based generalized Dice loss (GDL), our boundary loss improves performance significantly compared to GDL alone, reaching up to 8% improvement in Dice score and 10% improvement in Hausdorff score. It also yields a more stable learning process.
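    The loss described above amounts to multiplying the network's softmax outputs by a precomputed signed distance map of the ground truth. A minimal sketch of that idea (NumPy/SciPy plus PyTorch; the empty-mask corner case is glossed over, and the helper names are our own):

        import numpy as np
        import torch
        from scipy.ndimage import distance_transform_edt

        def signed_distance_map(gt_mask):
            # gt_mask: binary (H, W) array; negative inside the object,
            # positive outside. Assumes the mask is non-empty.
            outside = distance_transform_edt(1 - gt_mask)
            inside = distance_transform_edt(gt_mask)
            return outside - inside

        def boundary_loss(fg_probs, dist_maps):
            # fg_probs: softmax foreground probabilities, shape (B, H, W).
            # dist_maps: precomputed signed distance maps, same shape.
            # Linear in the probabilities, so it combines easily with
            # regional losses such as the generalized Dice loss.
            return (fg_probs * dist_maps).mean()

        gt = np.zeros((64, 64), dtype=np.uint8)
        gt[20:40, 20:40] = 1
        dmap = torch.from_numpy(signed_distance_map(gt)).float().unsqueeze(0)
        probs = torch.rand(1, 64, 64)
        print(boundary_loss(probs, dmap))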

    Text-detection and -recognition from natural images

    Get PDF
    Text detection and recognition from images has numerous practical applications in document analysis, such as assistance for visually impaired people; recognition of vehicle license plates; evaluation of articles containing tables, street signs, maps, and diagrams; keyword-based image search; document retrieval; recognition of parts in industrial automation; content-based extraction; object recognition; address block location; and text-based video indexing. This research exploited the advantages of artificial intelligence (AI) to detect and recognise text from natural images, using machine learning and deep learning to accomplish this task.
    We conducted an in-depth literature review of current detection and recognition methods to identify the existing challenges: differences in text alignment, style, size, and orientation, combined with low image contrast and complex backgrounds, make automatic text extraction a considerably challenging task. As a result, state-of-the-art approaches achieve low detection rates (often less than 80%) and recognition rates (often less than 60%), which has motivated the development of new approaches. The aim of this study was to develop a robust method for detecting and recognising text in natural images with high precision and recall, able to detect all the text in scene images despite the specific features of the text pattern. Furthermore, we aimed to address the two main problems of detecting and recognising arbitrarily shaped text (horizontal, multi-oriented, and curved) in low-resolution scenes, at various scales and sizes.
    We propose a methodology that handles text detection by using a novel feature combination and selection scheme for classifying text/non-text regions. Text-region candidates were extracted from grey-scale images using the MSER technique. A machine learning-based method was then applied to refine and validate the initial detections. The effectiveness of features based on the aspect ratio and the GLCM, LBP, and HOG descriptors was investigated. MLP, SVM, and RF text-region classifiers were trained on selections of these features and their combinations. The publicly available ICDAR 2003 and ICDAR 2011 datasets were used to evaluate the proposed method, which achieved state-of-the-art performance on both, with significant improvements in precision, recall, and F-measure: 81% on ICDAR 2003 and 84% on ICDAR 2011. The results showed that a suitable feature combination and selection approach can significantly increase the accuracy of the algorithms.
    A new dataset is proposed to fill the gap in character-level annotation and the availability of multi-oriented and curved text. It was created particularly for deep learning methods, which require a large and varied range of training data, and includes 2,100 images annotated at the character and word levels, yielding 38,500 samples of English characters and 12,500 words. Furthermore, an augmentation tool is proposed to support the dataset: the lack of such a tool for object detection motivated this one, which updates the positions of bounding boxes after transformations are applied to the images. This technique increases the number of samples in the dataset and reduces annotation time, since no re-annotation is required.
    The final part of the thesis presents a novel approach to text spotting: an end-to-end character detection and recognition framework based on an improved SSD convolutional neural network, in which layers are added to the SSD network and the aspect ratio of characters is taken into account, since it differs from that of other objects. Compared with the other methods considered, the proposed method detects and recognises characters by training the end-to-end model in full. Its performance was best on the proposed dataset, at 90.34; the F-measures on ICDAR 2015, ICDAR 2013, and SVT were 84.5, 91.9, and 54.8, respectively, with the second-best accuracy on ICDAR 2013. The proposed method can spot arbitrarily shaped (horizontal, oriented, and curved) scene text.
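    The key idea of the augmentation tool, transforming bounding boxes together with the image so that no re-annotation is needed, can be illustrated with a minimal sketch (plain Python; the horizontal flip is just one of the transformations mentioned, and the function name is our own):

        def flip_boxes_horizontal(boxes, img_width):
            """Mirror (x_min, y_min, x_max, y_max) boxes for a horizontally
            flipped image, so annotations stay aligned without re-labelling."""
            return [(img_width - x_max, y_min, img_width - x_min, y_max)
                    for (x_min, y_min, x_max, y_max) in boxes]

        # In a 100-px-wide image, a box at x in [10, 30] maps to x in [70, 90].
        print(flip_boxes_horizontal([(10, 5, 30, 25)], img_width=100))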

    Deep Learning for Semi-Automated Brain Claustrum Segmentation on Magnetic Resonance (MR) Images

    Get PDF
    Title from PDF of title page, viewed June 18, 2018. Thesis advisor: Yugyung Lee. Vita. Includes bibliographical references (pages 73-78). Thesis (M.S.)--School of Computing and Engineering, University of Missouri--Kansas City, 2018.
    In recent years, Deep Learning (DL) has shown promising results in AI tasks such as computer vision and speech recognition. Specifically, DL has demonstrated state-of-the-art performance in computer vision tasks including image classification, segmentation, localization, and annotation. Convolutional Neural Network (CNN) models in DL have been applied to prevention, detection, and diagnosis in predictive medicine, where image segmentation plays a significant role. However, DL-based automatic segmentation faces major challenges arising from the nature of medical images: heterogeneous modalities and formats, very limited labeled training data, and high class imbalance in the labeled data. Automatic segmentation is especially challenging for Magnetic Resonance Imaging (MRI); in practice, manually segmenting or annotating such MRI datasets is a time-consuming procedure that requires trained biomedical experts. The need for automated segmentation or annotation is what motivates our work. In this thesis, we propose a semi-automated approach that aims to segment the claustrum in brain MR images. The claustrum is an information hub of the human brain, and significant patterns can be found from its segmentations. We applied a 2-dimensional CNN model called U-net to segment a human brain dataset comprising 30 manually annotated subjects provided by the Department of Psychiatry at the University of Missouri-Kansas City. Our approach consisted of the following steps: (1) preprocessing, including converting the data into Digital Imaging and Communications in Medicine (DICOM) format, re-sampling, selecting the claustrum slices, and applying ROI selection; (2) building the claustrum model; (3) automatic segmentation; and (4) evaluation and validation. For model validation, we used cross-validation with n = 5. We used the Dice coefficient to evaluate the results and achieved a Dice score of approximately 70%. A domain expert also evaluated the results.
    Contents: Introduction -- Background -- Related work -- Proposed solutions -- Proposed model application -- Conclusion and future work
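    Since the thesis evaluates segmentation with the Dice coefficient, a minimal NumPy sketch of that metric (our own helper, not code from the thesis) may be useful:

        import numpy as np

        def dice_coefficient(pred, target, eps=1e-7):
            # pred, target: binary arrays of the same shape (1 = claustrum voxel).
            # Dice = 2|P & T| / (|P| + |T|); eps guards against empty masks.
            intersection = np.logical_and(pred, target).sum()
            return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

        p = np.array([[1, 1, 0], [0, 1, 0]])
        t = np.array([[1, 0, 0], [0, 1, 1]])
        print(round(float(dice_coefficient(p, t)), 3))  # 0.667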

    Somatostatin and intestinal inflammation

    Get PDF