44 research outputs found

    Learning to Segment Breast Biopsy Whole Slide Images

    We trained and applied an encoder-decoder model to semantically segment breast biopsy images into biologically meaningful tissue labels. Since conventional encoder-decoder networks cannot be applied directly to large biopsy images, and the differently sized structures in biopsies present novel challenges, we propose four modifications: (1) an input-aware encoding block to compensate for information loss, (2) a new dense connection pattern between encoder and decoder, (3) dense and sparse decoders to combine multi-level features, and (4) a multi-resolution network that fuses the results of encoder-decoders run on different resolutions. Our model outperforms a feature-based approach and conventional encoder-decoders from the literature. We use semantic segmentations produced with our model in an automated diagnosis task and obtain higher accuracies than a baseline approach that uses an SVM for feature-based segmentation, with both approaches using the same segmentation-based diagnostic features. Comment: Added more WSI images in appendix.
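    A minimal sketch of modification (4), the multi-resolution fusion, may help make the architecture concrete. It assumes PyTorch; the wrapper class, the choice of scales, and the averaging fusion are illustrative assumptions, since the abstract does not specify how the per-resolution results are combined.

```python
import torch.nn as nn
import torch.nn.functional as F

class MultiResolutionFusion(nn.Module):
    """Illustrative sketch: run the same encoder-decoder on several
    downsampled copies of a patch and fuse the per-class logits."""

    def __init__(self, encoder_decoder: nn.Module, scales=(1.0, 0.5, 0.25)):
        super().__init__()
        self.encoder_decoder = encoder_decoder  # any dense-prediction network
        self.scales = scales

    def forward(self, x):
        _, _, h, w = x.shape
        fused = 0.0
        for s in self.scales:
            xs = x if s == 1.0 else F.interpolate(
                x, scale_factor=s, mode="bilinear", align_corners=False)
            logits = self.encoder_decoder(xs)          # (N, C, h*s, w*s)
            logits = F.interpolate(logits, size=(h, w), mode="bilinear",
                                   align_corners=False)
            fused = fused + logits                     # simple averaging fusion
        return fused / len(self.scales)
```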

    PU-NET Deep Learning Architecture for Gliomas Brain Tumor Segmentation in Magnetic Resonance Images

    Automatic medical image segmentation is one of the main tasks in delineating organs and pathological structures. It is also a crucial technique in the subsequent clinical examination of brain tumors, such as planning radiotherapy or tumor resection. Various image segmentation techniques have been proposed and applied to different image types. Recently, it has been shown that deep learning approaches segment images accurately, and their implementation is usually straightforward. In this paper, we propose a novel approach, called PU-NET, for automatic brain tumor segmentation in multi-modal magnetic resonance images (MRI). We introduce an input processing block to a customized fully convolutional network derived from the U-Net architecture to handle the multi-modal inputs. We performed experiments on the 2018 Brain Tumor Segmentation (BRATS) dataset and achieved Dice scores of 90.5%, 82.7%, and 80.3% for the whole tumor, tumor core, and enhancing tumor classes, respectively. This study provides promising results compared to the deep learning methods used in this context.
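    As a rough illustration of what an input processing block for multi-modal MRI might look like, the following PyTorch sketch gives each modality its own small convolutional stem and concatenates the resulting features before the U-Net encoder. The per-modality stem design is an assumption; the abstract does not describe the block's internals.

```python
import torch
import torch.nn as nn

class MultiModalInputBlock(nn.Module):
    """Hypothetical input block: one small conv stem per MRI modality
    (e.g. T1, T1ce, T2, FLAIR), fused by concatenation for a U-Net."""

    def __init__(self, n_modalities=4, feats_per_modality=8):
        super().__init__()
        self.stems = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(1, feats_per_modality, kernel_size=3, padding=1),
                nn.BatchNorm2d(feats_per_modality),
                nn.ReLU(inplace=True),
            )
            for _ in range(n_modalities)
        )

    def forward(self, x):                      # x: (N, n_modalities, H, W)
        feats = [stem(x[:, i:i + 1]) for i, stem in enumerate(self.stems)]
        return torch.cat(feats, dim=1)         # fed into the U-Net encoder
```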

    Magnetic resonance image-based brain tumour segmentation methods: a systematic review

    Background: Image segmentation is an essential step in the analysis and subsequent characterisation of brain tumours through magnetic resonance imaging. In the literature, segmentation methods are empowered by open-access magnetic resonance imaging datasets, such as the brain tumour segmentation dataset. Moreover, with the increased use of artificial intelligence methods in medical imaging, access to larger data repositories has become vital to method development. Purpose: To determine which automated brain tumour segmentation techniques medical imaging specialists and clinicians can use to identify tumour components, compared to manual segmentation. Methods: We conducted a systematic review of 572 brain tumour segmentation studies published during 2015–2020. We reviewed segmentation techniques using T1-weighted, T2-weighted, gadolinium-enhanced T1-weighted, fluid-attenuated inversion recovery, diffusion-weighted and perfusion-weighted magnetic resonance imaging sequences. Moreover, we assessed physics- or mathematics-based methods, deep learning methods, and software-based or semi-automatic methods, as applied to magnetic resonance imaging techniques. In particular, we synthesised each method according to the magnetic resonance imaging sequences used, the study population, the technical approach (such as deep learning) and the performance score measures (such as Dice score). Statistical tests: We compared median Dice scores in segmenting the whole tumour, tumour core and enhanced tumour. Results: We found that T1-weighted, gadolinium-enhanced T1-weighted, T2-weighted and fluid-attenuated inversion recovery magnetic resonance imaging are used the most in segmentation algorithms, whereas perfusion-weighted and diffusion-weighted magnetic resonance imaging see limited use. Moreover, we found that the U-Net deep learning architecture is cited the most and has high accuracy (Dice score of 0.9) for magnetic resonance imaging-based brain tumour segmentation. Conclusion: U-Net is a promising deep learning architecture for magnetic resonance imaging-based brain tumour segmentation. The community should be encouraged to contribute open-access datasets so that training, testing and validation of deep learning algorithms can be improved, particularly for diffusion- and perfusion-weighted magnetic resonance imaging, where limited datasets are available.
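    For reference, the Dice score used throughout these comparisons measures the overlap between a predicted segmentation P and the ground truth G:

```latex
\mathrm{Dice}(P, G) = \frac{2\,|P \cap G|}{|P| + |G|}
```

    A score of 1 indicates perfect overlap and 0 indicates no overlap, so the reported median Dice score of 0.9 for U-Net corresponds to a high degree of agreement with manual segmentation.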

    Research progress on deep learning in magnetic resonance imaging–based diagnosis and treatment of prostate cancer: a review on the current status and perspectives

    Multiparametric magnetic resonance imaging (mpMRI) has emerged as a first-line screening and diagnostic tool for prostate cancer, aiding treatment selection and noninvasive radiotherapy guidance. However, manual interpretation of MRI data is challenging and time-consuming, which may impact sensitivity and specificity. With recent technological advances, artificial intelligence (AI) in the form of computer-aided diagnosis (CAD) based on MRI data has been applied to prostate cancer diagnosis and treatment. Among AI techniques, deep learning involving convolutional neural networks contributes to the detection, segmentation, scoring, grading, and prognostic evaluation of prostate cancer. CAD systems operate automatically, process data rapidly, and are accurate, incorporating multiple sequences of multiparametric MRI data of the prostate gland into the deep learning model. They have therefore become a research direction of great interest, especially in smart healthcare. This review highlights the current progress of deep learning technology in MRI-based diagnosis and treatment of prostate cancer. The key elements of deep learning-based MRI image processing in CAD systems and radiotherapy of prostate cancer are briefly described, making them understandable not only to radiologists but also to general physicians without specialized imaging interpretation training. Deep learning technology enables lesion identification, detection, and segmentation, grading and scoring of prostate cancer, and prediction of postoperative recurrence and prognostic outcomes. The diagnostic accuracy of deep learning can be improved by optimizing models and algorithms, expanding medical database resources, and combining multi-omics data with comprehensive analysis of various morphological data. Deep learning has the potential to become the key diagnostic method in prostate cancer diagnosis and treatment in the future.
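    A common way to feed multiple mpMRI sequences into a deep learning model, as described above, is to stack co-registered sequences as input channels. The following sketch is illustrative only; the sequence names and the z-score normalization are assumptions, not a method described in the review.

```python
import numpy as np

def stack_mpmri_sequences(t2w, dwi, adc):
    """Illustrative: stack co-registered mpMRI sequences (T2-weighted,
    diffusion-weighted, ADC map) into one multi-channel CNN input.
    Each array is (H, W); intensities are normalized per sequence."""
    channels = []
    for seq in (t2w, dwi, adc):
        seq = seq.astype(np.float32)
        seq = (seq - seq.mean()) / (seq.std() + 1e-8)  # per-sequence z-score
        channels.append(seq)
    return np.stack(channels, axis=0)  # (3, H, W), ready for a conv net
```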

    A survey, review, and future trends of skin lesion segmentation and classification

    The Computer-aided Diagnosis or Detection (CAD) approach for skin lesion analysis is an emerging field of research that has the potential to alleviate the burden and cost of skin cancer screening. Researchers have recently shown increasing interest in developing such CAD systems, with the intention of providing dermatologists with a user-friendly tool that reduces the challenges of manual inspection. This article provides a comprehensive literature survey and review of a total of 594 publications (356 on skin lesion segmentation and 238 on skin lesion classification) published between 2011 and 2022. These articles are analyzed and summarized in a number of different ways to contribute vital information regarding the development of CAD systems, including: relevant and essential definitions and theories; input data (dataset utilization, preprocessing, augmentation, and fixing imbalance problems); method configuration (techniques, architectures, module frameworks, and losses); training tactics (hyperparameter settings); and evaluation criteria. We also investigate a variety of performance-enhancing approaches, including ensembling and post-processing, and discuss these dimensions to reveal current trends based on utilization frequency. In addition, we highlight the primary difficulties of evaluating skin lesion segmentation and classification systems using minimal datasets, as well as potential solutions to these difficulties. Findings, recommendations, and trends are disclosed to inform future research on developing an automated and robust CAD system for skin lesion analysis.
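    To make the ensembling and post-processing approaches mentioned above concrete, here is a minimal sketch that averages lesion probability maps from several models, thresholds them, and keeps the largest connected component. The threshold value and the largest-component heuristic are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def ensemble_and_postprocess(prob_maps, threshold=0.5):
    """Illustrative: average per-model lesion probability maps, threshold,
    then keep only the largest connected component as post-processing."""
    mean_prob = np.mean(prob_maps, axis=0)        # (H, W) ensemble average
    mask = mean_prob > threshold
    labeled, n = ndimage.label(mask)              # connected components
    if n == 0:
        return mask                               # no lesion detected
    sizes = ndimage.sum(mask, labeled, range(1, n + 1))
    return labeled == (np.argmax(sizes) + 1)      # largest lesion region
```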

    Effective Approaches for Improving the Efficiency of Deep Convolutional Neural Networks for Image Classification

    This thesis presents two methods for reducing the number of parameters and floating-point computations in existing DCNN architectures used for image classification. The first method is a modification of the first layers of a DCNN that splits the channels of an image encoded in the CIE Lab color space into two separate paths, one for the achromatic channel and another for the remaining chromatic channels. We modified an Inception V3 architecture to include one branch specific to achromatic data (the L channel) and another branch specific to chromatic data (the AB channels). This modification takes advantage of the decoupling of chromatic and achromatic information. Moreover, splitting branches reduces the number of trainable parameters and the computational load by up to 50% of the original figures in the modified layers. We achieved a state-of-the-art classification accuracy of 99.48% on the Plant Village dataset and also found improved image classification reliability when the input images contain noise. In DCNNs, the parameter count in pointwise convolutions grows quickly due to the multiplication of the filters and input channels from the preceding layer. To handle this growth, the second optimization method makes pointwise convolutions parameter-efficient via parallel branching. Each branch contains a group of filters and processes a fraction of the input channels. To avoid degrading the learning capability of DCNNs, we propose interleaving the outputs of the filters from separate branches at intermediate layers of successive pointwise convolutions. We tested our optimization on EfficientNet-B0 as a baseline architecture and ran classification tests on the CIFAR-10, Colorectal Cancer Histology, and Malaria datasets. For each dataset, our optimization saves 76%, 89%, and 91% of the number of trainable parameters of EfficientNet-B0, respectively, while maintaining its test classification accuracy.
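    A minimal sketch of the second method's core idea may help: a pointwise convolution split into parallel branches can be expressed as a grouped 1x1 convolution, and interleaving the branch outputs lets successive grouped layers mix information across branches. The exact interleaving pattern below is an assumption based on the abstract's description.

```python
import torch.nn as nn

class GroupedPointwiseInterleave(nn.Module):
    """Sketch: a pointwise (1x1) convolution split into parallel branches,
    each seeing a fraction of the input channels, followed by channel
    interleaving so the next grouped layer mixes across branches."""

    def __init__(self, in_ch, out_ch, branches=4):
        super().__init__()
        assert in_ch % branches == 0 and out_ch % branches == 0
        self.branches = branches
        # grouped 1x1 conv == parallel per-branch pointwise convolutions
        self.pw = nn.Conv2d(in_ch, out_ch, kernel_size=1, groups=branches)

    def forward(self, x):
        y = self.pw(x)                            # (N, out_ch, H, W)
        n, c, h, w = y.shape
        g = self.branches
        # interleave: reshape to (N, g, c//g, H, W), swap axes, flatten
        return y.view(n, g, c // g, h, w).transpose(1, 2).reshape(n, c, h, w)
```

    As a rough example of the savings, with in_ch = out_ch = 256 and 4 branches the grouped convolution uses 256 * 256 / 4 = 16,384 weights instead of 65,536, a 75% reduction in that layer, which is in line with the parameter savings reported above.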

    Human pose and action recognition

    This thesis focuses on the detection of persons and pose recognition using neural networks. The goal is to detect human body poses in a visual scene with multiple persons and to use this information to recognize human activity. This is achieved by first detecting persons in a scene and then estimating their body joints in order to infer articulated poses. The work developed in this thesis explored neural networks and deep learning methods. Deep learning employs computational models composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have greatly improved the state-of-the-art in many domains, such as speech recognition and visual object detection and classification. Deep learning discovers intricate structure in data by using the backpropagation algorithm to indicate how a machine should change the internal parameters that compute the representation in each layer from the representation provided by the previous one. Person detection, in general, is a difficult task due to the large variability of appearance caused by factors such as scale, viewpoint and occlusion. An object detection framework based on multi-stage convolutional features for pedestrian detection is proposed in this thesis. This framework extends the Fast R-CNN framework by combining several convolutional features from different stages of a CNN (Convolutional Neural Network) to improve the detector's accuracy. This provides high-quality detections of persons in a visual scene, which are then used as input, in conjunction with a human pose estimation model, to estimate the body joint locations of multiple persons in an image. Human pose estimation is performed by a deep convolutional neural network composed of a series of residual auto-encoders. These produce multiple predictions, which are later combined to provide a heatmap prediction of human body joints. In this network topology, features are processed across all scales, capturing the various spatial relationships associated with the body. Repeated bottom-up and top-down processing with intermediate supervision for each auto-encoder network is applied, resulting in very accurate 2D heatmaps of body joint predictions. The methods presented in this thesis were benchmarked against other top-performing methods on popular datasets for pedestrian detection and human pose estimation, achieving good results compared with other state-of-the-art algorithms.
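    As an illustration of the final decoding step described above, the following sketch converts per-joint 2D heatmaps into joint coordinates and confidence scores via a spatial argmax. This is a standard decoding scheme, not necessarily the exact one used in the thesis.

```python
import torch

def heatmaps_to_joints(heatmaps):
    """Illustrative: decode per-joint 2D heatmaps (N, J, H, W) into
    (x, y) pixel coordinates and confidence scores via a spatial argmax."""
    n, j, h, w = heatmaps.shape
    flat = heatmaps.view(n, j, -1)
    scores, idx = flat.max(dim=2)                 # peak value per joint
    ys = torch.div(idx, w, rounding_mode="floor").float()
    xs = (idx % w).float()
    joints = torch.stack([xs, ys], dim=2)         # (N, J, 2) coordinates
    return joints, scores
```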