45 research outputs found

    Multichannel Residual Cues for Fine-Grained Classification in Wireless Capsule Endoscopy

    Get PDF
    Early diagnosis of gastrointestinal pathologies leads to timely medical intervention and prevents disease development. Wireless Capsule Endoscopy (WCE) is used as a non-invasive alternative for gastrointestinal examination. WCE can capture images despite the structural complexity presented by human anatomy and can help in detecting pathologies. However, despite recent progress in fine-grained pathology classification and detection, limited works focus on generalization. We propose a multi-channel encoder-decoder network for learning a generalizable fine-grained pathology classifier. Specifically, we propose to use structural residual cues to explicitly impose the network to learn pathology traces. While residuals are extracted using well-established 2D wavelet decomposition, we also propose to use colour channels to learn discriminative cues in WCE images (like red color in bleeding). With less than 40% data (fewer than 2500 labels) used for training, we demonstrate the effectiveness of our approach in classifying different pathologies on two different WCE datasets (different capsule modalities). With a comprehensive benchmark for WCE abnormality and multi-class classification, we illustrate the generalizability of the proposed approach on both datasets, where our results perform better than the state-of-the-art with much fewer labels in abnormality sensitivity on several of nine different pathologies and establish a new benchmark with specificity >97% across classes.publishedVersio

    Vision-based retargeting for endoscopic navigation

    Get PDF
    Endoscopy is a standard procedure for visualising the human gastrointestinal tract. With the advances in biophotonics, imaging techniques such as narrow band imaging, confocal laser endomicroscopy, and optical coherence tomography can be combined with normal endoscopy for assisting the early diagnosis of diseases, such as cancer. In the past decade, optical biopsy has emerged to be an effective tool for tissue analysis, allowing in vivo and in situ assessment of pathological sites with real-time feature-enhanced microscopic images. However, the non-invasive nature of optical biopsy leads to an intra-examination retargeting problem, which is associated with the difficulty of re-localising a biopsied site consistently throughout the whole examination. In addition to intra-examination retargeting, retargeting of a pathological site is even more challenging across examinations, due to tissue deformation and changing tissue morphologies and appearances. The purpose of this thesis is to address both the intra- and inter-examination retargeting problems associated with optical biopsy. We propose a novel vision-based framework for intra-examination retargeting. The proposed framework is based on combining visual tracking and detection with online learning of the appearance of the biopsied site. Furthermore, a novel cascaded detection approach based on random forests and structured support vector machines is developed to achieve efficient retargeting. To cater for reliable inter-examination retargeting, the solution provided in this thesis is achieved by solving an image retrieval problem, for which an online scene association approach is proposed to summarise an endoscopic video collected in the first examination into distinctive scenes. A hashing-based approach is then used to learn the intrinsic representations of these scenes, such that retargeting can be achieved in subsequent examinations by retrieving the relevant images using the learnt representations. For performance evaluation of the proposed frameworks, extensive phantom, ex vivo and in vivo experiments have been conducted, with results demonstrating the robustness and potential clinical values of the methods proposed.Open Acces

    Artificial Intelligence Techniques in Medical Imaging: A Systematic Review

    Get PDF
    This scientific review presents a comprehensive overview of medical imaging modalities and their diverse applications in artificial intelligence (AI)-based disease classification and segmentation. The paper begins by explaining the fundamental concepts of AI, machine learning (ML), and deep learning (DL). It provides a summary of their different types to establish a solid foundation for the subsequent analysis. The prmary focus of this study is to conduct a systematic review of research articles that examine disease classification and segmentation in different anatomical regions using AI methodologies. The analysis includes a thorough examination of the results reported in each article, extracting important insights and identifying emerging trends. Moreover, the paper critically discusses the challenges encountered during these studies, including issues related to data availability and quality, model generalization, and interpretability. The aim is to provide guidance for optimizing technique selection. The analysis highlights the prominence of hybrid approaches, which seamlessly integrate ML and DL techniques, in achieving effective and relevant results across various disease types. The promising potential of these hybrid models opens up new opportunities for future research in the field of medical diagnosis. Additionally, addressing the challenges posed by the limited availability of annotated medical images through the incorporation of medical image synthesis and transfer learning techniques is identified as a crucial focus for future research efforts

    LVM-Med: Learning Large-Scale Self-Supervised Vision Models for Medical Imaging via Second-order Graph Matching

    Full text link
    Obtaining large pre-trained models that can be fine-tuned to new tasks with limited annotated samples has remained an open challenge for medical imaging data. While pre-trained deep networks on ImageNet and vision-language foundation models trained on web-scale data are prevailing approaches, their effectiveness on medical tasks is limited due to the significant domain shift between natural and medical images. To bridge this gap, we introduce LVM-Med, the first family of deep networks trained on large-scale medical datasets. We have collected approximately 1.3 million medical images from 55 publicly available datasets, covering a large number of organs and modalities such as CT, MRI, X-ray, and Ultrasound. We benchmark several state-of-the-art self-supervised algorithms on this dataset and propose a novel self-supervised contrastive learning algorithm using a graph-matching formulation. The proposed approach makes three contributions: (i) it integrates prior pair-wise image similarity metrics based on local and global information; (ii) it captures the structural constraints of feature embeddings through a loss function constructed via a combinatorial graph-matching objective; and (iii) it can be trained efficiently end-to-end using modern gradient-estimation techniques for black-box solvers. We thoroughly evaluate the proposed LVM-Med on 15 downstream medical tasks ranging from segmentation and classification to object detection, and both for the in and out-of-distribution settings. LVM-Med empirically outperforms a number of state-of-the-art supervised, self-supervised, and foundation models. For challenging tasks such as Brain Tumor Classification or Diabetic Retinopathy Grading, LVM-Med improves previous vision-language models trained on 1 billion masks by 6-7% while using only a ResNet-50.Comment: Update Appendi

    Convolutional Neural Network in Pattern Recognition

    Get PDF
    Since convolutional neural network (CNN) was first implemented by Yann LeCun et al. in 1989, CNN and its variants have been widely implemented to numerous topics of pattern recognition, and have been considered as the most crucial techniques in the field of artificial intelligence and computer vision. This dissertation not only demonstrates the implementation aspect of CNN, but also lays emphasis on the methodology of neural network (NN) based classifier. As known to many, one general pipeline of NN-based classifier can be recognized as three stages: pre-processing, inference by models, and post-processing. To demonstrate the importance of pre-processing techniques, this dissertation presents how to model actual problems in medical pattern recognition and image processing by introducing conceptual abstraction and fuzzification. In particular, a transformer on the basis of self-attention mechanism, namely beat-rhythm transformer, greatly benefits from correct R-peak detection results and conceptual fuzzification. Recently proposed self-attention mechanism has been proven to be the top performer in the fields of computer vision and natural language processing. In spite of the pleasant accuracy and precision it has gained, it usually consumes huge computational resources to perform self-attention. Therefore, realtime global attention network is proposed to make a better trade-off between efficiency and performance for the task of image segmentation. To illustrate more on the stage of inference, we also propose models to detect polyps via Faster R-CNN - one of the most popular CNN-based 2D detectors, as well as a 3D object detection pipeline for regressing 3D bounding boxes from LiDAR points and stereo image pairs powered by CNN. The goal for post-processing stage is to refine artifacts inferred by models. For the semantic segmentation task, the dilated continuous random field is proposed to be better fitted to CNN-based models than the widely implemented fully-connected continuous random field. Proposed approaches can be further integrated into a reinforcement learning architecture for robotics

    Visual and Camera Sensors

    Get PDF
    This book includes 13 papers published in Special Issue ("Visual and Camera Sensors") of the journal Sensors. The goal of this Special Issue was to invite high-quality, state-of-the-art research papers dealing with challenging issues in visual and camera sensors

    Machine Learning/Deep Learning in Medical Image Processing

    Get PDF
    Many recent studies on medical image processing have involved the use of machine learning (ML) and deep learning (DL). This special issue, ā€œMachine Learning/Deep Learning in Medical Image Processingā€, has been launched to provide an opportunity for researchers in the area of medical image processing to highlight recent developments made in their fields with ML/DL. Seven excellent papers that cover a wide variety of medical/clinical aspects are selected in this special issue
    corecore