139 research outputs found

    Framework for the detection and classification of colorectal polyps

    No full text
    In this thesis we propose a framework for the detection and classification of colorectal polyps to assist endoscopists in bowel cancer screening. Such a system will help not only reduce the miss rate of possibly malignant polyps during screening but also reduce the number of unnecessary polypectomies in which histopathologic analysis could be spared. Our polyp detection scheme is based on a cascade filter that pre-processes the incoming video frames, selects a group of candidate polyp regions, and then algorithmically isolates the most probable polyps based on their geometry. We tested this system on a number of endoscopic and capsule endoscopy videos collected with the help of our clinical collaborators. Furthermore, we developed and tested a classification system for distinguishing cancerous colorectal polyps from non-cancerous ones. By analysing the surface vasculature of high-magnification polyp images from two endoscopic platforms, we extracted a number of features based primarily on vessel contrast, orientation, and colour. The feature space was then filtered so as to leave only the most relevant subset, which was subsequently used to train our classifier. In addition, we examined the scenario of splitting the polyp surface into patches and including only the most feature-rich areas in our classifier instead of the surface as a whole. The stability of our feature space relative to patch size was also examined to ensure reliable and robust classification. We also devised a scale selection strategy to minimise the effect of inconsistencies in magnification and geometric polyp size between samples. Lastly, several techniques were employed to ensure that our results will generalise well in real-world practice. We believe this to be a solid step towards a toolbox designed to aid endoscopists not only in the detection but also in the optical biopsy of colorectal polyps during in vivo colonoscopy.
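    The geometric isolation of candidate regions could, for instance, score how blob-like each region is. A minimal numpy sketch of one such criterion (the roundness measure and the synthetic shapes below are illustrative assumptions, not the thesis's actual method):

```python
import numpy as np

def roundness(mask):
    """Minor-to-major axis ratio from the covariance of pixel coordinates:
    close to 1 for blob-like (polyp-like) regions, near 0 for elongated ones."""
    ys, xs = np.nonzero(mask)
    cov = np.cov(np.stack([xs, ys]))
    evals = np.linalg.eigvalsh(cov)   # ascending eigenvalues
    return evals[0] / evals[1]

# A round candidate versus an elongated, fold-like candidate.
yy, xx = np.mgrid[:40, :40]
blob = (xx - 20) ** 2 + (yy - 20) ** 2 < 100   # disc of radius 10
ridge = np.zeros((40, 40), bool)
ridge[18:22, 2:38] = True                      # thin elongated strip
r_blob, r_ridge = roundness(blob), roundness(ridge)
```

    A threshold on such a score would retain the disc-like candidate and discard the fold-like one.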

    Automatic Segmentation and Classification of Red and White Blood cells in Thin Blood Smear Slides

    Get PDF
    In this work we develop a system for the automatic detection and classification of cytological images, which plays an increasingly important role in medical diagnosis. A primary aim of this work is the accurate segmentation of cytological images of blood smears and subsequent feature extraction, along with studying related classification problems such as the identification and counting of peripheral blood smear particles and the classification of white blood cells into five types. Our proposed approach benefits from powerful image processing techniques to perform a complete blood count (CBC) without human intervention. The general framework of this blood smear analysis research is as follows. Firstly, a digital blood smear image is de-noised using an optimized Bayesian non-local means filter to build a dependable cell counting system that may be used under different image capture conditions. Then an edge-preservation technique with a Kuwahara filter is used to recover degraded and blurred white blood cell boundaries in blood smear images while reducing the residual negative effect of noise. After denoising and edge enhancement, the next step is binarization using a combination of the Otsu and Niblack methods to separate the cells from the stained background. Cell separation and counting are achieved by granulometry, advanced active contours without edges, and morphological operators with the watershed algorithm. Following this is the recognition of different types of white blood cells (WBCs), and also red blood cell (RBC) segmentation. The next step uses three main types of features, namely shape, intensity, and texture-invariant features, in combination with a variety of classifiers. The following features are used in this work: intensity histogram features, invariant moments, relative area, co-occurrence and run-length matrices, dual-tree complex wavelet transform features, and Haralick and Tamura features.
Next, different statistical approaches involving correlation, distribution, and redundancy are used to measure the dependency between features and to select feature variables for white blood cell classification. A global sensitivity analysis with random sampling-high dimensional model representation (RS-HDMR), which can deal with independent and dependent input feature variables, is used to assess the dominant discriminatory power and reliability of each feature, leading to an efficient feature selection. These feature selection results are compared in experiments with the branch and bound method and with sequential forward selection (SFS), respectively. This work examines support vector machines (SVM) and convolutional neural networks (LeNet5) in connection with white blood cell classification. Finally, the white blood cell classification system is validated in experiments conducted on cytological images of normal and poor-quality blood smears. These experimental results are also assessed against ground truth obtained manually from medical experts.
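The global Otsu step of the binarization can be sketched in a few lines of numpy. This is a hedged illustration of the standard Otsu criterion on a synthetic smear (the thesis combines it with Niblack's local method, which is omitted here, and the grey levels below are invented):

```python
import numpy as np

def otsu_threshold(img):
    """Return the grey level maximizing the between-class variance."""
    hist = np.bincount(img.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * prob[:t]).sum() / w0
        mu1 = (np.arange(t, 256) * prob[t:]).sum() / w1
        var = w0 * w1 * (mu0 - mu1) ** 2   # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t

# Synthetic "smear": dark cells (~60) on a bright stained background (~200).
rng = np.random.default_rng(0)
img = rng.normal(200, 10, (64, 64))
img[16:48, 16:48] = rng.normal(60, 10, (32, 32))
img = np.clip(img, 0, 255).astype(np.uint8)

t = otsu_threshold(img)
mask = img < t   # foreground: the dark cell region
```

The threshold lands between the two intensity modes, separating cells from the stained background before the watershed-based separation.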

    Automated image analysis for petrographic image assessments

    Get PDF
    In this thesis, the algorithms developed for an automated image analysis toolkit called PetrograFX for petrographic image assessments, particularly thin section images, are presented. These algorithms perform two main functions: porosity determination and quartz grain measurements. For porosity determination, the pore space is segmented using a seeded region growing scheme in color space, where the seeds are generated automatically based on the absolute R - B differential image. The porosity is then derived by pixel-counting over the identified pore space regions. For quartz grain measurements, adaptive thresholding is applied to make the segmentation of the quartz grains robust to color variations across the image. Median filtering and blob analysis are used to remove lines of fluid inclusions, which appear as black speckles and spots on the quartz grains, before the subsequent measurement operations are performed. The distance transformation and watershed transformation are then performed to separate connected objects. A modified watershed transformation is developed to eliminate false watersheds based on the physical nature of quartz grains. Finally, the grains are characterized in terms of the nominal sectional diameter (NSD), the NSD distribution, and sorting.
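    The porosity-by-pixel-counting idea can be sketched directly. A much-simplified numpy illustration, assuming blue-dyed pore space: the thesis grows seeded regions in color space, whereas here the R - B differential seed mask alone is counted, and the threshold and colors are invented:

```python
import numpy as np

def porosity_fraction(rgb, diff_thresh=60):
    """Fraction of pixels whose |R - B| differential marks pore space.
    Simplified: the full method grows regions from these seeds."""
    r = rgb[..., 0].astype(int)
    b = rgb[..., 2].astype(int)
    pore = np.abs(r - b) > diff_thresh
    return pore.mean()

# Synthetic thin section: neutral-grey quartz grains, blue epoxy-filled pores.
img = np.full((100, 100, 3), 150, dtype=np.uint8)   # grains: R == B
img[:25, :, :] = (40, 60, 200)                      # blue-dyed pore space
p = porosity_fraction(img)
```

On this synthetic section the pore band occupies a quarter of the pixels, so the returned porosity is 0.25.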

    Visual Analysis in Traffic & Re-identification

    Get PDF

    Guiding object recognition: a shape model with co-activation networks

    Get PDF
    The goal of image understanding research is to develop techniques to automatically extract meaningful information from a population of images. This abstract goal manifests itself in a variety of application domains. Video understanding is a natural extension of image understanding. Many video understanding algorithms apply static-image algorithms to successive frames to identify patterns of consistency. This consumes a significant amount of irrelevant computation and may produce erroneous results because static algorithms are not designed to indicate corresponding pixel locations between frames. Video is more than a collection of images; it is an ordered collection of images that exhibits temporal coherence, which is an additional feature, like edges, colors, and textures. Motion information provides another level of visual information that cannot be obtained from an isolated image. Leveraging motion cues prevents an algorithm from "starting fresh" at each frame by focusing the region of attention. This approach is analogous to the attentional system of the human visual system. Relying on motion information alone is insufficient due to the aperture problem, where local motion information is ambiguous in at least one direction. Consequently, motion cues only provide leading and trailing motion edges, and bottom-up approaches using gradient or region properties to complete moving regions are limited. Object recognition facilitates higher-level processing and is an integral component of image understanding. We present a components-based object detection and localization algorithm for static images. We show how this same system provides top-down segmentation for the detected object. We present a detailed analysis of the model dynamics during the localization process. This analysis shows consistent behavior in response to a variety of input, permitting model reduction and a substantial speed increase with little or no performance degradation.
We present four specific enhancements to reduce false positives when instances of the target category are not present. First, a one-shot rule is used to discount coincident secondary hypotheses. Second, we demonstrate that the use of an entire shape model is inappropriate for localizing any single instance and introduce the use of co-activation networks to represent the appropriate component relations for a particular recognition context. Third, we describe how the co-activation network can be combined with motion cues to overcome the aperture problem by providing context-specific, top-down shape information to achieve detection and segmentation in video. Finally, we present discriminating features arising from these enhancements and apply supervised learning techniques to embody the informational contribution of each approach and associate a confidence measure with each detection.
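The one-shot rule for discounting coincident secondary hypotheses behaves much like non-maximum suppression. A hedged stand-in sketch, assuming hypotheses are simple (score, x, y) tuples with an invented suppression radius (not the thesis's actual rule):

```python
def one_shot_suppress(hypotheses, radius=10.0):
    """Keep the strongest hypothesis in each neighbourhood and discard
    coincident secondary ones (illustrative stand-in for the one-shot rule)."""
    kept = []
    for score, x, y in sorted(hypotheses, reverse=True):  # strongest first
        if all((x - kx) ** 2 + (y - ky) ** 2 > radius ** 2
               for _, kx, ky in kept):
            kept.append((score, x, y))
    return kept

# Two overlapping detections of one object, plus one distinct detection.
hyps = [(0.9, 50, 50), (0.6, 52, 51), (0.8, 120, 40)]
kept = one_shot_suppress(hyps)
```

The weaker hypothesis coincident with the 0.9-scoring one is discounted, leaving one detection per object.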

    Iconic Indexing for Video Search

    Get PDF
    Submitted for the degree of Doctor of Philosophy, Queen Mary, University of London

    Video Foreground Localization from Traditional Methods to Deep Learning

    Get PDF
    These days, the detection of Visual Attention Regions (VAR), such as moving objects, has become an integral part of many computer vision applications, viz. pattern recognition, object detection and classification, video surveillance, autonomous driving, human-machine interaction (HMI), and so forth. Moving object identification using bounding boxes has matured to the level of localizing objects along their rigid borders, a process called foreground localization (FGL). Over the decades, many image segmentation methodologies have been well studied, devised, and extended to suit video FGL. Despite that, the problem of video foreground (FG) segmentation remains an intriguing yet appealing task due to its ill-posed nature and myriad of applications. Maintaining spatial and temporal coherence, particularly at object boundaries, remains challenging and computationally burdensome. It gets even harder when the background is dynamic, with swaying tree branches or a shimmering water body, when there are illumination variations or shadows cast by the moving objects, or when the video sequences have jittery frames caused by vibrating or unstable camera mounts on a surveillance post or moving robot. At the same time, in the analysis of traffic flow or human activity, the performance of an intelligent system depends substantially on its robustness in localizing the VAR, i.e., the FG. To this end, the natural question arises: what is the best way to deal with these challenges? Thus, the goal of this thesis is to investigate plausible real-time, performant implementations, from traditional approaches to modern-day deep learning (DL) models, for FGL that can be applicable to many video content-aware applications (VCAA). It focuses mainly on improving existing methodologies by harnessing multimodal spatial and temporal cues for a delineated FGL.
The first part of the dissertation is dedicated to enhancing conventional sample-based and Gaussian mixture model (GMM)-based video FGL using probability mass functions (PMF), temporal median filtering, the fusion of CIEDE2000 color similarity, color distortion, and illumination measures, and the selection of an appropriate adaptive threshold to extract the FG pixels. Subjective and objective evaluations are done to show the improvements over a number of similar conventional methods. The second part of the thesis focuses on exploiting and improving deep convolutional neural networks (DCNN) for the problem mentioned earlier. Consequently, three models akin to the encoder-decoder (EnDec) network are implemented with various innovative strategies to improve the quality of the FG segmentation. The strategies include, but are not limited to, double-encoding slow-decoding feature learning, multi-view receptive field feature fusion, and the incorporation of spatiotemporal cues through long short-term memory (LSTM) units in both the subsampling and upsampling subnetworks. Experimental studies are carried out thoroughly on all conditions, from baselines to challenging video sequences, to prove the effectiveness of the proposed DCNNs. The analysis demonstrates the architectural efficiency of the proposed models over other methods, while quantitative and qualitative experiments show their competitive performance compared to the state of the art.
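The background-modeling family of methods the first part builds on can be illustrated with a pared-down, single-Gaussian-per-pixel model (the thesis uses a full GMM with fused color and illumination measures; the update rule, learning rate, and grey-level values here are invented for the sketch):

```python
import numpy as np

def update_background(frame, mean, var, alpha=0.05, k=2.5):
    """One step of a single-Gaussian-per-pixel background model.
    A pixel is foreground when it deviates more than k sigmas from
    the running mean; background statistics adapt only where matched."""
    fg = np.abs(frame - mean) > k * np.sqrt(var)
    bg = ~fg
    mean = np.where(bg, (1 - alpha) * mean + alpha * frame, mean)
    var = np.where(bg, (1 - alpha) * var + alpha * (frame - mean) ** 2, var)
    return fg, mean, var

# Static background at grey level 100 with an object entering the frame.
mean = np.full((4, 4), 100.0)
var = np.full((4, 4), 25.0)      # sigma = 5
frame = np.full((4, 4), 100.0)
frame[1:3, 1:3] = 200.0          # moving object
fg, mean, var = update_background(frame, mean, var)
```

The 2x2 object region deviates far beyond 2.5 sigmas and is flagged as foreground, while the unchanged pixels keep refining the background statistics.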