24,780 research outputs found
Towards Developing Computer Vision Algorithms and Architectures for Real-world Applications
abstract: Computer vision technology automatically extracts high level, meaningful information from visual data such as images or videos, and the object recognition and detection algorithms are essential in most computer vision applications. In this dissertation, we focus on developing algorithms used for real life computer vision applications, presenting innovative algorithms for object segmentation and feature extraction for objects and actions recognition in video data, and sparse feature selection algorithms for medical image analysis, as well as automated feature extraction using convolutional neural network for blood cancer grading.
To detect and classify objects in video, the objects have to be separated from the background, and then the discriminant features are extracted from the region of interest before feeding to a classifier. Effective object segmentation and feature extraction are often application specific, and posing major challenges for object detection and classification tasks. In this dissertation, we address effective object flow based ROI generation algorithm for segmenting moving objects in video data, which can be applied in surveillance and self driving vehicle areas. Optical flow can also be used as features in human action recognition algorithm, and we present using optical flow feature in pre-trained convolutional neural network to improve performance of human action recognition algorithms. Both algorithms outperform the state-of-the-arts at their time.
Medical images and videos pose unique challenges for image understanding mainly due to the fact that the tissues and cells are often irregularly shaped, colored, and textured, and hand selecting most discriminant features is often difficult, thus an automated feature selection method is desired. Sparse learning is a technique to extract the most discriminant and representative features from raw visual data. However, sparse learning with \textit{L1} regularization only takes the sparsity in feature dimension into consideration; we improve the algorithm so it selects the type of features as well; less important or noisy feature types are entirely removed from the feature set. We demonstrate this algorithm to analyze the endoscopy images to detect unhealthy abnormalities in esophagus and stomach, such as ulcer and cancer. Besides sparsity constraint, other application specific constraints and prior knowledge may also need to be incorporated in the loss function in sparse learning to obtain the desired results. We demonstrate how to incorporate similar-inhibition constraint, gaze and attention prior in sparse dictionary selection for gastroscopic video summarization that enable intelligent key frame extraction from gastroscopic video data. With recent advancement in multi-layer neural networks, the automatic end-to-end feature learning becomes feasible. Convolutional neural network mimics the mammal visual cortex and can extract most discriminant features automatically from training samples. We present using convolutinal neural network with hierarchical classifier to grade the severity of Follicular Lymphoma, a type of blood cancer, and it reaches 91\% accuracy, on par with analysis by expert pathologists.
Developing real world computer vision applications is more than just developing core vision algorithms to extract and understand information from visual data; it is also subject to many practical requirements and constraints, such as hardware and computing infrastructure, cost, robustness to lighting changes and deformation, ease of use and deployment, etc.The general processing pipeline and system architecture for the computer vision based applications share many similar design principles and architecture. We developed common processing components and a generic framework for computer vision application, and a versatile scale adaptive template matching algorithm for object detection. We demonstrate the design principle and best practices by developing and deploying a complete computer vision application in real life, building a multi-channel water level monitoring system, where the techniques and design methodology can be generalized to other real life applications. The general software engineering principles, such as modularity, abstraction, robust to requirement change, generality, etc., are all demonstrated in this research.Dissertation/ThesisDoctoral Dissertation Computer Science 201
Acute Lymphoblastic Leukemia Blood Cells Prediction Using Deep Learning & Transfer Learning Technique
White blood cells called lymphocytes are the target of the blood malignancy known as acute lymphoblastic leukemia (ALL). In the domain of medical image analysis, deep learning and transfer learning methods have recently showcased significant promise, particularly in tasks such as identifying and categorizing various types of cancer. Using microscopic pictures, we suggest a deep learning and transfer learning-based method in this research work for predicting ALL blood cells. We use a pre-trained convolutional neural network (CNN) model to extract pertinent features from the microscopic images of blood cells during the feature extraction step. To accurately categorize the blood cells into leukemia and non- leukemia classes, a classification model is built using a transfer learning technique employing the collected features. We use a publicly accessible collection of microscopic blood cell pictures, which contains samples from both leukemia and non-leukemia, to assess the suggested method. Our experimental findings show that the suggested method successfully predicts ALL blood cells with high accuracy. The method enhances early ALL detection and diagnosis, which may result in better patient treatment outcomes. Future research will concentrate on larger and more varied datasets and investigate the viability of integrating it into clinical processes for real-time ALL prediction
Prospects for Theranostics in Neurosurgical Imaging: Empowering Confocal Laser Endomicroscopy Diagnostics via Deep Learning
Confocal laser endomicroscopy (CLE) is an advanced optical fluorescence
imaging technology that has the potential to increase intraoperative precision,
extend resection, and tailor surgery for malignant invasive brain tumors
because of its subcellular dimension resolution. Despite its promising
diagnostic potential, interpreting the gray tone fluorescence images can be
difficult for untrained users. In this review, we provide a detailed
description of bioinformatical analysis methodology of CLE images that begins
to assist the neurosurgeon and pathologist to rapidly connect on-the-fly
intraoperative imaging, pathology, and surgical observation into a
conclusionary system within the concept of theranostics. We present an overview
and discuss deep learning models for automatic detection of the diagnostic CLE
images and discuss various training regimes and ensemble modeling effect on the
power of deep learning predictive models. Two major approaches reviewed in this
paper include the models that can automatically classify CLE images into
diagnostic/nondiagnostic, glioma/nonglioma, tumor/injury/normal categories and
models that can localize histological features on the CLE images using weakly
supervised methods. We also briefly review advances in the deep learning
approaches used for CLE image analysis in other organs. Significant advances in
speed and precision of automated diagnostic frame selection would augment the
diagnostic potential of CLE, improve operative workflow and integration into
brain tumor surgery. Such technology and bioinformatics analytics lend
themselves to improved precision, personalization, and theranostics in brain
tumor treatment.Comment: See the final version published in Frontiers in Oncology here:
https://www.frontiersin.org/articles/10.3389/fonc.2018.00240/ful
- …