Deep learning-based diagnostic system for malignant liver detection
Cancer is the second most common cause of death in humans, and liver cancer is the fifth most common cause of mortality. Preventing such deadly diseases requires timely, independent, accurate, and robust detection of ailments by a computer-aided diagnostic (CAD) system. Building such an intelligent CAD system involves several preliminary steps, including preprocessing, attribute analysis, and identification.
Recent studies have developed computer-aided diagnosis algorithms using conventional techniques. However, such traditional methods can severely distort the structural properties of processed images and perform inconsistently because the region of interest varies in shape and size. Moreover, the lack of sufficiently large datasets makes the performance of these methods questionable for commercial use.
To address these limitations, I propose novel methodologies in this dissertation. First, I modified a
generative adversarial network to perform deblurring and contrast adjustment on computed tomography
(CT) scans. Second, I designed a deep neural network with a novel loss function for fully automatic precise
segmentation of liver and lesions from CT scans. Third, I developed a multi-modal deep neural network
to integrate pathological data with imaging data to perform computer-aided diagnosis for malignant liver
detection.
The dissertation begins with background information covering the study objectives and workflow. Chapter 2 then reviews a general schematic for developing a computer-aided algorithm, including image acquisition techniques, preprocessing steps, feature extraction approaches, and machine learning-based prediction methods.
The first study, presented in Chapter 3, addresses blurred images and their possible effects on classification. A novel multi-scale GAN with residual image learning is proposed to deblur images. The second method, in Chapter 4, addresses low-contrast CT scan images: a multi-level GAN enhances images to produce well-contrasted regions, and the enhanced images in turn improve cancer diagnosis performance. Chapter 5 proposes a deep neural network for the segmentation of liver and lesions from abdominal CT scan images; a modified U-Net with a novel loss function can precisely segment even minute lesions. Similarly, Chapter 6 introduces a multi-modal approach for diagnosing liver cancer variants, integrating pathological data with CT scan images.
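The abstract does not specify the novel loss function used by the modified U-Net in Chapter 5, but a common way to make minute lesions count is to combine a soft Dice term (which normalises by region size) with cross-entropy. The following is a minimal NumPy sketch of that idea; the function names and the equal weighting are illustrative assumptions, not the dissertation's actual formulation:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss. Because it is normalised by region size, a tiny
    lesion contributes as much to the loss as a large organ region."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def combined_loss(pred, target, alpha=0.5):
    """Hypothetical combined objective: weighted sum of soft Dice and
    binary cross-entropy over per-pixel probabilities."""
    bce = -np.mean(target * np.log(pred + 1e-6)
                   + (1 - target) * np.log(1 - pred + 1e-6))
    return alpha * dice_loss(pred, target) + (1 - alpha) * bce
```

A perfect probability map drives both terms to (near) zero, while an uncommitted map (all 0.5) is penalised by both, which is the behaviour that makes such losses suitable for small-structure segmentation.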
In summary, this dissertation presents novel algorithms for preprocessing and disease detection. Furthermore, comparative analysis validates the effectiveness of the proposed methods in computer-aided diagnosis.
Motion Offset for Blur Modeling
Motion blur caused by relative movement between the camera and the subject is often an undesirable degradation of image quality. Most conventional deblurring methods estimate a blur kernel for image deconvolution. Because the problem is ill-posed, predefined priors are introduced to suppress the ill-posedness, but such priors can handle only specific situations. To achieve better deblurring performance on dynamic scenes, deep-learning-based methods learn a mapping that restores the sharp image from a blurry one, with the blur modelled implicitly in the feature-extraction module. However, a blur model learned from a paired dataset does not generalize well to some real-world scenes. In summary, an accurate and dynamic blur model that more closely approximates real-world blur is needed.
By revisiting the principle of camera exposure, we can model blur with the displacements between sharp pixels and the exposed pixel, namely motion offsets. Under specific physical constraints, motion offsets can form different exposure trajectories (e.g., linear or quadratic). Compared to a conventional blur kernel, the proposed motion offsets are a more rigorous approximation of real-world blur, since they can constitute a non-linear and non-uniform motion field. By learning from a dynamic-scene dataset, an accurate and spatially variant motion-offset field is obtained.
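The exposure principle described above can be illustrated with a toy forward model: each blurry pixel is the average of sharp-image samples taken along a per-pixel motion trajectory. This NumPy sketch assumes a linear trajectory and nearest-neighbour sampling; it is an illustration of the motion-offset idea, not the authors' implementation:

```python
import numpy as np

def blur_with_offsets(sharp, offsets, n_samples=9):
    """Synthesize a blurry image by averaging samples of `sharp` along a
    per-pixel linear trajectory. `offsets[y, x]` holds the (dy, dx)
    displacement reached at the end of the exposure; intermediate
    positions are linearly interpolated in time."""
    h, w = sharp.shape
    out = np.zeros_like(sharp, dtype=float)
    for t in np.linspace(0.0, 1.0, n_samples):
        # Sample the sharp image at the fractional trajectory point
        # (nearest-neighbour rounding, clipped to the image border).
        ys = np.clip(np.round(np.arange(h)[:, None] + t * offsets[..., 0]),
                     0, h - 1).astype(int)
        xs = np.clip(np.round(np.arange(w)[None, :] + t * offsets[..., 1]),
                     0, w - 1).astype(int)
        out += sharp[ys, xs]
    return out / n_samples
```

Because the offset field is defined per pixel, this forward model is spatially variant by construction, in contrast to a single uniform blur kernel.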
With accurate motion information and a compact blur model, we explore ways of using motion information to facilitate multiple blur-related tasks. By introducing the recovered motion offsets, we build a motion-aware, spatially variant convolution. For extracting a video clip from a blurry image, motion offsets provide an explicit (non-)linear motion trajectory for interpolation. We also work towards better image deblurring performance in real-world scenarios by improving the generalization ability of the deblurring model.
Multi-scale Grid Network for Image Deblurring with High-frequency Guidance
It has been demonstrated that the blurring process reduces the high-frequency information of the original sharp image, so the main challenge for image deblurring is to reconstruct high-frequency information from the blurry image. In this paper, we propose a novel image deblurring framework focused on the reconstruction of high-frequency information, consisting of two main subnetworks: a high-frequency reconstruction subnetwork (HFRSN) and a multi-scale grid subnetwork (MSGSN). The HFRSN reconstructs latent high-frequency information from blurry images at multiple scales. The MSGSN performs deblurring with high-frequency guidance at different scales simultaneously. In addition, to make better use of high-frequency information for restoring sharp images, we design a high-frequency information aggregation (HFAG) module and a high-frequency information attention (HFAT) module in the MSGSN. The HFAG module fuses high-frequency features with image features at the feature-extraction stage, and the HFAT module enhances the feature-reconstruction stage. Extensive experiments on different datasets show the effectiveness and efficiency of our method.
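The premise that blur suppresses high-frequency content can be made concrete with the classical residual decomposition: high frequencies are what remain after subtracting a low-pass (blurred) copy of the image. The sketch below uses a simple box filter as the low-pass stand-in; it is a didactic illustration of the signal being reconstructed, not the learned HFRSN itself:

```python
import numpy as np

def high_frequency(img, k=3):
    """Estimate the high-frequency component of a 2-D image as the
    residual between the image and a k-by-k box-blurred copy."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    low = np.zeros_like(img, dtype=float)
    # Accumulate the k*k shifted copies that make up the box filter.
    for dy in range(k):
        for dx in range(k):
            low += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    low /= k * k
    return img - low
```

On a constant region the residual is zero, while edges and fine texture produce strong responses, which is exactly the information a blurry input lacks and a guidance branch must supply.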
Learning From Multi-Frame Data
Multi-frame data-driven methods bear the promise that aggregating multiple observations leads to better estimates of target quantities than a single (still) observation.
This thesis examines how data-driven approaches such as deep neural networks should be constructed to improve over single-frame-based counterparts.
Besides algorithmic changes, such as the design of the artificial neural network architecture or the algorithm itself, this examination is inextricably linked to the synthesis of training data of meaningful size (even when no annotations are available) and quality (when real ground-truth acquisition is not possible) that captures all temporal effects with high fidelity.
We start by introducing a new algorithm that accelerates a nonparametric learning method through a GPU-adapted implementation of nearest-neighbour search.
This not only clearly surpasses previously known approaches, but also empirically shows that the generated data can be processed within a reasonable time and that several inputs can be handled in parallel even under hardware restrictions.
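The thesis abstract does not detail the GPU kernel, but brute-force nearest-neighbour search is typically accelerated by recasting all pairwise squared distances as one matrix multiplication, ||q - d||² = ||q||² - 2 q·d + ||d||², which maps directly onto GPU hardware. A NumPy sketch of that vectorized formulation (the function name is illustrative):

```python
import numpy as np

def batched_nn(queries, database):
    """Brute-force 1-nearest-neighbour for a batch of queries.
    All pairwise squared distances are computed in one shot via the
    expansion ||q - d||^2 = ||q||^2 - 2 q.d + ||d||^2, so the dominant
    cost is a single matrix multiply, the operation GPUs do best."""
    d2 = (np.sum(queries ** 2, axis=1)[:, None]
          - 2.0 * queries @ database.T
          + np.sum(database ** 2, axis=1)[None, :])
    return np.argmin(d2, axis=1)  # index of the nearest database row
```

On an actual GPU the same expression would be evaluated with a BLAS-style GEMM plus a row-wise reduction, and the query batch dimension gives the parallelism over several inputs mentioned above.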
Building on a learning-based solution, we introduce a novel training protocol that reduces the need for carefully curated training data and demonstrates better performance and robustness than a non-parametric nearest-neighbour search via temporal video alignments.
Effective learning in the absence of labels is required when dealing with larger amounts of data that are easy to capture but infeasible, or at least costly, to label.
In addition, we show new ways to generate plausible and realistic synthesized data and demonstrate their necessity for closing the gap to expensive and almost infeasible real-world acquisition.
These methods ultimately achieve state-of-the-art results in classical image-processing tasks such as reflection removal and video deblurring.
A Chronological Survey of Theoretical Advancements in Generative Adversarial Networks for Computer Vision
Generative Adversarial Networks (GANs) have been workhorse generative models for many years, especially in the research field of computer vision. Accordingly, there have been many significant advancements in the theory and application of GAN models, which are notoriously hard to train but produce good results when trained well. Many surveys have organized the vast GAN literature from various perspectives. However, none of them brings out the important chronological aspect: how the multiple challenges of employing GAN models were solved one by one over time, across multiple landmark research works. This survey intends to bridge that gap and present some of the landmark research works on the theory and application of GANs in chronological order.