An attention model and its application in man-made scene interpretation
The ultimate aim of research into computer vision is to design a system that interprets
its surrounding environment in a way similar to what humans do effortlessly. However, the
state of the technology is far from achieving such a goal. This thesis describes the
different components of a computer vision system designed for the task of interpreting
man-made scenes, in particular images of buildings. The flow of information in the proposed
system is bottom-up, i.e., the image is first segmented into its meaningful components and
the resulting regions are subsequently labelled using a contextual classifier.
Starting from simple observations concerning the human vision system and the Gestalt laws
of human perception, such as the laws of “good (simple) shape” and “perceptual grouping”, a
blob detector is developed that identifies components in a 2D image. These components
are convex regions of interest, with interest defined as significant gradient-magnitude
content. An eye-tracking experiment is conducted, which shows that the regions identified
by the blob detector correlate significantly with the regions that drive the attention of
viewers.
Having identified these blobs, it is postulated that a blob represents an object,
linguistically identified by its own semantic name. In other words, a blob may contain a
window, a door, or a chimney in a building. These regions are used to identify and segment
higher-order structures in a building, such as facades and window arrays, as well as
environmental regions such as sky and ground.
Because of the inconsistency in the unary features of buildings, a contextual learning
algorithm is used to classify the segmented regions: a model that learns spatial and
topological relationships between different objects from a set of hand-labelled data. The
model exploits this information in a Markov random field (MRF) to achieve consistent
labellings of new scenes.
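Contextual labelling with a pairwise MRF can be illustrated with a minimal sketch. Everything below is hypothetical: the label set, the unary scores, and the compatibility matrix (which in the thesis would be learned from hand-labelled data) are invented, and iterated conditional modes (ICM) is used here only as one simple inference scheme for such a model.

```python
import numpy as np

LABELS = ["sky", "facade", "window", "ground"]  # hypothetical label set

def icm(unary, edges, compat, iters=10):
    """Greedy MRF inference by iterated conditional modes.

    unary:  (n_regions, n_labels) per-region label scores
    edges:  list of (i, j) neighbour pairs between regions
    compat: (n_labels, n_labels) pairwise compatibility scores
    Returns an array of label indices, one per region.
    """
    lab = unary.argmax(axis=1)          # start from the unary-only labelling
    for _ in range(iters):
        for i in range(len(unary)):
            score = unary[i].copy()
            for a, b in edges:          # add context from each neighbour
                if a == i:
                    score += compat[:, lab[b]]
                elif b == i:
                    score += compat[lab[a], :]
            lab[i] = score.argmax()
    return lab

# Three regions stacked top-to-bottom; the middle one is ambiguous on
# its unary features alone, and context resolves it to "facade".
unary = np.array([[5.0, 0.0, 0.0, 0.0],    # clearly sky
                  [1.1, 1.0, 0.0, 0.0],    # sky vs. facade, nearly tied
                  [0.0, 0.0, 0.0, 5.0]])   # clearly ground
compat = np.zeros((4, 4))
compat[0, 1] = compat[1, 0] = 2.0          # sky next to facade: likely
compat[1, 3] = compat[3, 1] = 2.0          # facade next to ground: likely
compat[0, 3] = compat[3, 0] = -3.0         # sky touching ground: unlikely
lab = icm(unary, [(0, 1), (1, 2)], compat)
```

Here the middle region's unary scores slightly favour "sky", but its neighbours (sky above, ground below) pull it to "facade", which is exactly the kind of consistent labelling the contextual classifier is meant to produce.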
Learning Enriched Features for Real Image Restoration and Enhancement
With the goal of recovering high-quality image content from its degraded
version, image restoration enjoys numerous applications, such as in
surveillance, computational photography, medical imaging, and remote sensing.
Recently, convolutional neural networks (CNNs) have achieved dramatic
improvements over conventional approaches for image restoration tasks. Existing
CNN-based methods typically operate either on full-resolution or on
progressively low-resolution representations. In the former case, spatially
precise but contextually less robust results are achieved, while in the latter
case, semantically reliable but spatially less accurate outputs are generated.
In this paper, we present a novel architecture with the collective goals of
maintaining spatially-precise high-resolution representations through the
entire network and receiving strong contextual information from the
low-resolution representations. The core of our approach is a multi-scale
residual block containing several key elements: (a) parallel multi-resolution
convolution streams for extracting multi-scale features, (b) information
exchange across the multi-resolution streams, (c) spatial and channel attention
mechanisms for capturing contextual information, and (d) attention based
multi-scale feature aggregation. In a nutshell, our approach learns an enriched
set of features that combines contextual information from multiple scales,
while simultaneously preserving the high-resolution spatial details. Extensive
experiments on five real image benchmark datasets demonstrate that our method,
named MIRNet, achieves state-of-the-art results for a variety of image
processing tasks, including image denoising, super-resolution, and image
enhancement. The source code and pre-trained models are available at
https://github.com/swz30/MIRNet.
Comment: Accepted for publication at ECCV 2020
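The spatial and channel attention mechanisms in element (c) can be illustrated with a squeeze-and-excitation-style channel-attention sketch. This is a generic, hypothetical rendition in numpy, not MIRNet's actual block (which is defined in the linked repository); the weight shapes and reduction ratio are illustrative assumptions.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation-style channel attention on a (C, H, W) map.

    Global-average-pool each channel to a descriptor, pass it through a
    small two-layer bottleneck (w1: (C//r, C), w2: (C, C//r)), squash to
    a per-channel gate in (0, 1), and rescale the channels.
    """
    squeeze = feat.mean(axis=(1, 2))              # (C,) global context
    hidden = np.maximum(w1 @ squeeze, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate per channel
    return feat * gate[:, None, None]             # reweight channels

rng = np.random.default_rng(0)
C, r = 8, 2                                       # illustrative sizes
feat = rng.standard_normal((C, 16, 16))
out = channel_attention(feat,
                        rng.standard_normal((C // r, C)),
                        rng.standard_normal((C, C // r)))
```

Because the gate is strictly between 0 and 1, the block can only attenuate channels, letting the network emphasise contextually informative feature maps over the rest.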
CAD system for lung nodule analysis.
Lung cancer is the deadliest type of known cancer in the United States, claiming hundreds of thousands of lives each year. However, despite the high mortality rate, the 5-year survival rate after resection of Stage 1A non–small cell lung cancer is currently in the range of 62%–82%, and in recent studies even 90%. Patient survival is highly correlated with early detection. Computed Tomography (CT) technology greatly facilitates the early detection of lung cancer by offering a minimally invasive medical diagnostic tool. Some early types of lung cancer begin with a small mass of tissue within the lung, less than 3 cm in diameter, called a nodule. Most nodules found in a lung are benign, but a small fraction of them become malignant over time. Expert analysis of CT scans is the first step in determining whether a nodule presents a possibility of malignancy, but, due to such low spatial support, many potentially harmful nodules go undetected until other symptoms motivate a more thorough search. Computer Vision and Pattern Recognition techniques can play a significant role in aiding the process of detecting and diagnosing lung nodules. This thesis outlines the development of a CAD system which, given an input CT scan, provides a functional and fast second-opinion diagnosis to physicians. The entire process of lung nodule screening has been cast as a system which can be enhanced by modern computing technology, with the hope of providing a feasible diagnostic tool for clinical use. It should be noted that the proposed CAD system is presented as a tool for experts, not a replacement for them. The primary motivation of this thesis is the design of a system that could act as a catalyst for reducing the mortality rate associated with lung cancer.
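The first stage of such a pipeline, screening a scan for nodule-sized candidates, can be sketched with a toy filter built around the 3 cm criterion mentioned above. This is not the thesis's CAD system: the HU threshold, pixel spacing, and equivalent-diameter test below are all hypothetical parameters for a 2-D illustration.

```python
import numpy as np
from scipy import ndimage

def nodule_candidates(ct_slice, hu_low=-400.0, max_diam_mm=30.0,
                      spacing_mm=0.7):
    """Toy candidate screen on a 2-D CT slice (hypothetical parameters).

    Threshold out air (below hu_low Hounsfield units), group the
    remaining soft-tissue pixels into connected components, and keep
    those whose equivalent-circle diameter is under the 3 cm nodule
    criterion. Returns the component labels that pass.
    """
    mask = ct_slice > hu_low                 # soft tissue vs. air
    labels, n = ndimage.label(mask)
    keep = []
    for i in range(1, n + 1):
        area_mm2 = (labels == i).sum() * spacing_mm ** 2
        diam_mm = 2.0 * np.sqrt(area_mm2 / np.pi)  # equivalent diameter
        if diam_mm < max_diam_mm:
            keep.append(i)
    return keep

# Synthetic slice: air everywhere (-1000 HU) with one small soft-tissue
# blob (40 HU) a few millimetres across, well under the 3 cm limit.
ct = np.full((64, 64), -1000.0)
ct[30:34, 30:34] = 40.0
cands = nodule_candidates(ct)
```

A real system would of course operate in 3-D and follow candidate generation with feature extraction and classification, but the size-gated component search above captures the screening step's basic shape.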
Visibility Recovery on Images Acquired in Attenuating Media. Application to Underwater, Fog, and Mammographic Imaging
When acquired in attenuating media, digital images often suffer from a
particularly complex degradation that reduces their visual quality, hindering
their suitability for further computational applications, or simply
decreasing the visual pleasantness for the user. In these cases, mathematical
image processing reveals itself as an ideal tool to recover some
of the information lost during the degradation process. In this dissertation,
we deal with three such practical scenarios in which this problem
is especially relevant, namely underwater image enhancement, fog
removal and mammographic image processing. In the case of digital mammograms,
X-ray beams traverse human tissue, and electronic detectors
capture them as they reach the other side. However, the superposition
on a two-dimensional image of three-dimensional structures produces low-contrast
images in which structures of interest suffer from diminished
visibility, obstructing diagnosis tasks. Regarding fog removal, the loss
of contrast is produced by the atmospheric conditions, and white colour
takes over the scene uniformly as distance increases, also reducing visibility.
For underwater images, there is an added difficulty, since colour is not
lost uniformly; instead, red colours decay the fastest, and green and blue
colours typically dominate the acquired images. To address all these challenges,
in this dissertation we develop new methodologies that rely on: a)
physical models of the observed degradation, and b) the calculus of variations.
Equipped with this powerful machinery, we design novel theoretical
and computational tools, including image-dependent functional energies
that capture the particularities of each degradation model. These energies
are composed of different integral terms that are simultaneously
minimized by means of efficient numerical schemes, producing a clean,
visually-pleasant and useful output image, with better contrast and increased
visibility. In every considered application, we provide comprehensive
qualitative (visual) and quantitative experimental results to validate
our methods, confirming that the developed techniques outperform other
existing approaches in the literature.
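The variational recipe described above, an energy with a data term and regularising terms, minimised by a numerical scheme, can be illustrated on a deliberately simple instance. The quadratic energy, step size, and smoothness weight below are illustrative stand-ins, not the image-dependent energies developed in the dissertation.

```python
import numpy as np

def denoise_variational(f, lam=0.5, step=0.1, iters=200):
    """Minimise the toy energy E(u) = sum (u - f)^2 + lam * |grad u|^2
    by explicit gradient descent (periodic boundary conditions).

    An illustrative stand-in for the dissertation's energies: the data
    term keeps u close to the observation f, the smoothness term
    penalises gradients, and their minimiser trades the two off.
    """
    u = f.copy()
    for _ in range(iters):
        # Discrete 5-point Laplacian of u.
        lap = (np.roll(u, 1, 0) + np.roll(u, -1, 0) +
               np.roll(u, 1, 1) + np.roll(u, -1, 1) - 4.0 * u)
        # dE/du = 2(u - f) - 2*lam*lap; step descends the energy.
        u -= step * (2.0 * (u - f) - 2.0 * lam * lap)
    return u

# Noisy step edge: the minimiser stays near the data but is smoother.
rng = np.random.default_rng(1)
clean = np.zeros((32, 32))
clean[:, 16:] = 1.0
f = clean + 0.3 * rng.standard_normal((32, 32))
u = denoise_variational(f)
```

Since the data term is zero at the initial guess u = f and the total energy decreases under a stable step, the output is guaranteed to have lower gradient energy than the input, which is the "increased visibility through smoothing" trade-off in miniature.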
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Comment: Revised survey includes an expanded discussion section and a reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st, 2017