2,542 research outputs found
Fuzzy classification improvement by a pre-perceptual labelled segmentation algorithm
The goal of this paper is to present how two different image processing approaches can be enhanced by merging both methodologies. We will see how the results of a perceptual labelled segmentation methodology [7] can be improved by applying a fuzzy classification algorithm [2] based on a fuzzy outranking methodology [9] as a postprocessing algorithm, and vice versa. A comparison of the individual algorithms with the combination of both algorithms will be presented in order to demonstrate the improvement. Color bone marrow images will be used. The objective is to detect white blood cells. The detection of white blood cells in bone marrow microscopic images is very difficult because of the great variance in their characteristics and also because of staining and illumination inconsistencies. In addition, the maturity classes of white blood cells actually represent a continuum; cells frequently overlap each other, and there is a fairly wide variation in size and shape of nucleus and cytoplasm regions within given cell classes
Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze
Unsupervised segmentation of action segments in egocentric videos is a
desirable feature in tasks such as activity recognition and content-based video
retrieval. Reducing the search space into a finite set of action segments
facilitates a faster and less noisy matching. However, there exists a
substantial gap in machine understanding of natural temporal cuts during a
continuous human activity. This work reports on a novel gaze-based approach for
segmenting action segments in videos captured using an egocentric camera. Gaze
is used to locate the region-of-interest inside a frame. By tracking two simple
motion-based parameters inside successive regions-of-interest, we discover a
finite set of temporal cuts. We present several results using combinations (of
the two parameters) on a dataset, i.e., BRISGAZE-ACTIONS. The dataset contains
egocentric videos depicting several daily-living activities. The quality of the
temporal cuts is further improved by implementing two entropy measures.
Comment: To appear in 2017 IEEE International Conference on Signal and Image
Processing Applications
Fair comparison of skin detection approaches on publicly available datasets
Skin detection is the process of discriminating skin and non-skin regions in
a digital image, and it is widely used in applications ranging from hand
gesture analysis to body-part tracking and face detection. Skin detection is a
challenging problem which has drawn extensive attention from the research
community; nevertheless, a fair comparison among approaches is very difficult
due to the lack of a common benchmark and a unified testing protocol. In this
work, we investigate the most recent research in this field and we propose a
fair comparison among approaches using several different datasets. The major
contributions of this work are an exhaustive literature review of skin color
detection approaches, a framework to evaluate and combine different skin
detector approaches, whose source code is made freely available for future
research, and an extensive experimental comparison among several recent methods
which have also been used to define an ensemble that works well in many
different problems. Experiments are carried out on 10 different datasets
including more than 10000 labelled images: experimental results confirm that
the best method proposed here obtains very good performance with respect to
other stand-alone approaches, without requiring ad hoc parameter tuning. A
MATLAB version of the framework for testing and of the methods proposed in this
paper will be freely available from https://github.com/LorisNann
Detecting the presence of large buildings in natural images
This paper addresses the issue of classification of low-level
features into high-level semantic concepts for the purpose of semantic annotation of consumer photographs. We adopt a multi-scale approach that relies on edge detection to extract an edge orientation-based feature description of the image, and apply an SVM learning technique to infer the presence of a dominant building object in a general purpose collection of digital photographs. The approach exploits prior knowledge on the image context through an assumption that all input images are "outdoor", i.e. indoor/outdoor classification (the context determination stage) has been performed. The proposed approach is validated on a diverse dataset of 1720 images and its performance is compared with that of the MPEG-7 edge histogram descriptor
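As a rough illustration of what an edge orientation-based description can look like, the sketch below builds a magnitude-weighted histogram of gradient orientations for a single-scale grayscale image. It is not the paper's descriptor: the function name `edge_orientation_histogram`, the central-difference gradient, the bin count and the magnitude threshold are all assumptions made for this example, and the multi-scale analysis and SVM stage are left out.

```python
import math


def edge_orientation_histogram(gray, bins=8, min_magnitude=1e-6):
    """Magnitude-weighted histogram of gradient orientations (hypothetical helper).

    gray: 2D list of intensities (rows of equal length).
    Orientations are folded into [0, pi), so opposite gradient
    directions fall into the same bin.
    """
    rows, cols = len(gray), len(gray[0])
    hist = [0.0] * bins
    # Central differences on interior pixels only (borders are skipped).
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = (gray[r][c + 1] - gray[r][c - 1]) / 2.0
            gy = (gray[r + 1][c] - gray[r - 1][c]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude <= min_magnitude:
                continue  # flat pixel: carries no orientation information
            theta = math.atan2(gy, gx) % math.pi
            index = min(int(theta / math.pi * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    # Normalise so that images of different sizes are comparable.
    return [h / total for h in hist] if total > 0 else hist


# A vertical step edge: all gradient energy points along the x axis.
vertical_step = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(8)]
descriptor = edge_orientation_histogram(vertical_step)
```

A vector like `descriptor`, computed per image (or per scale and concatenated), is the kind of fixed-length input that a binary SVM could then be trained on to separate building from non-building photographs.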
Pattern classification approaches for breast cancer identification via MRI: state‐of‐the‐art and vision for the future
Mining algorithms for Dynamic Contrast Enhanced Magnetic Resonance Imaging (DCE‐MRI)
of breast tissue are discussed. The algorithms are based on recent advances in multidimensional
signal processing and aim to advance current state‐of‐the‐art computer‐aided detection
and analysis of breast tumours when these are observed at various states of development. The topics
discussed include image feature extraction, information fusion using radiomics, multi‐parametric
computer‐aided classification and diagnosis using information fusion of tensorial datasets as well
as Clifford algebra based classification approaches and convolutional neural network deep learning
methodologies. The discussion also extends to semi‐supervised deep learning and self‐supervised
strategies as well as generative adversarial networks and algorithms using generated
confrontational learning approaches. In order to address the problem of weakly labelled tumour
images, generative adversarial deep learning strategies are considered for the classification of
different tumour types. The proposed data fusion approaches provide a novel Artificial Intelligence
(AI) based framework for more robust image registration that can potentially advance the early
identification of heterogeneous tumour types, even when the associated imaged organs are
registered as separate entities embedded in more complex geometric spaces. Finally, the general
structure of a high‐dimensional medical imaging analysis platform that is based on multi‐task
detection and learning is proposed as a way forward. The proposed algorithm makes use of novel
loss functions that form the building blocks for a generated confrontation learning methodology
that can be used for tensorial DCE‐MRI. Since some of the approaches discussed are also based on
time‐lapse imaging, conclusions on the rate of proliferation of the disease become possible. The
proposed framework can potentially reduce the costs associated with the interpretation of medical
images by providing automated, faster and more consistent diagnosis
Highly efficient low-level feature extraction for video representation and retrieval.
PhD
Witnessing the omnipresence of digital video media, the research community has
raised the question of its meaningful use and management. Stored in immense
multimedia databases, digital videos need to be retrieved and structured in an
intelligent way, relying on the content and the rich semantics involved. Current
Content Based Video Indexing and Retrieval systems face the problem of the semantic
gap between the simplicity of the available visual features and the richness of user
semantics.
This work focuses on the issues of efficiency and scalability in video indexing and
retrieval to facilitate a video representation model capable of semantic annotation. A
highly efficient algorithm for temporal analysis and key-frame extraction is developed.
It is based on the prediction information extracted directly from the compressed domain
features and the robust scalable analysis in the temporal domain. Furthermore,
a hierarchical quantisation of the colour features in the descriptor space is presented.
Derived from the extracted set of low-level features, a video representation model that
enables semantic annotation and contextual genre classification is designed.
Results demonstrate the efficiency and robustness of the temporal analysis algorithm
that runs in real time maintaining the high precision and recall of the detection task.
Adaptive key-frame extraction and summarisation achieve a good overview of the
visual content, while the colour quantisation algorithm efficiently creates a hierarchical
set of descriptors. Finally, the video representation model, supported by the genre
classification algorithm, achieves excellent results in an automatic annotation system by
linking the video clips with a limited lexicon of related keywords
Image synthesis based on a model of human vision
Modern computer graphics systems are able to construct renderings of such high quality that viewers are deceived into regarding the images as coming from a photographic source. Large amounts of computing resources are expended in this rendering process, using complex mathematical models of lighting and shading.
However, psychophysical experiments have revealed that viewers only regard certain informative regions within a presented image. Furthermore, it has been shown that these visually important regions contain low-level visual feature differences that attract the attention of the viewer.
This thesis will present a new approach to image synthesis that exploits these experimental findings by modulating the spatial quality of image regions by their visual importance. Efficiency gains are therefore reaped, without sacrificing much of the perceived quality of the image. Two tasks must be undertaken to achieve this goal. Firstly, the design of an appropriate region-based model of visual importance, and secondly, the modification of progressive rendering techniques to effect an importance-based rendering approach.
A rule-based fuzzy logic model is presented that computes, using spatial feature differences, the relative visual importance of regions in an image. This model improves upon previous work by incorporating threshold effects induced by global feature difference distributions and by using texture concentration measures.
A modified approach to progressive ray-tracing is also presented. This new approach uses the visual importance model to guide the progressive refinement of an image. In addition, this concept of visual importance has been incorporated into supersampling, texture mapping and computer animation techniques. Experimental results are presented, illustrating the efficiency gains reaped from using this method of progressive rendering.
This visual importance-based rendering approach is expected to have applications in the entertainment industry, where image fidelity may be sacrificed for efficiency purposes, as long as the overall visual impression of the scene is maintained. Different aspects of the approach should find many other applications in image compression, image retrieval, progressive data transmission and active robotic vision
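One simple way to picture importance-modulated rendering quality is to split a fixed sample budget across image regions in proportion to their importance scores. The sketch below does only that; it is an assumption-laden illustration, not the thesis's fuzzy logic model or its progressive ray-tracing scheme. The function name `allocate_samples`, the per-region minimum and the largest-remainder rounding are all choices made for this example.

```python
def allocate_samples(importance, total_samples, min_per_region=1):
    """Split a sample budget across regions in proportion to importance.

    importance: non-negative scores, one per region (hypothetical input,
    e.g. the output of some visual importance model).
    Every region gets at least min_per_region samples; the rest is shared
    proportionally, with largest-remainder rounding so the counts sum
    exactly to total_samples.
    """
    n = len(importance)
    if any(w < 0 for w in importance):
        raise ValueError("importance scores must be non-negative")
    if total_samples < n * min_per_region:
        raise ValueError("budget too small for the per-region minimum")

    budget = total_samples - n * min_per_region
    total_weight = sum(importance)
    if total_weight == 0:
        shares = [budget / n] * n  # no information: spread evenly
    else:
        shares = [budget * w / total_weight for w in importance]

    counts = [min_per_region + int(s) for s in shares]
    leftover = total_samples - sum(counts)
    # Hand out the remaining samples to the largest fractional parts.
    by_remainder = sorted(range(n), key=lambda i: shares[i] - int(shares[i]),
                          reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


# Three regions with importance 5 : 3 : 2 and a budget of 103 samples.
samples = allocate_samples([5, 3, 2], 103)
```

In a progressive renderer, counts like these could decide how many refinement passes or rays each region receives before the frame is shown, so that low-importance regions are the ones rendered more coarsely.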
Image Quality Improvement of Medical Images using Deep Learning for Computer-aided Diagnosis
Retina image analysis is an important screening tool for early detection of multiple
diseases such as diabetic retinopathy, which greatly impairs visual function. Image
analysis and pathology detection can be accomplished both by ophthalmologists and by the
use of computer-aided diagnosis systems. Advancements in hardware technology led to
more portable and less expensive imaging devices for medical image acquisition. This
promotes large scale remote diagnosis by clinicians as well as the implementation of
computer-aided diagnosis systems for local routine disease screening. However,
lower-cost equipment generally results in inferior quality images. This may jeopardize the
reliability of the acquired images and thus hinder the overall performance of the
diagnostic tool. To solve this open challenge, we carried out an in-depth study on using different
deep learning-based frameworks for improving retina image quality while maintaining
the underlying morphological information for the diagnosis. Our results demonstrate
that using a Cycle Generative Adversarial Network for unpaired image-to-image
translation leads to successful transformations of retina images from a low- to a high-quality
domain. The visual evidence of this improvement was quantitatively affirmed by the two
proposed validation methods. The first used a retina image quality classifier to confirm a
significant prediction label shift towards a quality enhancement. On average, a 50% increase
of images being classified as high-quality was verified. The second analysed the
performance modifications of a diabetic retinopathy detection algorithm upon being trained
with the quality-improved images. The latter led to strong evidence that the proposed
solution satisfies the requirement of maintaining the images’ original information for
diagnosis, and that it assures a pathology-assessment more sensitive to the presence of
pathological signs. These experimental results confirm the potential effectiveness of our
solution in improving retina image quality for diagnosis. Along with the addressed
contributions, we analysed how the construction of the data sets representing the low-quality
domain impacts the quality translation efficiency. Our findings suggest that by tackling
the problem more selectively, that is, constructing data sets more homogeneous in terms
of their image defects, we can obtain more accentuated quality transformations