6,086 research outputs found
Face analysis using curve edge maps
This paper proposes an automatic, real-time system for face analysis, usable in visual communication applications. In this approach, faces are represented with Curve Edge Maps: collections of polynomial segments with a convex region, extracted from edge pixels using an adaptive incremental linear-time fitting algorithm based on constructive polynomial fitting. The face analysis system covers face tracking, face recognition and facial feature detection, using Curve Edge Maps driven by histograms of intensities and histograms of relative positions. When applied to different face databases and video sequences, the average face recognition rate is 95.51%, the average facial feature detection rate is 91.92%, and the mean facial feature localization error is 2.18% of the face size, which is comparable with or better than results in the literature. Moreover, our method has the advantages of simplicity, real-time performance and extensibility to other aspects of face analysis, such as recognition of facial expressions and talking faces.
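As a toy illustration of the segment representation (a sketch, not the paper's adaptive incremental algorithm), a polynomial segment can be least-squares fitted to a run of edge pixels; in an incremental fitter, a growing fitting error is the cue to close the current segment and start a new one:

```python
import numpy as np

# Hypothetical sketch: fit one polynomial segment to a run of edge
# pixels and report the worst-case fitting error. An incremental
# fitter would extend the run until this error exceeds a threshold.
def fit_segment(xs, ys, degree=2):
    """Least-squares polynomial fit; returns coefficients and max error."""
    coeffs = np.polyfit(xs, ys, degree)
    err = np.max(np.abs(np.polyval(coeffs, xs) - ys))
    return coeffs, err

xs = np.linspace(0.0, 1.0, 20)
ys = 3 * xs**2 - 2 * xs + 1           # synthetic "edge pixel" curve
coeffs, err = fit_segment(xs, ys)
print(coeffs, err)                    # coefficients near [3, -2, 1], tiny error
```

The function name and the threshold policy are illustrative assumptions; the paper's constructive polynomial fitting achieves linear time per segment.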
ANSIG - An Analytic Signature for Arbitrary 2D Shapes (or Bags of Unlabeled Points)
In image analysis, many tasks require representing two-dimensional (2D)
shape, often specified by a set of 2D points, for comparison purposes. The
challenge of the representation is that it must not only capture the
characteristics of the shape but also be invariant to relevant transformations.
Invariance to geometric transformations, such as translation, rotation, and
scale, has received attention in the past, usually under the assumption that
the points are previously labeled, i.e., that the shape is characterized by an
ordered set of landmarks. However, in many practical scenarios, the points
describing the shape are obtained from automatic processes, e.g., edge or
corner detection, thus without labels or natural ordering. Obviously, the
combinatorial problem of computing the correspondences between the points of
two shapes in the presence of the aforementioned geometrical distortions
becomes a quagmire when the number of points is large. We circumvent this
problem by representing shapes in a way that is invariant to the permutation of
the landmarks, i.e., we represent bags of unlabeled 2D points. Within our
framework, a shape is mapped to an analytic function on the complex plane,
leading to what we call its analytic signature (ANSIG). To store an ANSIG, it
suffices to sample it along a closed contour in the complex plane. We show that
the ANSIG is a maximal invariant with respect to the permutation group, i.e.,
that different shapes have different ANSIGs and shapes that differ by a
permutation (or re-labeling) of the landmarks have the same ANSIG. We further
show how easy it is to factor out geometric transformations when comparing
shapes using the ANSIG representation. Finally, we illustrate these
capabilities with shape-based image classification experiments.
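A minimal permutation-invariant map in the same spirit can be sketched in a few lines. The function below is an illustrative stand-in, not the paper's exact ANSIG definition: it maps a bag of 2D points (as complex numbers) to samples of an analytic function along the unit circle, and is unchanged by any relabeling of the points.

```python
import numpy as np

# Illustrative permutation-invariant shape signature (a sketch in the
# spirit of ANSIG, not necessarily the paper's exact definition).
def signature(points, n_samples=64):
    """Sample f(xi) = mean_k exp(z_k * xi) on the unit circle."""
    z = np.asarray(points, dtype=complex)
    xi = np.exp(2j * np.pi * np.arange(n_samples) / n_samples)
    return np.exp(np.outer(xi, z)).mean(axis=1)

pts = np.array([1 + 2j, -0.5 + 0.3j, 2 - 1j])
sig1 = signature(pts)
sig2 = signature(pts[[2, 0, 1]])      # same bag, relabeled points
print(np.allclose(sig1, sig2))        # True: invariant to permutation
```

Because the map is a mean over the bag, any reordering of the points produces the identical signature, which is the property the abstract calls permutation invariance.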
AMAT: Medial Axis Transform for Natural Images
We introduce Appearance-MAT (AMAT), a generalization of the medial axis
transform for natural images, that is framed as a weighted geometric set cover
problem. We make the following contributions: i) we extend previous medial
point detection methods for color images, by associating each medial point with
a local scale; ii) inspired by the invertibility property of the binary MAT, we
also associate each medial point with a local encoding that allows us to invert
the AMAT, reconstructing the input image; iii) we describe a clustering scheme
that takes advantage of the additional scale and appearance information to
group individual points into medial branches, providing a shape decomposition
of the underlying image regions. In our experiments, we show state-of-the-art
performance in medial point detection on Berkeley Medial AXes (BMAX500), a new
dataset of medial axes based on the BSDS500 database, and good generalization
on the SK506 and WH-SYMMAX datasets. We also measure the quality of
reconstructed images from BMAX500, obtained by inverting their computed AMAT.
Our approach delivers significantly better reconstruction quality with respect
to three baselines, using just 10% of the image pixels. Our code and
annotations are available at https://github.com/tsogkas/amat . Comment: 10 pages (including references), 5 figures, accepted at ICCV 2017.
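The invertibility property of the binary MAT that the AMAT generalizes can be illustrated on a toy mask: storing a per-point scale (distance to the background) is enough to repaint the shape as a union of disks. This brute-force sketch is for tiny masks only:

```python
import numpy as np

# Toy illustration of binary MAT invertibility: a distance (scale) per
# foreground point lets us reconstruct the shape exactly as a union of
# open disks. Brute force, suitable only for very small masks.
def distance_to_background(mask):
    ys, xs = np.nonzero(~mask)
    bg = np.stack([ys, xs], axis=1).astype(float)
    d = np.zeros(mask.shape)
    for y, x in zip(*np.nonzero(mask)):
        d[y, x] = np.sqrt(((bg - (y, x)) ** 2).sum(axis=1)).min()
    return d

def reconstruct(d, shape):
    out = np.zeros(shape, dtype=bool)
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    for y, x in zip(*np.nonzero(d)):
        out |= (yy - y) ** 2 + (xx - x) ** 2 < d[y, x] ** 2
    return out

mask = np.zeros((9, 9), dtype=bool)
mask[2:7, 2:7] = True                  # a small square "shape"
d = distance_to_background(mask)
print(np.array_equal(reconstruct(d, mask.shape), mask))   # True
```

The AMAT replaces the binary disks with local appearance encodings so that the same invert-and-repaint idea reconstructs a natural image rather than a silhouette.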
A Novel Framework for Highlight Reflectance Transformation Imaging
We propose a novel pipeline and related software tools for processing the multi-light image collections (MLICs) acquired in different application contexts, to obtain shape and appearance information of captured surfaces and to derive compact relightable representations of them. Our pipeline extends the popular Highlight Reflectance Transformation Imaging (H-RTI) framework, which is widely used in the Cultural Heritage domain. In particular, we support perspective camera modeling, per-pixel interpolated light direction estimation, and light normalization that corrects vignetting and uneven non-directional illumination. Furthermore, we propose two novel easy-to-use software tools to simplify all processing steps. The tools, in addition to supporting easy processing and encoding of pixel data, implement a variety of visualizations, as well as multiple reflectance-model-fitting options. Experimental tests on synthetic and real-world MLICs demonstrate the usefulness of the novel algorithmic framework and the potential benefits of the proposed tools for end-user applications. Funding: European Union Horizon 2020 (action H2020-EU.3.6.3 - Reflective societies - cultural heritage and European identity; project Scan4Reco, grant 665091); DSURF project (PRIN 2015) funded by the Italian Ministry of University and Research; Sardinian Regional Authorities under projects VIGEC and Vis&VideoLa.
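As an illustration of the per-pixel reflectance-model fitting step in such pipelines, the classic six-term Polynomial Texture Map (PTM) model can be fitted by least squares. The light directions and pixel values below are synthetic, and this is one of several fitting options such tools may offer, not the paper's specific implementation:

```python
import numpy as np

# Sketch of per-pixel PTM fitting: intensity as a biquadratic function
# of the projected light direction (u, v), solved by least squares.
def fit_ptm(light_uv, intensities):
    """Fit L(u,v) = a0*u^2 + a1*v^2 + a2*u*v + a3*u + a4*v + a5."""
    u, v = light_uv[:, 0], light_uv[:, 1]
    A = np.stack([u * u, v * v, u * v, u, v, np.ones_like(u)], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, intensities, rcond=None)
    return coeffs

def relight(coeffs, u, v):
    """Evaluate the fitted model for a new light direction."""
    return coeffs @ np.array([u * u, v * v, u * v, u, v, 1.0])

rng = np.random.default_rng(0)
lights = rng.uniform(-1.0, 1.0, size=(20, 2))       # (u, v) per photo
true = np.array([0.1, -0.2, 0.05, 0.4, 0.3, 0.6])   # hypothetical pixel
obs = np.array([relight(true, u, v) for u, v in lights])
coeffs = fit_ptm(lights, obs)
print(np.allclose(coeffs, true))   # True: noiseless data recovers the model
```

Repeating this fit per pixel yields the compact relightable representation the abstract mentions: six coefficients per pixel instead of one value per captured light.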
Image Colour Segmentation by Genetic Algorithms
Segmentation of a colour image composed of different kinds of texture regions can be a hard problem, namely computing exact texture fields and deciding the optimum number of segmentation areas when the image contains similar and/or non-stationary texture fields. In this work, a method is described for evolving adaptive procedures for these problems. In many real-world applications, data clustering is a fundamental issue whenever behavioural or feature domains can be mapped into topological domains. We formulate the segmentation of such images as an optimisation problem and adopt the evolutionary strategy of Genetic Algorithms for the clustering of small regions in colour feature space. The present approach embeds the k-Means unsupervised clustering method into a Genetic Algorithm, which guides the evolutionary search for an optimal or sub-optimal data partition, a task that requires a non-trivial search because of its intrinsic NP-complete nature. The appropriate genetic coding is also discussed, since this is a key aspect of the implementation. Our purpose is to demonstrate the efficiency of Genetic Algorithms for automatic and unsupervised texture segmentation. Examples in colour maps, ornamental stones and human skin mark segmentation are presented, and overall results are discussed. KEYWORDS: Genetic Algorithms, Colour Image Segmentation, Classification, Clustering. Comment: 5 pages, 1 figure. Author page: http://alfa.ist.utl.pt/~cvrm/staff/vramos/ref_26.htm
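The k-Means building block that the Genetic Algorithm guides can be sketched in plain numpy. The deterministic farthest-point initialisation below is an assumption made for this toy example; in the paper, the evolutionary search explores initialisations and partitions:

```python
import numpy as np

# Toy k-Means in colour feature space (the clustering step a GA would
# steer). Farthest-point initialisation keeps this sketch deterministic.
def kmeans(X, k, iters=20):
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers) ** 2).sum(axis=2), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# two well-separated "colour" clusters (RGB-like 3-vectors)
X = np.vstack([np.full((50, 3), 10.0), np.full((50, 3), 200.0)])
labels, centers = kmeans(X, 2)
print(labels[0] != labels[50])   # True: the two colour groups separate
```

A GA wrapper would encode candidate partitions (or centre sets) as chromosomes and use a clustering-quality score as the fitness, which is where the NP-complete search effort is spent.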
Entropy-difference based stereo error detection
Stereo depth estimation is error-prone; hence, effective error detection
methods are desirable. Most such existing methods depend on characteristics of
the stereo matching cost curve, making them unduly dependent on functional
details of the matching algorithm. As a remedy, we propose a novel error
detection approach based solely on the input image and its depth map. Our
assumption is that the entropy at any point of an image will be significantly
higher than the entropy at the corresponding point of its depth map. In
this paper, we propose a confidence measure, Entropy-Difference (ED) for stereo
depth estimates and a binary classification method to identify incorrect
depths. Experiments on the Middlebury dataset show the effectiveness of our
method. Our proposed stereo confidence measure outperforms 17 existing measures
in all aspects except occlusion detection. Established metrics such as
precision, accuracy, recall, and area-under-curve are used to demonstrate the
effectiveness of our method.
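The underlying intuition can be sketched directly: a textured image patch carries noticeably more histogram entropy than the corresponding smooth depth patch, so a small difference hints at an unreliable estimate. The window size and binning below are assumptions, not the paper's settings:

```python
import numpy as np

# Sketch of the Entropy-Difference (ED) idea: compare Shannon entropy
# of an image patch with that of the corresponding depth patch.
# Patch size (9x9) and 16-bin histogram are illustrative assumptions.
def patch_entropy(patch, bins=16):
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return -(p * np.log2(p)).sum()

rng = np.random.default_rng(1)
img_patch = rng.integers(0, 256, (9, 9))   # textured image region
depth_patch = np.full((9, 9), 40)          # smooth depth region
ed = patch_entropy(img_patch) - patch_entropy(depth_patch)
print(ed > 0)   # True: image entropy exceeds depth entropy
```

Thresholding such a difference map is one simple way to turn the confidence measure into the binary correct/incorrect classification the abstract describes.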
A Methodological Review of Visual Road Recognition Procedures for Autonomous Driving Applications
The current research interest in autonomous driving is growing at a rapid
pace, attracting great investments from both the academic and corporate
sectors. In order for vehicles to be fully autonomous, it is imperative that
the driver assistance system is adept at road and lane keeping. In this paper,
we present a methodological review of techniques with a focus on visual road
detection and recognition. We adopt a pragmatic outlook in presenting this
review, whereby the procedures of road recognition are emphasised with respect
to their practical implementations. The contribution of this review hence covers
the topic in two parts -- the first part describes the methodological approach
to conventional road detection, which covers the algorithms and approaches
involved to classify and segregate roads from non-road regions; and the other
part focuses on recent state-of-the-art machine learning techniques that are
applied to visual road recognition, with an emphasis on methods that
incorporate convolutional neural networks and semantic segmentation. A
subsequent overview of recent implementations in the commercial sector is also
presented, along with some recent research works pertaining to road detection. Comment: 14 pages, 6 figures, 2 tables. Permission to reprint granted by the original figure author.
3D Face Synthesis with KINECT
This work describes the process of face synthesis by image morphing from less expensive 3D sensors, such as KINECT, that are prone to sensor noise. Its main aim is to create a useful face database for future face recognition studies.
Learning to Refine Object Contours with a Top-Down Fully Convolutional Encoder-Decoder Network
We develop a novel deep contour detection algorithm with a top-down fully
convolutional encoder-decoder network. Our proposed method, named TD-CEDN,
solves two important issues in this low-level vision problem: (1) learning
multi-scale and multi-level features; and (2) applying an effective top-down
refined approach in the networks. TD-CEDN performs the pixel-wise prediction by
means of leveraging features at all layers of the net. Unlike skip connections
and previous encoder-decoder methods, we first learn a coarse feature map after
the encoder stage in a feedforward pass, and then refine this feature map in a
top-down strategy during the decoder stage utilizing features at successively
lower layers. The deconvolutional process is therefore conducted stepwise, guided by a deeply-supervised net that provides integrated direct supervision. The proposed techniques lead to a more precise and clearer prediction. Our algorithm achieves state-of-the-art results on the BSDS500 dataset (ODS F-score of 0.788), the PASCAL VOC2012 dataset (ODS F-score of 0.588), and the NYU Depth dataset (ODS F-score of 0.735). Comment: 12 pages, 13 figures
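The top-down refinement idea can be caricatured in a few lines of numpy: upsample a coarse decoder map and blend it with a higher-resolution feature from a lower encoder layer. TD-CEDN of course uses learned deconvolutions and deep supervision rather than this fixed blend; the weighting below is purely illustrative:

```python
import numpy as np

# Caricature of top-down refinement: nearest-neighbour upsampling of a
# coarse map, blended with a finer lower-layer feature map. The 0.5/0.5
# blend is an assumption; real networks learn these operations.
def upsample2x(fmap):
    return fmap.repeat(2, axis=0).repeat(2, axis=1)

def refine(coarse, lateral, w=0.5):
    """Blend an upsampled coarse prediction with a finer feature map."""
    return w * upsample2x(coarse) + (1 - w) * lateral

coarse = np.ones((2, 2))     # low-resolution decoder output
lateral = np.zeros((4, 4))   # higher-resolution encoder feature
print(refine(coarse, lateral).shape)   # (4, 4)
```

Repeating this step at successively lower layers is the stepwise deconvolutional process the abstract describes.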
Multi-radial LBP Features as a Tool for Rapid Glomerular Detection and Assessment in Whole Slide Histopathology Images
We demonstrate a simple and effective automated method for the segmentation
of glomeruli from large (~1 gigapixel) histopathological whole-slide images
(WSIs) of thin renal tissue sections and biopsies, using an adaptation of the
well-known local binary patterns (LBP) image feature vector to train a support
vector machine (SVM) model. Our method offers high precision (>90%) and
reasonable recall (>70%) for glomeruli from WSIs, is readily adaptable to
glomeruli from multiple species, including mouse, rat, and human, and is robust
to diverse slide staining methods. Using 5 Intel(R) Core(TM) i7-4790 CPUs with
40 GB RAM, our method typically requires ~15 sec for training and ~2 min to
extract glomeruli reproducibly from a WSI. Deploying a deep convolutional
neural network trained for glomerular recognition in tandem with the SVM
suffices to reduce false positives to below 3%. We also apply our LBP-based
descriptor to successfully detect pathologic changes in a mouse model of
diabetic nephropathy. We envision potential clinical and laboratory
applications for this approach in the study and diagnosis of glomerular
disease, and as a means of greatly accelerating the construction of feature
sets to fuel deep learning studies into tissue structure and pathology. Comment: 14 pages, 6 figures. Added scale bars and, for Fig. 3b, a clearer example of medulla removal.
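A basic 3x3 LBP code, the building block that the multi-radial descriptor extends, can be computed with a few shifted comparisons; the histogram of codes is the texture feature vector fed to the SVM. This is the textbook single-radius LBP, not the paper's multi-radial variant:

```python
import numpy as np

# Basic 3x3 local binary patterns: threshold the 8 neighbours of each
# interior pixel against the centre and pack the bits into one code.
def lbp_codes(img):
    c = img[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=int)
    for bit, (dy, dx) in enumerate(offsets):
        nb = img[1 + dy:img.shape[0] - 1 + dy,
                 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(int) << bit
    return code

def lbp_histogram(img):
    """Histogram of codes: the texture descriptor for a classifier."""
    return np.bincount(lbp_codes(img).ravel(), minlength=256)

img = np.array([[5, 5, 5], [5, 1, 5], [5, 5, 5]])
print(lbp_codes(img))   # [[255]]: every neighbour >= the centre pixel
```

Concatenating such histograms over several radii would approximate the multi-radial descriptor; the SVM then classifies each candidate region as glomerulus or background.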