Detection of curved lines with B-COSFIRE filters: A case study on crack delineation
The detection of curvilinear structures is an important step for various
computer vision applications, ranging from medical image analysis for
segmentation of blood vessels, to remote sensing for the identification of
roads and rivers, and to biometrics and robotics, among others. The visual
system of the brain has remarkable abilities to detect curvilinear structures
in noisy images. This is a nontrivial task especially for the detection of thin
or incomplete curvilinear structures surrounded with noise. We propose a
general purpose curvilinear structure detector that uses the brain-inspired
trainable B-COSFIRE filters. It consists of four main steps, namely nonlinear
filtering with B-COSFIRE, thinning with non-maximum suppression, hysteresis
thresholding and morphological closing. We demonstrate its effectiveness on a
data set of noisy images with cracked pavements, where we achieve
state-of-the-art results (F-measure=0.865). The proposed method can be employed
in any computer vision methodology that requires the delineation of curvilinear
and elongated structures.
Comment: Accepted at Computer Analysis of Images and Patterns (CAIP) 201
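Of the four steps listed in this abstract, hysteresis thresholding is a standard operation that can be illustrated without the B-COSFIRE filter itself. The following is a minimal sketch, not the authors' implementation: the function name, the nested-list image representation, the 8-connectivity, and the threshold values are all assumptions made for illustration.

```python
from collections import deque


def hysteresis_threshold(response, low, high):
    """Keep pixels >= high, plus pixels >= low that are 8-connected to them.

    `response` is a hypothetical 2D filter response map given as a list of
    equally long rows; `low` and `high` are illustrative thresholds.
    """
    rows, cols = len(response), len(response[0])
    keep = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Seed the search with "strong" pixels (at or above the high threshold).
    for r in range(rows):
        for c in range(cols):
            if response[r][c] >= high:
                keep[r][c] = True
                queue.append((r, c))

    # Grow from the seeds through "weak" pixels (at or above the low threshold).
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not keep[nr][nc] and response[nr][nc] >= low:
                    keep[nr][nc] = True
                    queue.append((nr, nc))
    return keep
```

In this sketch a weak pixel survives only if a chain of weak pixels links it to a strong one, which is what lets thin, faint segments of a curvilinear structure be retained while isolated noise responses are discarded.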
Cell-based approach for 3D reconstruction from incomplete silhouettes
Shape-from-silhouettes is a widely adopted approach to computing accurate 3D reconstructions of people or objects in a multi-camera environment. However, such algorithms are traditionally very sensitive to errors in the silhouettes caused by imperfect foreground-background estimation or by occluding objects appearing in front of the object of interest. We propose a novel algorithm that can still provide high-quality reconstructions from incomplete silhouettes. At the core of the method is the partitioning of the reconstruction space into cells, i.e. regions with uniform camera and silhouette coverage properties. A set of rules is proposed to iteratively add cells to the reconstruction based on their potential to explain discrepancies between silhouettes in different cameras. Experimental analysis shows significantly improved F1-scores over standard leave-M-out reconstruction techniques.
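The abstract reports results as F1-scores but does not say how they are computed. As a rough sketch, assuming the reconstruction and the ground truth are both given as sets of occupied voxel coordinates (an assumption, as is the function name), the score could be computed like this:

```python
def f1_score(predicted, reference):
    """Harmonic mean of precision and recall between two sets of items.

    Here the items are hypothetical (x, y, z) voxel coordinates; any
    hashable element would work.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    if true_positives == 0:
        # Covers empty inputs and disjoint sets without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)
```

A reconstruction with missing parts lowers recall, and one with spurious volume lowers precision, so the F1-score penalizes both kinds of error.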
Map Generation from Large Scale Incomplete and Inaccurate Data Labels
Accurately and globally mapping human infrastructure is an important and
challenging task with applications in routing, regulation compliance
monitoring, natural disaster response management, etc. In this paper we
present progress in developing an algorithmic pipeline and distributed compute
system that automates the process of map creation using high resolution aerial
images. Unlike previous studies, most of which use datasets that are available
only in a few cities across the world, we utilize publicly available imagery
and map data, both of which cover the contiguous United States (CONUS). We
approach the technical challenge of inaccurate and incomplete training data by
adopting state-of-the-art convolutional neural network architectures such as
the U-Net and the CycleGAN to incrementally generate maps with increasingly
accurate and complete labels of man-made infrastructure such as roads
and houses. Since scaling the mapping task to CONUS calls for parallelization,
we then adopted an asynchronous distributed stochastic parallel gradient
descent training scheme to distribute the computational workload onto a cluster
of GPUs with nearly linear speed-up.
Comment: This paper is accepted by KDD 202
Finding Faces in Cluttered Scenes using Random Labeled Graph Matching
An algorithm for locating quasi-frontal views of human faces in cluttered scenes is presented. The algorithm works by coupling a set of local feature detectors with a statistical model of the mutual distances between facial features. It is invariant with respect to translation, rotation (in the plane), and scale, and can handle partial occlusions of the face. On a challenging database with complicated and varied backgrounds, the algorithm achieved a correct localization rate of 95% in images where the face appeared quasi-frontally.
Ventral-stream-like shape representation : from pixel intensity values to trainable object-selective COSFIRE models
Keywords: hierarchical representation, object recognition, shape, ventral stream, vision and scene understanding, robotics, handwriting analysis
The remarkable abilities of the primate visual system have inspired the construction of computational models of some visual neurons. We propose a trainable hierarchical object recognition model, which we call S-COSFIRE (S stands for Shape and COSFIRE stands for Combination Of Shifted FIlter REsponses), and use it to localize and recognize objects of interest embedded in complex scenes. It is inspired by the visual processing in the ventral stream (V1/V2 → V4 → TEO). Recognition and localization of objects embedded in complex scenes is important for many computer vision applications. Most existing methods require prior segmentation of the objects from the background, which in turn requires recognition.
An S-COSFIRE filter is automatically configured to be selective for an arrangement of contour-based features that belong to a prototype shape specified by an example. The configuration comprises selecting relevant vertex detectors and determining certain blur and shift parameters. The response is computed as the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. S-COSFIRE filters share similar properties with some neurons in inferotemporal cortex, which provided inspiration for this work.
We demonstrate the effectiveness of S-COSFIRE filters in two applications: letter and keyword spotting in handwritten manuscripts and object spotting in complex scenes for the computer vision system of a domestic robot.
S-COSFIRE filters are effective at recognizing and localizing (deformable) objects in images of complex scenes without requiring prior segmentation. They are versatile trainable shape detectors, conceptually simple and easy to implement. The presented hierarchical shape representation contributes to a better understanding of the brain and to more robust computer vision algorithms.
peer-reviewed
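The abstract above states that an S-COSFIRE response is the weighted geometric mean of the blurred and shifted responses of the selected vertex detectors. The sketch below illustrates only that combination step at a single image location. It assumes the blurred and shifted responses have already been computed and are non-negative, and that the weights are positive. The function name and the example values are hypothetical, and how the weights are chosen is not specified in the abstract.

```python
import math


def weighted_geometric_mean(responses, weights):
    """Combine non-negative responses r_i with positive weights w_i as
    (prod_i r_i ** w_i) ** (1 / sum_i w_i)."""
    if not responses or len(responses) != len(weights):
        raise ValueError("responses and weights must be non-empty and equal in length")
    if any(r < 0 for r in responses) or any(w <= 0 for w in weights):
        raise ValueError("responses must be non-negative and weights positive")
    if any(r == 0 for r in responses):
        # A single missing feature silences the whole response.
        return 0.0
    total_weight = sum(weights)
    # Work in log space for numerical stability.
    log_sum = sum(w * math.log(r) for r, w in zip(responses, weights))
    return math.exp(log_sum / total_weight)
```

Because the mean is geometric rather than arithmetic, the combined response is zero whenever any constituent response is zero, so in this sketch a response is produced only when all parts of the prototype arrangement are present.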