Connectivity-Enforcing Hough Transform for the Robust Extraction of Line Segments
Global voting schemes based on the Hough transform (HT) have been widely used
to robustly detect lines in images. However, since the votes do not take line
connectivity into account, these methods do not deal well with cluttered
images. In contrast, the so-called local methods enforce connectivity but
lack robustness to deal with challenging situations that occur in many
realistic scenarios, e.g., when line segments cross or when long segments are
corrupted. In this paper, we address the critical limitations of the HT as a
line segment extractor by incorporating connectivity in the voting process.
This is done by only accounting for the contributions of edge points lying in
increasingly larger neighborhoods and whose position and directional content
agree with potential line segments. As a result, our method, which we call
STRAIGHT (Segment exTRAction by connectivity-enforcInG HT), extracts the
longest connected segments in each location of the image, thus also integrating
into the HT voting process the usually separate step of individual segment
extraction. The usage of the Hough space mapping and a corresponding
hierarchical implementation make our approach computationally feasible. We
present experiments that illustrate, with synthetic and real images, how
STRAIGHT succeeds in extracting complete segments in several situations where
current methods fail. Comment: Submitted for publication
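The limitation described in this abstract, that global votes ignore connectivity, can be seen in a minimal sketch of the classical Hough transform. This is not the STRAIGHT method; the function name `hough_votes`, the bin sizes, and the sample points are assumptions chosen for illustration only.

```python
import math
from collections import Counter

def hough_votes(points, n_theta=180, rho_step=1.0):
    """Classical (rho, theta) Hough voting; every point votes for every angle."""
    acc = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Normal form of a line: rho = x*cos(theta) + y*sin(theta)
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(round(rho / rho_step), t)] += 1
    return acc

# Two collinear but disconnected runs of edge points on the line y = 5.
points = [(x, 5) for x in range(0, 10)] + [(x, 5) for x in range(40, 50)]
acc = hough_votes(points)
(rho_bin, theta_bin), votes = acc.most_common(1)[0]
print(rho_bin, theta_bin, votes)
```

Both runs vote into the same accumulator cell, so the peak alone cannot tell one long segment from two short ones separated by a gap. Separating them needs either a later segment-extraction step or connectivity built into the voting, which is what the abstract proposes.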
A smart environment for biometric capture
The development of large-scale biometric systems requires experiments to be performed on large amounts of data. Existing capture systems are designed for fixed experiments and are not easily scalable; in this scenario, even the addition of extra data is difficult. We developed a prototype biometric tunnel for the capture of non-contact biometrics. It is self-contained and autonomous. Such a configuration is ideal for building access or deployment in secure environments. The tunnel captures cropped images of the subject's face and performs a 3D reconstruction of the person's motion, which is used to extract gait information. The various parts of the system interact through an agent framework. The design of this system is a trade-off between parallel and serial processing due to various hardware bottlenecks. When tested on a small population, the extracted features were shown to be potent for recognition. We currently achieve a moderate throughput of approximately 15 subjects an hour and hope to improve this in the future as the prototype becomes more complete.
Camera distortion self-calibration using the plumb-line constraint and minimal Hough entropy
In this paper we present a simple and robust method for self-correction of
camera distortion using single images of scenes which contain straight lines.
Since the most common distortion can be modelled as radial distortion, we
illustrate the method using the Harris radial distortion model, but the method
is applicable to any distortion model. The method is based on transforming the
edgels of the distorted image to a 1-D angular Hough space, and optimizing the
distortion correction parameters which minimize the entropy of the
corresponding normalized histogram. Properly corrected imagery will have fewer
curved lines, and therefore less spread in Hough space. Since the method does
not rely on any image structure beyond the existence of edgels sharing some
common orientations and does not use edge fitting, it is applicable to a wide
variety of image types. For instance, it can be applied equally well to images
of texture with weak but dominant orientations, or images with strong vanishing
points. Finally, the method is evaluated on both synthetic and real data,
revealing that it is particularly robust to noise. Comment: 9 pages, 5 figures. Corrected errors in equation 1
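The entropy criterion mentioned in this abstract can be sketched as follows. This shows only the scoring step (entropy of a normalized 1-D orientation histogram), not the distortion model or the optimizer; the function name `orientation_entropy`, the bin count, and the synthetic angle lists are assumptions for illustration.

```python
import math

def orientation_entropy(angles, bins=36):
    """Shannon entropy (nats) of a normalized histogram of orientations in [0, pi)."""
    counts = [0] * bins
    for a in angles:
        a = a % math.pi  # edgel orientations are undirected
        idx = min(int(a / math.pi * bins), bins - 1)
        counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return 0.0
    entropy = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            entropy -= p * math.log(p)
    return entropy

# Straight lines: edgel orientations cluster at two dominant angles.
straight = [0.1] * 50 + [1.6] * 50
# Curved (distorted) lines: orientations drift along each line.
curved = ([0.1 + 0.02 * i for i in range(50)]
          + [1.6 + 0.02 * i for i in range(50)])

h_straight = orientation_entropy(straight)
h_curved = orientation_entropy(curved)
print(h_straight < h_curved)
```

Under these assumptions, a calibration routine would re-evaluate this score for each candidate set of distortion parameters and keep the parameters that give the lowest entropy.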
Semantic Cross-View Matching
Matching cross-view images is challenging because the appearance and
viewpoints are significantly different. While low-level features based on
gradient orientations or filter responses can drastically vary with such
changes in viewpoint, the semantic information of images stays invariant
in this respect. Consequently, semantically labeled regions can
be used for performing cross-view matching. In this paper, we therefore explore
this idea and propose an automatic method for detecting and representing the
semantic information of an RGB image with the goal of performing cross-view
matching with a (non-RGB) geographic information system (GIS). A segmented
image forms the input to our system with segments assigned to semantic concepts
such as traffic signs, lakes, roads, foliage, etc. We design a descriptor to
robustly capture both the presence of semantic concepts and the spatial layout
of those segments. Pairwise distances between the descriptors extracted from
the GIS map and the query image are then used to generate a shortlist of the
most promising locations with similar semantic concepts in a consistent spatial
layout. An experimental evaluation with challenging query images and a large
urban area shows promising results.
Two View Line-Based Motion and Structure Estimation for Planar Scenes
We present an algorithm for the reconstruction of piecewise planar scenes from only two views, based on a minimal number of line correspondences. We first recover the camera rotation by matching vanishing points using methods that already exist in the literature, and then recover the camera translation by searching among a family of hypothesized planes passing through one line. Unlike algorithms based on line segments, the presented algorithm does not require an overlap between two line segments, or more than one line correspondence across more than two views, to recover the translation; it achieves this by exploiting photometric constraints of the surface around the line. Experimental results on real images demonstrate the functionality of the algorithm.
AUTOMATIC IMAGE TO MODEL ALIGNMENT FOR PHOTO-REALISTIC URBAN MODEL RECONSTRUCTION
We introduce a hybrid approach in which images of an urban scene are automatically aligned with a base geometry of the scene to determine model-relative external camera parameters. The algorithm takes as input a model of the scene and images with approximate external camera parameters and aligns the images to the model by extracting the facades from the images and aligning the facades with the model by minimizing over a multivariate objective function. The resulting image-pose pairs can be used to render photo-realistic views of the model via texture mapping. Several natural extensions to the base hybrid reconstruction technique are also introduced. These extensions, which include vanishing point based calibration refinement and video stream based reconstruction, increase the accuracy of the base algorithm, reduce the amount of data that must be provided by the user as input to the algorithm, and provide a mechanism for automatically calibrating a large set of images for post processing steps such as automatic model enhancement and fly-through model visualization. Traditionally, photo-realistic urban reconstruction has been approached from purely image-based or model-based approaches. Recently, research has been conducted on hybrid approaches, which combine the use of images and models. Such approaches typically require user assistance for camera calibration. Our approach is an improvement over these methods because it does not require user assistance for camera calibration.