Shape measure for identifying perceptually informative parts of 3d objects
We propose a mathematical approach for quantifying the shape complexity of 3D surfaces based on perceptual principles of visual saliency. Our curvature variation measure (CVM), as a 3D feature, combines surface curvature and information theory by leveraging bandwidth-optimized kernel density estimators. Using a part-decomposition algorithm for digitized 3D objects, represented as triangle meshes, we apply our shape measure to transform the low-level mesh representation into a perceptually informative form. Further, we analyze the effects of noise, sensitivity to digitization, occlusions, and descriptiveness to demonstrate our shape measure on laser-scanned real-world 3D objects.
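The information-theoretic core of such a measure can be sketched numerically. The following is a minimal illustration, not the paper's actual method: given per-vertex curvature values, it estimates their density with a Gaussian kernel (Silverman's rule-of-thumb bandwidth standing in for the paper's bandwidth optimization) and scores a patch by the entropy of that estimate. The function name `curvature_entropy` is our own.

```python
import numpy as np

def curvature_entropy(curvatures, n_grid=256):
    """Entropy of a kernel density estimate over surface curvature values.

    A simplified sketch of the idea behind a curvature variation measure:
    regions whose curvature distribution carries more information (higher
    entropy) are treated as more perceptually salient. The paper optimizes
    the kernel bandwidth; here we use Silverman's rule of thumb.
    """
    k = np.asarray(curvatures, dtype=float)
    n = k.size
    # Silverman's rule-of-thumb bandwidth for a 1D Gaussian kernel.
    h = 1.06 * k.std(ddof=1) * n ** (-1 / 5)
    grid = np.linspace(k.min() - 3 * h, k.max() + 3 * h, n_grid)
    # Gaussian KDE evaluated on the grid.
    diffs = (grid[:, None] - k[None, :]) / h
    density = np.exp(-0.5 * diffs**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))
    # Differential entropy approximated by the Riemann sum of -p log p dx.
    dx = grid[1] - grid[0]
    p = density[density > 0]
    return float(-(p * np.log(p)).sum() * dx)

rng = np.random.default_rng(0)
# A near-flat patch (near-constant curvature) scores lower than a varied one.
flat = rng.normal(0.0, 0.01, 1000)
bumpy = rng.normal(0.0, 1.0, 1000)
print(curvature_entropy(flat) < curvature_entropy(bumpy))  # True
```

Higher entropy here simply means the curvature values are more spread out, which is the sense in which a region is "informative" under this kind of measure.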
Geometric and photometric affine invariant image registration
This thesis aims to present a solution to the correspondence problem for the registration
of wide-baseline images taken from uncalibrated cameras. We propose an affine
invariant descriptor that combines the geometry and photometry of the scene to find
correspondences between both views. The geometric affine invariant component of the
descriptor is based on the affine arc-length metric, whereas the photometry is analysed
by invariant colour moments. A graph structure represents the spatial distribution of the
primitive features; i.e. nodes correspond to detected high-curvature points, whereas arcs
represent connectivities by extracted contours. After matching, we refine the search for
correspondences by using a maximum likelihood robust algorithm. We have evaluated
the system over synthetic and real data. The method is susceptible to the
propagation of errors introduced by approximations in the system.
BAE Systems; Selex Sensors and Airborne Systems
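The geometric component mentioned above rests on the affine arc length, whose element ds = |x'y'' − x''y'|^(1/3) dt is unchanged by area-preserving affine maps. A minimal numerical sketch of that invariance, using finite differences on a sampled contour (not the thesis's implementation):

```python
import numpy as np

def affine_arc_length(pts):
    """Total affine arc length of a sampled planar curve (pts: (N, 2)).

    Uses the element ds = |x' y'' - x'' y'|^(1/3) dt with finite-difference
    derivatives; a numerical sketch only.
    """
    x, y = pts[:, 0], pts[:, 1]
    dx, dy = np.gradient(x), np.gradient(y)
    ddx, ddy = np.gradient(dx), np.gradient(dy)
    return float((np.abs(dx * ddy - ddx * dy) ** (1 / 3)).sum())

t = np.linspace(0, 2 * np.pi, 2000)
circle = np.stack([np.cos(t), np.sin(t)], axis=1)
A = np.array([[2.0, 0.3], [0.1, 0.515]])       # det(A) = 1: area-preserving
mapped = circle @ A.T + np.array([5.0, -3.0])  # affine image of the circle

# The two lengths agree although the shapes differ drastically.
print(abs(affine_arc_length(circle) - affine_arc_length(mapped)) < 1e-6)  # True
```

Because finite differencing is linear, the invariance holds here to machine precision; on real extracted contours it holds only approximately, which is one source of the error propagation the abstract notes.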
Fast visual recognition of large object sets
Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 1990. Includes bibliographical references (leaves 117-123). By Michael Joseph Villalba.
Investigating human-perceptual properties of "shapes" using 3D shapes and 2D fonts
Shapes are generally used to convey meaning. They are used in video games, films and other multimedia, in diverse ways. 3D shapes may be destined for virtual scenes or represent objects to be constructed in the real world. Fonts add character to an otherwise plain block of text, allowing the writer to make important points more visually prominent or distinct from other text, and they can indicate the structure of a document at a glance. Rather than studying shapes through traditional geometric shape descriptors, we provide alternative methods to describe and analyse shapes through the lens of human perception. This is done via the concepts of Schelling Points and Image Specificity. Schelling Points are the choices people make when they aim to match what they expect others to choose but cannot communicate with others to determine an answer. We study whole-mesh selections in this setting, where Schelling Meshes are the most frequently selected shapes. The key idea behind Image Specificity is that different images evoke different descriptions, but ‘Specific’ images yield more consistent descriptions than others. We apply Specificity to 2D fonts. We show that each concept can be learned, and we predict it for fonts and 3D shapes, respectively, using a depth-image-based convolutional neural network. Results are shown for a range of fonts and 3D shapes, and we demonstrate that font Specificity and the Schelling Meshes concept are useful for visualisation, clustering, and search applications. Overall, we find that each concept captures similarities between its respective type of shape, even when there are discontinuities between the shape geometries themselves. The ‘context’ of these similarities lies in some kind of abstract or subjective meaning which is consistent among different people.
Local, Semi-Local and Global Models for Texture, Object and Scene Recognition
This dissertation addresses the problems of recognizing textures, objects, and scenes in photographs. We present approaches to these recognition tasks that combine salient local image features with spatial relations and effective discriminative learning techniques. First, we introduce a bag of features image model for recognizing textured surfaces under a wide range of transformations, including viewpoint changes and non-rigid deformations. We present results of a large-scale comparative evaluation indicating that bags of features can be effective not only for texture, but also for object categorization, even in the presence of substantial clutter and intra-class variation. We also show how to augment the purely local image representation with statistical co-occurrence relations between pairs of nearby features, and develop a learning and classification framework for the task of classifying individual features in a multi-texture image. Next, we present a more structured alternative to bags of features for object recognition, namely, an image representation based on semi-local parts, or groups of features characterized by stable appearance and geometric layout. Semi-local parts are automatically learned from small sets of unsegmented, cluttered images. Finally, we present a global method for recognizing scene categories that works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside each sub-region. The resulting spatial pyramid representation demonstrates significantly improved performance on challenging scene categorization tasks
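The pyramid construction in the last step admits a compact sketch. Below is a generic implementation of spatial-pyramid histograms over quantized local features, assuming feature coordinates are pre-normalized to [0, 1) and visual-word indices are already assigned; the helper name and the synthetic data are our own.

```python
import numpy as np

def spatial_pyramid(xy, words, n_words, levels=2):
    """Concatenated visual-word histograms over 1x1, 2x2, 4x4, ... grids.

    xy:    (N, 2) feature coordinates in [0, 1)
    words: (N,) visual-word index per feature
    Finer levels are weighted more, following the 1/2^(L-l+1) pyramid-match
    weights (level 0 shares the weight of level 1).
    """
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level
        weight = 1.0 / 2 ** levels if level == 0 else 1.0 / 2 ** (levels - level + 1)
        # Grid-cell index of each feature at this resolution.
        cx = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        cy = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        cell = cy * cells + cx
        for c in range(cells * cells):
            hist = np.bincount(words[cell == c], minlength=n_words)
            parts.append(weight * hist)
    return np.concatenate(parts)

rng = np.random.default_rng(0)
xy = rng.random((500, 2))
words = rng.integers(0, 50, 500)
vec = spatial_pyramid(xy, words, n_words=50, levels=2)
print(vec.shape)  # (1050,): 50 words x (1 + 4 + 16) cells
```

With these weights each feature contributes a total weight of 1 across levels, so the vector's sum equals the feature count; two such vectors are typically compared with a histogram-intersection kernel.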
Rotational Projection Statistics for 3D Local Surface Description and Object Recognition
Recognizing 3D objects in the presence of noise, varying mesh resolution,
occlusion and clutter is a very challenging task. This paper presents a novel
method named Rotational Projection Statistics (RoPS). It has three major
modules: Local Reference Frame (LRF) definition, RoPS feature description and
3D object recognition. We propose a novel technique to define the LRF by
calculating the scatter matrix of all points lying on the local surface. RoPS
feature descriptors are obtained by rotationally projecting the neighboring
points of a feature point onto 2D planes and calculating a set of statistics
(including low-order central moments and entropy) of the distribution of these
projected points. Using the proposed LRF and RoPS descriptor, we present a
hierarchical 3D object recognition algorithm. The performance of the proposed
LRF, RoPS descriptor and object recognition algorithm was rigorously tested on
a number of popular and publicly available datasets. Our proposed techniques
exhibited superior performance compared to existing techniques. We also showed
that our method is robust with respect to noise and varying mesh resolution.
Our RoPS based algorithm achieved recognition rates of 100%, 98.9%, 95.4% and
96.0% respectively when tested on the Bologna, UWA, Queen's and Ca' Foscari
Venezia Datasets.
Comment: The final publication is available at link.springer.com, International
Journal of Computer Vision, 201
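The LRF construction from the scatter matrix can be sketched as follows. This is a simplified stand-in for the RoPS LRF (the actual method weights point contributions and disambiguates axis signs more carefully); the function name and the synthetic patch are our own.

```python
import numpy as np

def local_reference_frame(neighbors, center):
    """Local reference frame from the scatter matrix of a local surface patch.

    The eigenvectors of the scatter matrix of points around a feature point
    give three repeatable axes; only a basic sign disambiguation is shown.
    """
    d = neighbors - center
    scatter = d.T @ d / len(neighbors)           # 3x3 scatter matrix
    eigvals, eigvecs = np.linalg.eigh(scatter)   # ascending eigenvalues
    axes = eigvecs[:, ::-1].T                    # rows: major, middle, minor axis
    # Orient each axis towards the majority of the neighbours (sign fix).
    for i in range(3):
        if (d @ axes[i]).sum() < 0:
            axes[i] = -axes[i]
    return axes

rng = np.random.default_rng(1)
# A noisy patch lying mostly in the xy-plane.
pts = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.05])
axes = local_reference_frame(pts, pts.mean(axis=0))
print(np.round(np.abs(axes[2]), 2))  # minor axis ~ surface normal ~ [0, 0, 1]
```

Once the LRF is fixed, the neighboring points can be expressed in its coordinates and rotationally projected onto 2D planes to accumulate the statistics the abstract describes.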
Surface analysis and fingerprint recognition from multi-light imaging collections
Multi-light imaging captures a scene from a fixed viewpoint through multiple photographs, each of which is illuminated from a different direction. Every image reveals information about the surface, with the intensity reflected from each point being measured for all lighting directions. The images captured are known as multi-light image collections (MLICs), for which a variety of techniques have been developed over recent decades to acquire information from the images. These techniques include shape from shading, photometric stereo and reflectance transformation imaging (RTI). Pixel coordinates from one image in an MLIC correspond to exactly the same position on the surface across all images in the MLIC, since the camera does not move.
In chapter 1 we review the literature relevant to the methods presented in this thesis, describe the different reflection and surface types, and explain the multi-light imaging process. In chapter 2 we present a novel automated RTI method which requires no calibration equipment (i.e. the shiny reference spheres or 3D-printed structures that other methods require) and which automatically computes the lighting direction and compensates for non-uniform illumination.
Then, in chapter 3, we describe our novel MLIC method, termed Remote Extraction of Latent Fingerprints (RELF), which segments each multi-light imaging photograph into superpixels (small groups of pixels) and uses a neural network classifier to determine whether or not each superpixel contains fingerprint. The RELF algorithm then mosaics the superpixels classified as fingerprint together to obtain a complete latent print image, entirely contactlessly.
In chapter 4 we detail our work with the Metropolitan Police Service (MPS) UK, who described their needs and requirements to us, which helped us to create a prototype RELF imaging device that is now being tested by MPS officers, who are validating the quality of the latent prints extracted using our technique.
In chapter 5 we further develop our multi-light imaging latent fingerprint technique to extract latent prints from curved surfaces and to correct automatically for surface-curvature distortions. A patent is pending for this method.
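The photometric-stereo technique mentioned in the opening paragraph has a standard least-squares core, sketched here for a Lambertian surface. This is the textbook formulation, not the thesis's own pipeline, and the function name is our own.

```python
import numpy as np

def photometric_stereo(intensities, light_dirs):
    """Per-pixel surface normals from a multi-light image collection.

    intensities: (k, h, w) stack of images lit from k known directions
    light_dirs:  (k, 3) unit lighting directions
    Assumes a Lambertian surface, I = rho * (l . n); solving the k x 3
    least-squares system per pixel recovers albedo-scaled normals.
    """
    k, h, w = intensities.shape
    I = intensities.reshape(k, -1)                       # (k, h*w)
    g, *_ = np.linalg.lstsq(light_dirs, I, rcond=None)   # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)
    normals = np.where(albedo > 0, g / np.maximum(albedo, 1e-12), 0.0)
    return normals.reshape(3, h, w), albedo.reshape(h, w)

# Synthetic check: a flat surface with normal [0, 0, 1] and albedo 0.8.
lights = np.array([[0.5, 0.0, 0.866], [0.0, 0.5, 0.866], [-0.3, -0.3, 0.906]])
lights /= np.linalg.norm(lights, axis=1, keepdims=True)
n_true = np.array([0.0, 0.0, 1.0])
imgs = np.stack([0.8 * (l @ n_true) * np.ones((4, 4)) for l in lights])
normals, albedo = photometric_stereo(imgs, lights)
print(normals[:, 0, 0], float(albedo[0, 0]))
```

Because the camera is fixed, the per-pixel systems share the same lighting matrix, which is exactly the property of MLICs noted above (the same pixel maps to the same surface point in every image).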
Segmentation of Fault Networks Determined from Spatial Clustering of Earthquakes
We present a new method of data clustering applied to earthquake catalogs,
with the goal of reconstructing the seismically active part of fault networks.
We first use an original method to separate clustered events from uncorrelated
seismicity using the distribution of volumes of tetrahedra defined by closest
neighbor events in the original and randomized seismic catalogs. The spatial
disorder of the complex geometry of fault networks is then taken into account
by defining faults as probabilistic anisotropic kernels, whose structures are
motivated by properties of discontinuous tectonic deformation and previous
empirical observations of the geometry of faults and of earthquake clusters at
many spatial and temporal scales. Combining this a priori knowledge with
information theoretical arguments, we propose the Gaussian mixture approach
implemented in an Expectation-Maximization (EM) procedure. A cross-validation
scheme is then used to determine the number of kernels that provides an optimal
data clustering of the catalog. This three-step approach is applied to a
high-quality relocated catalog of the seismicity following the 1986 Mount Lewis
event in California and
reveals that events cluster along planar patches of about 2 km, i.e.
comparable to the size of the main event. The finite thickness of those
clusters (about 290 m) suggests that events do not occur on well-defined
Euclidean fault core surfaces, but rather that the damage zone surrounding
faults may be seismically active at depth. Finally, we propose a connection
between our methodology and multi-scale spatial analysis, based on the
derivation of a spatial fractal dimension of about 1.8 for the set of
hypocenters in the Mount Lewis area, consistent with recent observations on
relocated catalogs.
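The EM-plus-cross-validation model selection described above can be sketched with off-the-shelf Gaussian mixtures. The synthetic "catalog" below, the component range, and the train/test split are our illustrative choices, not the paper's setup (which uses anisotropic fault kernels motivated by tectonic deformation):

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

# Fit Gaussian mixtures with EM for a range of component counts and keep the
# count that maximises held-out log-likelihood. Two elongated 3D clusters
# stand in for seismicity along two fault patches.
rng = np.random.default_rng(0)
fault1 = rng.normal([0, 0, 5], [2.0, 0.1, 0.3], size=(400, 3))
fault2 = rng.normal([4, 3, 7], [0.1, 1.5, 0.3], size=(400, 3))
events = np.vstack([fault1, fault2])

train, test = train_test_split(events, test_size=0.3, random_state=0)
scores = {}
for k in range(1, 6):
    gmm = GaussianMixture(n_components=k, covariance_type="full",
                          random_state=0).fit(train)
    scores[k] = gmm.score(test)  # mean held-out log-likelihood per event

best_k = max(scores, key=scores.get)
print(best_k)  # typically recovers the two planted clusters
```

Full covariance matrices are what let each component stretch along its cluster's long axis, the same role the anisotropic kernels play for fault geometry in the method above.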