22,360 research outputs found
High-Precision Localization Using Ground Texture
Location-aware applications play an increasingly critical role in everyday
life. However, satellite-based localization (e.g., GPS) has limited accuracy
and can be unusable in dense urban areas and indoors. We introduce an
image-based global localization system that is accurate to a few millimeters
and performs reliable localization both indoors and outside. The key idea is to
capture and index distinctive local keypoints in ground textures. This is based
on the observation that ground textures including wood, carpet, tile, concrete,
and asphalt may look random and homogeneous, but all contain cracks, scratches,
or unique arrangements of fibers. These imperfections are persistent, and can
serve as local features. Our system incorporates a downward-facing camera to
capture the fine texture of the ground, together with an image processing
pipeline that locates the captured texture patch in a compact database
constructed offline. We demonstrate the capability of our system to robustly,
accurately, and quickly locate test images on various types of outdoor and
indoor ground surfaces
Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View
We propose a method for predicting the 3D shape of a deformable surface from
a single view. By contrast with previous approaches, we do not need a
pre-registered template of the surface, and our method is robust to the lack of
texture and partial occlusions. At the core of our approach is a {\it
geometry-aware} deep architecture that tackles the problem as usually done in
analytic solutions: first perform 2D detection of the mesh and then estimate a
3D shape that is geometrically consistent with the image. We train this
architecture in an end-to-end manner using a large dataset of synthetic
renderings of shapes under different levels of deformation, material
properties, textures and lighting conditions. We evaluate our approach on a
test split of this dataset and available real benchmarks, consistently
improving state-of-the-art solutions with a significantly lower computational
time.Comment: Accepted at CVPR 201
Ambient Sound Provides Supervision for Visual Learning
The sound of crashing waves, the roar of fast-moving cars -- sound conveys
important information about the objects in our surroundings. In this work, we
show that ambient sounds can be used as a supervisory signal for learning
visual models. To demonstrate this, we train a convolutional neural network to
predict a statistical summary of the sound associated with a video frame. We
show that, through this process, the network learns a representation that
conveys information about objects and scenes. We evaluate this representation
on several recognition tasks, finding that its performance is comparable to
that of other state-of-the-art unsupervised learning methods. Finally, we show
through visualizations that the network learns units that are selective to
objects that are often associated with characteristic sounds.Comment: ECCV 201
Creating Simplified 3D Models with High Quality Textures
This paper presents an extension to the KinectFusion algorithm which allows
creating simplified 3D models with high quality RGB textures. This is achieved
through (i) creating model textures using images from an HD RGB camera that is
calibrated with Kinect depth camera, (ii) using a modified scheme to update
model textures in an asymmetrical colour volume that contains a higher number
of voxels than that of the geometry volume, (iii) simplifying dense polygon
mesh model using quadric-based mesh decimation algorithm, and (iv) creating and
mapping 2D textures to every polygon in the output 3D model. The proposed
method is implemented in real-time by means of GPU parallel processing.
Visualization via ray casting of both geometry and colour volumes provides
users with a real-time feedback of the currently scanned 3D model. Experimental
results show that the proposed method is capable of keeping the model texture
quality even for a heavily decimated model and that, when reconstructing small
objects, photorealistic RGB textures can still be reconstructed.Comment: 2015 International Conference on Digital Image Computing: Techniques
and Applications (DICTA), Page 1 -
ARTSCENE: A Neural System for Natural Scene Classification
How do humans rapidly recognize a scene? How can neural models capture this biological competence to achieve state-of-the-art scene classification? The ARTSCENE neural system classifies natural scene photographs by using multiple spatial scales to efficiently accumulate evidence for gist and texture. ARTSCENE embodies a coarse-to-fine Texture Size Ranking Principle whereby spatial attention processes multiple scales of scenic information, ranging from global gist to local properties of textures. The model can incrementally learn and predict scene identity by gist information alone and can improve performance through selective attention to scenic textures of progressively smaller size. ARTSCENE discriminates 4 landscape scene categories (coast, forest, mountain and countryside) with up to 91.58% correct on a test set, outperforms alternative models in the literature which use biologically implausible computations, and outperforms component systems that use either gist or texture information alone. Model simulations also show that adjacent textures form higher-order features that are also informative for scene recognition.National Science Foundation (NSF SBE-0354378); Office of Naval Research (N00014-01-1-0624
- …