3,853 research outputs found
Semantic Visual Localization
Robust visual localization under a wide range of viewing conditions is a
fundamental problem in computer vision. Handling the difficult cases of this
problem is not only very challenging but also of high practical relevance,
e.g., in the context of life-long localization for augmented reality or
autonomous robots. In this paper, we propose a novel approach based on a joint
3D geometric and semantic understanding of the world, enabling it to succeed
under conditions where previous approaches failed. Our method leverages a novel
generative model for descriptor learning, trained on semantic scene completion
as an auxiliary task. The resulting 3D descriptors are robust to missing
observations by encoding high-level 3D geometric and semantic information.
Experiments on several challenging large-scale localization datasets
demonstrate reliable localization under extreme viewpoint, illumination, and
geometry changes
Photon counting compressive depth mapping
We demonstrate a compressed sensing, photon counting lidar system based on
the single-pixel camera. Our technique recovers both depth and intensity maps
from a single under-sampled set of incoherent, linear projections of a scene of
interest at ultra-low light levels around 0.5 picowatts. Only two-dimensional
reconstructions are required to image a three-dimensional scene. We demonstrate
intensity imaging and depth mapping at 256 x 256 pixel transverse resolution
with acquisition times as short as 3 seconds. We also show novelty filtering,
reconstructing only the difference between two instances of a scene. Finally,
we acquire 32 x 32 pixel real-time video for three-dimensional object tracking
at 14 frames-per-second.Comment: 16 pages, 8 figure
SeasonDepth: Cross-Season Monocular Depth Prediction Dataset and Benchmark under Multiple Environments
Different environments pose a great challenge to the outdoor robust visual
perception for long-term autonomous driving and the generalization of
learning-based algorithms on different environmental effects is still an open
problem. Although monocular depth prediction has been well studied recently,
there is few work focusing on the robust learning-based depth prediction across
different environments, e.g. changing illumination and seasons, owing to the
lack of such a multi-environment real-world dataset and benchmark. To this end,
the first cross-season monocular depth prediction dataset and benchmark
SeasonDepth is built based on CMU Visual Localization dataset. To benchmark the
depth estimation performance under different environments, we investigate
representative and recent state-of-the-art open-source supervised,
self-supervised and domain adaptation depth prediction methods from KITTI
benchmark using several newly-formulated metrics. Through extensive
experimental evaluation on the proposed dataset, the influence of multiple
environments on performance and robustness is analyzed qualitatively and
quantitatively, showing that the long-term monocular depth prediction is still
challenging even with fine-tuning. We further give promising avenues that
self-supervised training and stereo geometry constraint help to enhance the
robustness to changing environments. The dataset is available on
https://seasondepth.github.io, and benchmark toolkit is available on
https://github.com/SeasonDepth/SeasonDepth.Comment: 19 pages, 13 figure
Single-picture reconstruction and rendering of trees for plausible vegetation synthesis
State-of-the-art approaches for tree reconstruction either put limiting constraints on the input side (requiring multiple photographs, a scanned point cloud or intensive user input) or provide a representation only suitable for front views of the tree. In this paper we present a complete pipeline for synthesizing and rendering detailed trees from a single photograph with minimal user effort. Since the overall shape and appearance of each tree is recovered from a single photograph of the tree crown, artists can benefit from georeferenced images to populate landscapes with native tree species. A key element of our approach is a compact representation of dense tree crowns through a radial distance map. Our first contribution is an automatic algorithm for generating such representations from a single exemplar image of a tree. We create a rough estimate of the crown shape by solving a thin-plate energy minimization problem, and then add detail through a simplified shape-from-shading approach. The use of seamless texture synthesis results in an image-based representation that can be rendered from arbitrary view directions at different levels of detail. Distant trees benefit from an output-sensitive algorithm inspired on relief mapping. For close-up trees we use a billboard cloud where leaflets are distributed inside the crown shape through a space colonization algorithm. In both cases our representation ensures efficient preservation of the crown shape. Major benefits of our approach include: it recovers the overall shape from a single tree image, involves no tree modeling knowledge and minimal authoring effort, and the associated image-based representation is easy to compress and thus suitable for network streaming.Peer ReviewedPostprint (author's final draft
A 4D Light-Field Dataset and CNN Architectures for Material Recognition
We introduce a new light-field dataset of materials, and take advantage of
the recent success of deep learning to perform material recognition on the 4D
light-field. Our dataset contains 12 material categories, each with 100 images
taken with a Lytro Illum, from which we extract about 30,000 patches in total.
To the best of our knowledge, this is the first mid-size dataset for
light-field images. Our main goal is to investigate whether the additional
information in a light-field (such as multiple sub-aperture views and
view-dependent reflectance effects) can aid material recognition. Since
recognition networks have not been trained on 4D images before, we propose and
compare several novel CNN architectures to train on light-field images. In our
experiments, the best performing CNN architecture achieves a 7% boost compared
with 2D image classification (70% to 77%). These results constitute important
baselines that can spur further research in the use of CNNs for light-field
applications. Upon publication, our dataset also enables other novel
applications of light-fields, including object detection, image segmentation
and view interpolation.Comment: European Conference on Computer Vision (ECCV) 201
GENERATION OF FORESTS ON TERRAIN WITH DYNAMIC LIGHTING AND SHADOWING
The purpose of this research project is to exhibit an efficient method of creating dynamic lighting and shadowing for the generation of forests on terrain. In this research project, I use textures which contain images of trees from a bird’s eye view in order to create a high scale forest. Furthermore, by manipulating the transparency and color of the textures according to the algorithmic calculations of light and shadow on terrain, I provide the functionality of dynamic lighting and shadowing. Finally, by analyzing the OpenGL pipeline, I design my code in order to allow efficient rendering of the forest
Local Motion Planner for Autonomous Navigation in Vineyards with a RGB-D Camera-Based Algorithm and Deep Learning Synergy
With the advent of agriculture 3.0 and 4.0, researchers are increasingly
focusing on the development of innovative smart farming and precision
agriculture technologies by introducing automation and robotics into the
agricultural processes. Autonomous agricultural field machines have been
gaining significant attention from farmers and industries to reduce costs,
human workload, and required resources. Nevertheless, achieving sufficient
autonomous navigation capabilities requires the simultaneous cooperation of
different processes; localization, mapping, and path planning are just some of
the steps that aim at providing to the machine the right set of skills to
operate in semi-structured and unstructured environments. In this context, this
study presents a low-cost local motion planner for autonomous navigation in
vineyards based only on an RGB-D camera, low range hardware, and a dual layer
control algorithm. The first algorithm exploits the disparity map and its depth
representation to generate a proportional control for the robotic platform.
Concurrently, a second back-up algorithm, based on representations learning and
resilient to illumination variations, can take control of the machine in case
of a momentaneous failure of the first block. Moreover, due to the double
nature of the system, after initial training of the deep learning model with an
initial dataset, the strict synergy between the two algorithms opens the
possibility of exploiting new automatically labeled data, coming from the
field, to extend the existing model knowledge. The machine learning algorithm
has been trained and tested, using transfer learning, with acquired images
during different field surveys in the North region of Italy and then optimized
for on-device inference with model pruning and quantization. Finally, the
overall system has been validated with a customized robot platform in the
relevant environment
- …