Planar Object Tracking in the Wild: A Benchmark

Author: Liang Pengpeng
Liao Chunyuan
Ling Haibin
Lu Hu
Wang Liming
Wu Yifan
Publication date: 22/05/2018

Planar object tracking is an actively studied problem in vision-based robotic applications. While several benchmarks have been constructed for evaluating state-of-the-art algorithms, there is a lack of video sequences captured in the wild rather than in a constrained laboratory environment. In this paper, we present a carefully designed planar object tracking benchmark containing 210 videos of 30 planar objects sampled in the natural environment. In particular, for each object, we shoot seven videos involving various challenging factors, namely scale change, rotation, perspective distortion, motion blur, occlusion, out-of-view, and unconstrained. The ground truth is carefully annotated semi-manually to ensure the quality. Moreover, eleven state-of-the-art algorithms are evaluated on the benchmark using two evaluation metrics, with detailed analysis provided for the evaluation results. We expect the proposed benchmark to benefit future studies on planar object tracking. Comment: Accepted by ICRA 201
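The abstract does not name its two evaluation metrics. As an illustration only, the sketch below computes one plausible corner-based error for a planar tracker: the root-mean-square distance between the object corners mapped by a predicted homography and by the ground-truth homography. The function names, the corner coordinates, and both matrices are hypothetical and are not taken from the benchmark.

```python
import math

def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def alignment_error(H_pred, H_true, corners):
    """RMS distance between corners mapped by the two homographies."""
    total = 0.0
    for corner in corners:
        px, py = apply_homography(H_pred, corner)
        tx, ty = apply_homography(H_true, corner)
        total += (px - tx) ** 2 + (py - ty) ** 2
    return math.sqrt(total / len(corners))

# Hypothetical data: a 100x100 reference object and two pure translations.
corners = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0), (0.0, 100.0)]
H_true = [[1.0, 0.0, 10.0], [0.0, 1.0, 20.0], [0.0, 0.0, 1.0]]
H_pred = [[1.0, 0.0, 13.0], [0.0, 1.0, 24.0], [0.0, 0.0, 1.0]]

error = alignment_error(H_pred, H_true, corners)
print(error)  # every corner is off by (3, 4), so the error is 5.0
```

A tracker could then be scored by the fraction of frames whose error falls below a chosen pixel threshold; the threshold itself would be a further assumption.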

arXiv.org e-Print Archive

Crossref

Do-It-Yourself Single Camera 3D Pointer Input Device

Author: Llanos Bernard
Yang Yee-Hong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 12/09/2018

We present a new algorithm for single camera 3D reconstruction, or 3D input for human-computer interfaces, based on precise tracking of an elongated object, such as a pen, having a pattern of colored bands. To configure the system, the user provides no more than one labelled image of a handmade pointer, measurements of its colored bands, and the camera's pinhole projection matrix. Other systems are of much higher cost and complexity, requiring combinations of multiple cameras, stereo cameras, and pointers with sensors and lights. Instead of relying on information from multiple devices, we examine our single view more closely, integrating geometric and appearance constraints to robustly track the pointer in the presence of occlusion and distractor objects. By probing objects of known geometry with the pointer, we demonstrate acceptable accuracy of 3D localization. Comment: 8 pages, 6 figures, 2018 15th Conference on Computer and Robot Visio
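To make the role of the pinhole projection matrix concrete, here is a minimal sketch of how a 3x4 matrix maps a 3D point to pixel coordinates. The matrix values (focal length 800 px, principal point (320, 240), camera at the origin) and the helper name `project_point` are assumptions for illustration; the paper's own calibration and tracking pipeline are not reproduced here.

```python
def project_point(P, point_3d):
    """Project a 3D point with a 3x4 pinhole projection matrix P."""
    X, Y, Z = point_3d
    homogeneous = (X, Y, Z, 1.0)
    # Multiply P by the homogeneous point, one output row at a time.
    u, v, w = (sum(P[r][c] * homogeneous[c] for c in range(4)) for r in range(3))
    # Divide by the third coordinate to obtain pixel coordinates.
    return (u / w, v / w)

# Hypothetical camera: P = K [I | 0] with assumed intrinsics.
P = [[800.0, 0.0, 320.0, 0.0],
     [0.0, 800.0, 240.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

# A point 2 m in front of the camera, slightly right of and above the axis.
pixel = project_point(P, (0.1, -0.05, 2.0))
print(pixel)  # approximately (360, 220)
```

Recovering depth from a single view runs this mapping in reverse, which is why known band measurements on the pointer are needed to fix the scale.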

arXiv.org e-Print Archive

Crossref

Evaluation of CNN-based Single-Image Depth Estimation Methods

Author: A Saxena
Arno Knapitsch
F Liu
N Silberman
P Dollár
R Garg
S Kim
Publication date: 01/01/2018

While an increasing interest in deep models for single-image depth estimation methods can be observed, established schemes for their evaluation are still limited. We propose a set of novel quality criteria, allowing for a more detailed analysis by focusing on specific characteristics of depth maps. In particular, we address the preservation of edges and planar regions, depth consistency, and absolute distance accuracy. In order to employ these metrics to evaluate and compare state-of-the-art single-image depth estimation approaches, we provide a new high-quality RGB-D dataset. We used a DSLR camera together with a laser scanner to acquire high-resolution images and highly accurate depth maps. Experimental results show the validity of our proposed evaluation protocol.
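For context, the sketch below computes three global error measures that are widely used for depth estimation: absolute relative error, root-mean-square error, and threshold accuracy. These are the conventional baseline measures, not the novel criteria proposed in the paper, and the depth values are made up for illustration.

```python
import math

def depth_errors(predicted, ground_truth, threshold=1.25):
    """Common global depth metrics over paired depth values in metres."""
    pairs = list(zip(predicted, ground_truth))
    n = len(pairs)
    # Mean absolute error relative to the true depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Root-mean-square error in metres.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Fraction of pixels whose depth ratio stays below the threshold.
    inliers = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return {"abs_rel": abs_rel, "rmse": rmse, "delta": inliers / n}

# Hypothetical depth samples (metres).
predicted = [1.0, 2.2, 3.0, 8.0]
ground_truth = [1.0, 2.0, 4.0, 4.0]

metrics = depth_errors(predicted, ground_truth)
print(metrics)
```

Measures like these average over the whole image, so they cannot show whether edges or planar regions are preserved, which is the kind of gap the abstract describes.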

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Crossref

Projector-Based Augmentation

Author: Bimber Oliver (Prof. Dr.)
Publication date: 21/02/2006

Projector-based augmentation approaches hold the potential of combining the advantages of well-established spatial virtual reality and spatial augmented reality. Immersive, semi-immersive and augmented visualizations can be realized in everyday environments – without the need for special projection screens and dedicated display configurations. Limitations of mobile devices, such as low resolution and small field of view, focus constraints, and ergonomic issues can be overcome in many cases by the utilization of projection technology. Thus, applications that do not require mobility can benefit from efficient spatial augmentations. Examples range from edutainment in museums (such as storytelling projections onto natural stone walls in historical buildings) to architectural visualizations (such as augmentations of complex illumination simulations or modified surface materials in real building structures). This chapter describes projector-camera methods and multi-projector techniques that aim at correcting geometric aberrations, compensating local and global radiometric effects, and improving focus properties of images projected onto everyday surfaces.
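The idea of radiometric compensation can be sketched with a simplified per-pixel model. It assumes the light reflected from a surface point is `projected * surface_gain + ambient`, so the projector input that produces a desired appearance is obtained by inverting that relation and clamping to the projector's range. The model, the parameter names, and the values are assumptions for illustration and may differ from the methods the chapter describes.

```python
def compensate_pixel(desired, surface_gain, ambient):
    """Projector intensity in [0, 1] that should yield the desired reflected intensity.

    Assumed model: reflected = projected * surface_gain + ambient.
    """
    if surface_gain <= 0.0:
        # A surface that reflects nothing cannot be compensated.
        return 0.0
    value = (desired - ambient) / surface_gain
    # Clamp to the displayable range of the projector.
    return min(1.0, max(0.0, value))

# Hypothetical measurements for two surface points.
bright_patch = compensate_pixel(desired=0.5, surface_gain=0.8, ambient=0.1)
dark_patch = compensate_pixel(desired=0.9, surface_gain=0.5, ambient=0.1)
print(bright_patch, dark_patch)
```

The second call is clamped to 1.0, which illustrates a limit of this approach: dark or strongly coloured surfaces can demand more light than the projector can deliver.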

Online-Publikationssystem der Bauhaus-Universität Weimar

Robotic Cameraman for Augmented Reality based Broadcast and Demonstration

Author: Yan Dingtian
Publication date: 01/04/2020

In recent years, a number of large enterprises have gradually begun to use various Augmented Reality technologies to prominently improve the audiences’ view of their products. Among them, the creation of an immersive virtual interactive scene through projection has received extensive attention; this technique is referred to as projection SAR, which is short for projection spatial augmented reality. However, as the existing projection-SAR systems are immobile and have a limited working range, they are difficult to accept and use in daily life. Therefore, this thesis proposes a technically feasible optimization scheme so that projection SAR can be practically applied to AR broadcasting and demonstrations. Based on three main techniques required by state-of-the-art projection SAR applications, this thesis creates a novel mobile projection SAR cameraman for AR broadcasting and demonstration. Firstly, by combining a CNN scene parsing model and multiple contour extractors, the proposed contour extraction pipeline can always detect the optimal contour information in non-HD or blurred images. This algorithm reduces the dependency on high-quality visual sensors and solves the problem of low contour extraction accuracy in motion-blurred images. Secondly, a plane-based visual mapping algorithm is introduced to solve the difficulties of visual mapping in these low-texture scenarios. Finally, a complete process of designing the projection SAR cameraman robot is introduced. This part solves three main problems in mobile projection-SAR applications: (i) A new method for marking contours on the projection model is proposed to replace the model rendering process. By combining contour features and geometric features, users can easily identify objects on a colourless model. (ii) A camera initial pose estimation method is developed based on visual tracking algorithms, which can register the start pose of the robot to the whole scene in Unity3D. (iii) A novel data transmission approach is introduced to establish a link between the external robot and the robot in the Unity3D simulation workspace. This allows the robotic cameraman to simulate its trajectory in the Unity3D simulation workspace and project the correct virtual content. Our proposed mobile projection SAR system has made outstanding contributions to the academic value and practicality of the existing projection SAR technique. Firstly, it solves the problem of limited working range. When the system is running in a large indoor scene, it can follow the user and project dynamic interactive virtual content automatically instead of increasing the number of visual sensors. Then, it creates a more immersive experience for the audience since it allows the user more body gestures and richer virtual-real interactive play. Lastly, a mobile system does not require up-front frameworks, is cheaper, and provides the public with an innovative choice for indoor broadcasting and exhibitions.

University of Essex Research Repository

On Rendering Synthetic Images for Training an Object Detector

Author: Fua Pascal
Lepetit Vincent
Rozantsev Artem
Publication venue: 'Elsevier BV'
Publication date: 16/06/2014

We propose a novel approach to synthesizing images that are effective for training object detectors. Starting from a small set of real images, our algorithm estimates the rendering parameters required to synthesize similar images given a coarse 3D model of the target object. These parameters can then be reused to generate an unlimited number of training images of the object of interest in arbitrary 3D poses, which can then be used to increase classification performances. A key insight of our approach is that the synthetically generated images should be similar to real images, not in terms of image quality, but rather in terms of features used during the detector training. We show in the context of drone, plane, and car detection that using such synthetically generated images yields significantly better performances than simply perturbing real images or even synthesizing images in such a way that they look very realistic, as is often done when only limited amounts of training data are available.

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

Handling photographic imperfections and aliasing in augmented reality

Author: Bartz Dirk
Fischer Jan
Publication venue: Universität Tübingen
Publication date: 11/10/2012

In video see-through augmented reality, virtual objects are overlaid over images delivered by a digital video camera. One particular problem of this image mixing process is the fact that the visual appearance of the computer-generated graphics differs strongly from the real background image. In typical augmented reality systems, standard real-time rendering techniques are used for displaying virtual objects. These fast, but relatively simplistic methods create an artificial, almost "plastic-like" look for the graphical elements. In this paper, methods for incorporating two particular camera image effects in virtual overlays are described. The first effect is camera image noise, which is contained in the data delivered by the CCD chip used for capturing the real scene. The second effect is motion blur, which is caused by the temporal integration of color intensities on the CCD chip during fast movements of the camera or observed objects, resulting in a blurred camera image. Graphical objects rendered with standard methods neither contain image noise nor motion blur. This is one of the factors which makes the virtual objects stand out from the camera image and contributes to the perceptual difference between real and virtual scene elements. Here, approaches for mimicking both camera image noise and motion blur in the graphical representation of virtual objects are proposed. An algorithm for generating a realistic imitation of image noise based on a camera calibration step is described. A rendering method which produces motion blur according to the current camera movement is presented. As a by-product of the described rendering pipeline, it becomes possible to perform a smooth blending between virtual objects and the camera image at their boundary. An implementation of the new rendering methods for virtual objects is described, which utilizes the programmability of modern graphics processing units (GPUs) and is capable of delivering real-time frame rates.
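The two effects can be illustrated on a single row of grayscale pixels. This sketch swaps in much simpler techniques than the paper's: zero-mean Gaussian noise with a fixed `sigma` instead of a calibrated camera noise model, and a horizontal box filter instead of blur derived from the measured camera motion on the GPU. All names and values are hypothetical.

```python
import random

def add_gaussian_noise(pixels, sigma, seed=0):
    """Add zero-mean Gaussian noise to intensities in [0, 1], then clamp."""
    rng = random.Random(seed)  # fixed seed keeps the example repeatable
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

def motion_blur_row(row, length):
    """Approximate horizontal motion blur with a centred box filter."""
    half = length // 2
    blurred = []
    for i in range(len(row)):
        # Average the pixels inside the window, shrinking it at the borders.
        window = row[max(0, i - half): i + half + 1]
        blurred.append(sum(window) / len(window))
    return blurred

# Hypothetical rendered row: one bright pixel on a dark background.
rendered = [0.0, 0.0, 1.0, 0.0, 0.0]

blurred = motion_blur_row(rendered, length=3)
noisy = add_gaussian_noise(blurred, sigma=0.02)
print(blurred)
print(noisy)
```

Applying the blur before the noise mirrors the order in a real camera, where motion is integrated on the sensor before read-out noise is added.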

Publikationsserver der Universität Tübingen

Matterport3D: Learning from RGB-D Data in Indoor Environments

Author: Chang Angel
Dai Angela
Funkhouser Thomas
Halber Maciej
Nießner Matthias
Savva Manolis
Song Shuran
Zeng Andy
Zhang Yinda
Publication date: 01/01/2017

Access to large, diverse RGB-D datasets is critical for training RGB-D scene understanding algorithms. However, existing datasets still cover only a limited number of views or a restricted scale of spaces. In this paper, we introduce Matterport3D, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes. Annotations are provided with surface reconstructions, camera poses, and 2D and 3D semantic segmentations. The precise global alignment and comprehensive, diverse panoramic set of views over entire buildings enable a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref