612 research outputs found

    RGBD Datasets: Past, Present and Future

    Full text link
    Since the launch of the Microsoft Kinect, scores of RGBD datasets have been released. These have propelled advances in areas from reconstruction to gesture recognition. In this paper we explore the field, reviewing datasets across eight categories: semantics, object pose estimation, camera tracking, scene reconstruction, object tracking, human actions, faces and identification. By extracting relevant information in each category we help researchers to find appropriate data for their needs, and we consider which datasets have succeeded in driving computer vision forward and why. Finally, we examine the future of RGBD datasets. We identify key areas which are currently underexplored, and suggest that future directions may include synthetic data and dense reconstructions of static and dynamic scenes.Comment: 8 pages excluding references (CVPR style

    3D Scanning System for Automatic High-Resolution Plant Phenotyping

    Full text link
    Thin leaves, fine stems, self-occlusion, non-rigid and slowly changing structures make plants difficult for three-dimensional (3D) scanning and reconstruction -- two critical steps in automated visual phenotyping. Many current solutions such as laser scanning, structured light, and multiview stereo can struggle to acquire usable 3D models because of limitations in scanning resolution and calibration accuracy. In response, we have developed a fast, low-cost, 3D scanning platform to image plants on a rotating stage with two tilting DSLR cameras centred on the plant. This uses new methods of camera calibration and background removal to achieve high-accuracy 3D reconstruction. We assessed the system's accuracy using a 3D visual hull reconstruction algorithm applied on 2 plastic models of dicotyledonous plants, 2 sorghum plants and 2 wheat plants across different sets of tilt angles. Scan times ranged from 3 minutes (to capture 72 images using 2 tilt angles), to 30 minutes (to capture 360 images using 10 tilt angles). The leaf lengths, widths, areas and perimeters of the plastic models were measured manually and compared to measurements from the scanning system: results were within 3-4% of each other. The 3D reconstructions obtained with the scanning system show excellent geometric agreement with all six plant specimens, even plants with thin leaves and fine stems.Comment: 8 papes, DICTA 201

    Co-Fusion: Real-time Segmentation, Tracking and Fusion of Multiple Objects

    Get PDF
    In this paper we introduce Co-Fusion, a dense SLAM system that takes a live stream of RGB-D images as input and segments the scene into different objects (using either motion or semantic cues) while simultaneously tracking and reconstructing their 3D shape in real time. We use a multiple model fitting approach where each object can move independently from the background and still be effectively tracked and its shape fused over time using only the information from pixels associated with that object label. Previous attempts to deal with dynamic scenes have typically considered moving regions as outliers, and consequently do not model their shape or track their motion over time. In contrast, we enable the robot to maintain 3D models for each of the segmented objects and to improve them over time through fusion. As a result, our system can enable a robot to maintain a scene description at the object level which has the potential to allow interactions with its working environment; even in the case of dynamic scenes.Comment: International Conference on Robotics and Automation (ICRA) 2017, http://visual.cs.ucl.ac.uk/pubs/cofusion, https://github.com/martinruenz/co-fusio

    RGB-D datasets using microsoft kinect or similar sensors: a survey

    Get PDF
    RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It takes the advantages of the color image that provides appearance information of an object and also the depth image that is immune to the variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance to benchmark the state-of-the-art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide the insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description about the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms

    EventNeRF: Neural Radiance Fields from a Single Colour Event Camera

    Get PDF
    Asynchronously operating event cameras find many applications due to theirhigh dynamic range, no motion blur, low latency and low data bandwidth. Thefield has seen remarkable progress during the last few years, and existingevent-based 3D reconstruction approaches recover sparse point clouds of thescene. However, such sparsity is a limiting factor in many cases, especially incomputer vision and graphics, that has not been addressed satisfactorily sofar. Accordingly, this paper proposes the first approach for 3D-consistent,dense and photorealistic novel view synthesis using just a single colour eventstream as input. At the core of our method is a neural radiance field trainedentirely in a self-supervised manner from events while preserving the originalresolution of the colour event channels. Next, our ray sampling strategy istailored to events and allows for data-efficient training. At test, our methodproduces results in the RGB space at unprecedented quality. We evaluate ourmethod qualitatively and quantitatively on several challenging synthetic andreal scenes and show that it produces significantly denser and more visuallyappealing renderings than the existing methods. We also demonstrate robustnessin challenging scenarios with fast motion and under low lighting conditions. Wewill release our dataset and our source code to facilitate the research field,see https://4dqv.mpi-inf.mpg.de/EventNeRF/.<br

    EventNeRF: Neural Radiance Fields from a Single Colour Event Camera

    Full text link
    Asynchronously operating event cameras find many applications due to their high dynamic range, no motion blur, low latency and low data bandwidth. The field has seen remarkable progress during the last few years, and existing event-based 3D reconstruction approaches recover sparse point clouds of the scene. However, such sparsity is a limiting factor in many cases, especially in computer vision and graphics, that has not been addressed satisfactorily so far. Accordingly, this paper proposes the first approach for 3D-consistent, dense and photorealistic novel view synthesis using just a single colour event stream as input. At the core of our method is a neural radiance field trained entirely in a self-supervised manner from events while preserving the original resolution of the colour event channels. Next, our ray sampling strategy is tailored to events and allows for data-efficient training. At test, our method produces results in the RGB space at unprecedented quality. We evaluate our method qualitatively and quantitatively on several challenging synthetic and real scenes and show that it produces significantly denser and more visually appealing renderings than the existing methods. We also demonstrate robustness in challenging scenarios with fast motion and under low lighting conditions. We will release our dataset and our source code to facilitate the research field, see https://4dqv.mpi-inf.mpg.de/EventNeRF/.Comment: 18 pages, 18 figures, 3 table

    Capturing Hand-Object Interaction and Reconstruction of Manipulated Objects

    Get PDF
    Hand motion capture with an RGB-D sensor gained recently a lot of research attention, however, even most recent approaches focus on the case of a single isolated hand. We focus instead on hands that interact with other hands or with a rigid or articulated object. Our framework successfully captures motion in such scenarios by combining a generative model with discriminatively trained salient points, collision detection and physics simulation to achieve a low tracking error with physically plausible poses. All components are unified in a single objective function that can be optimized with standard optimization techniques. We initially assume a-priori knowledge of the object’s shape and skeleton. In case of unknown object shape there are existing 3d reconstruction methods that capitalize on distinctive geometric or texture features. These methods though fail for textureless and highly symmetric objects like household articles, mechanical parts or toys. We show that extracting 3d hand motion for in-hand scanning e↵ectively facilitates the reconstruction of such objects and we fuse the rich additional information of hands into a 3d reconstruction pipeline. Finally, although shape reconstruction is enough for rigid objects, there is a lack of tools that build rigged models of articulated objects that deform realistically using RGB-D data. We propose a method that creates a fully rigged model consisting of a watertight mesh, embedded skeleton and skinning weights by employing a combination of deformable mesh tracking, motion segmentation based on spectral clustering and skeletonization based on mean curvature flow

    A survey on human performance capture and animation

    Get PDF
    With the rapid development of computing technology, three-dimensional (3D) human body models and their dynamic motions are widely used in the digital entertainment industry. Human perfor- mance mainly involves human body shapes and motions. Key research problems include how to capture and analyze static geometric appearance and dynamic movement of human bodies, and how to simulate human body motions with physical e�ects. In this survey, according to main research directions of human body performance capture and animation, we summarize recent advances in key research topics, namely human body surface reconstruction, motion capture and synthesis, as well as physics-based motion sim- ulation, and further discuss future research problems and directions. We hope this will be helpful for readers to have a comprehensive understanding of human performance capture and animatio
    • …
    corecore