7,366 research outputs found

    Temporal shape super-resolution by intra-frame motion encoding using high-fps structured light

    One solution for depth imaging of a moving scene is to project a static pattern onto the object and use a single image for reconstruction. However, if the object moves too fast relative to the exposure time of the image sensor, the patterns in the captured image are blurred and reconstruction fails. In this paper, we embed multiple projection patterns into each captured image to achieve temporal super-resolution of depth image sequences. With our method, multiple patterns are projected onto the object at a higher frame rate than the camera can achieve. In this case, the observed pattern varies depending on the depth and motion of the object, so we can extract temporal information about the scene from each single image. The decoding process uses a learning-based approach that requires no geometric calibration. Experiments confirm the effectiveness of our method, in which sequential shapes are reconstructed from a single image. Both quantitative evaluations and comparisons with recent techniques were also conducted. Comment: 9 pages, published at the International Conference on Computer Vision (ICCV 2017).

    A Systematic Survey of ML Datasets for Prime CV Research Areas-Media and Metadata

    The ever-growing capabilities of computers have enabled pursuing Computer Vision through Machine Learning (i.e., MLCV). ML tools require large amounts of information to learn from (ML datasets). These are costly to produce but have received little attention regarding standardization. This prevents the cooperative production and exploitation of these resources, impedes countless synergies, and hinders ML research. No global view exists of the MLCV dataset tissue. Acquiring it is fundamental to enabling standardization. We provide an extensive survey of the evolution and current state of MLCV datasets (1994 to 2019) for a set of specific CV areas, as well as a quantitative and qualitative analysis of the results. Data were gathered from online scientific databases (e.g., Google Scholar, CiteSeerX). We reveal the heterogeneous plethora that comprises the MLCV dataset tissue; its continuous growth in volume and complexity; the specificities of the evolution of its media and metadata components regarding a range of aspects; and that MLCV progress requires the construction of a global standardized (structuring, manipulating, and sharing) MLCV "library". Accordingly, we formulate a novel interpretation of this dataset collective as a global tissue of synthetic cognitive visual memories and define the immediately necessary steps to advance its standardization and integration.

    FEW SHOT PHOTOGRAMETRY: A COMPARISON BETWEEN NERF AND MVS-SFM FOR THE DOCUMENTATION OF CULTURAL HERITAGE

    3D documentation for the Digital Cultural Heritage (DCH) domain is an increasingly interdisciplinary field, breaking down boundaries that have long separated experts from different domains. In the past, there has been an ambiguous claim for ownership of skills, methodologies, and expertise in the heritage sciences. This study aims to contribute to the dialogue between these different disciplines by presenting a novel approach for 3D documentation of an ancient statue. The method combines TLS acquisition and an MVS pipeline using images from a DJI Mavic 2 drone. Additionally, the study compares the accuracy and final product of the Deep Points (DP) and Neural Radiance Fields (NeRF) methods, using the TLS acquisition as validation ground truth. First, a TLS acquisition was performed on an ancient statue using a Faro Focus 2 scanner. Next, a multi-view stereo (MVS) pipeline was adopted using 2D images captured by a Mini-2 DJI Mavic 2 drone from a distance of approximately 1 meter around the statue. Finally, the same images were used to train and run the NeRF network after being reduced by 90%. The main contribution of this paper is to improve our understanding of this method and compare the accuracy and final product of two different approaches - direct projection (DP) and NeRF - by exploiting a TLS acquisition as the validation ground truth. Results show that the NeRF approach outperforms DP in terms of accuracy and produces a more realistic final product. This paper has important implications for the field of CH preservation, as it offers a new and effective method for generating 3D models of ancient statues. This technology can help to document and preserve important cultural artifacts for future generations, while also providing new insights into the history and culture of different civilizations. Overall, the results of this study demonstrate the potential of combining TLS and NeRF for generating accurate and realistic 3D models of ancient statues.

    Model-Based Environmental Visual Perception for Humanoid Robots

    The visual perception of a robot should answer two fundamental questions: What? and Where? To answer these questions properly and efficiently, it is essential to establish a bidirectional coupling between the external stimuli and the internal representations. This coupling links the physical world with the inner abstraction models through sensor transformation, recognition, matching, and optimization algorithms. The objective of this PhD is to establish this sensor-model coupling.