
    Real-time refocusing using an FPGA-based standard plenoptic camera

    Plenoptic cameras are receiving increased attention in scientific and commercial applications because they capture the entire structure of light in a scene, enabling optical transforms (such as focusing) to be applied computationally after the fact, rather than once and for all at the time a picture is taken. In many settings, real-time interactive performance is also desired, which in turn requires significant computational power due to the large amount of data required to represent a plenoptic image. Although GPUs have been shown to provide acceptable performance for real-time plenoptic rendering, their cost and power requirements make them prohibitive for embedded uses (such as in-camera). On the other hand, the computation required for plenoptic rendering is well structured, suggesting the use of specialized hardware. Accordingly, this paper presents an array of switch-driven finite impulse response filters, implemented on an FPGA, to accomplish high-throughput spatial-domain rendering. The proposed architecture provides a power-efficient rendering hardware design suitable for full-video applications as required in broadcasting or cinematography. A benchmark assessment of the proposed hardware implementation shows that real-time performance can readily be achieved, with a one-order-of-magnitude performance improvement over a GPU implementation and a three-orders-of-magnitude improvement over a general-purpose CPU implementation.
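    The shift-and-sum principle behind spatial-domain refocusing is easy to sketch in software. Below is a minimal NumPy illustration of the kind of operation the paper maps onto an array of FIR filters; the 4D light-field layout, the function name, and the focus parameter alpha are assumptions made for illustration, not the authors' hardware design.

        import numpy as np

        def refocus(lightfield, alpha):
            # Shift-and-sum refocusing of a 4D light field (a software
            # analogue of the spatial-domain rendering described above;
            # shapes and the alpha parameter are assumed for illustration).
            # lightfield: (U, V, H, W) stack of sub-aperture views.
            U, V, H, W = lightfield.shape
            cu, cv = (U - 1) / 2.0, (V - 1) / 2.0
            out = np.zeros((H, W), dtype=np.float64)
            for u in range(U):
                for v in range(V):
                    # integer shift proportional to the view's angular offset
                    dy = int(round(alpha * (u - cu)))
                    dx = int(round(alpha * (v - cv)))
                    out += np.roll(lightfield[u, v], (dy, dx), axis=(0, 1))
            return out / (U * V)

    Varying alpha moves the synthetic focal plane. Roughly speaking, each fixed shift corresponds to a fixed filter tap, which is the regularity that makes the computation amenable to dedicated hardware.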

    View recommendation for multi-camera demonstration-based training

    While humans can effortlessly pick a view from multiple streams, automatically choosing the best view is a challenge, in part because there is no consensus on which objective metrics the choice should be based on. The literature on view selection describes diverse possible metrics, and individual strategies, whether information-theoretic, instructional-design-based, or aesthetics-motivated, fail to incorporate all approaches. In this work, we postulate a strategy incorporating information-theoretic and instructional-design-based objective metrics to select the best view from a set of views. Traditionally, information-theoretic measures have been used to assess the goodness of a view, for example in 3D rendering. We adapt a similar measure, viewpoint entropy, to real-world 2D images, and additionally incorporate a similarity penalization to obtain a more accurate measure of a view's entropy, which serves as one of the metrics for best-view selection. Since the choice of the best view is domain-dependent, we chose demonstration-based training scenarios as our use case; a limitation of these scenarios is that they do not include collaborative training and feature only a single trainer. To incorporate instructional-design considerations, we included the visibility of the trainer's body pose, face, face while instructing, and hands as metrics; to incorporate domain knowledge, we included the visibility of predetermined regions as another metric. All of these metrics are combined into a parameterized view recommendation approach for demonstration-based training. An online study using recorded multi-camera video streams from a simulation environment was used to validate the metrics, and the responses from the study were used to optimize the view recommendation, reaching a normalized discounted cumulative gain (NDCG) of 0.912, which indicates good performance with respect to matching user choices.
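    As a rough illustration of the entropy-based part of such a scoring scheme, the sketch below computes a viewpoint-entropy-style measure from the relative projected areas of regions visible in a 2D view, and ranks views by a weighted combination of per-view metrics. The function names, the area-based input, and the linear weighting are assumptions; the paper's exact formulation, including its similarity penalization, is not reproduced here.

        import numpy as np

        def viewpoint_entropy(region_areas):
            # Shannon entropy of the relative areas that scene regions
            # occupy in one 2D view; higher entropy = more balanced view.
            p = np.asarray(region_areas, dtype=np.float64)
            p = p / p.sum()
            p = p[p > 0]
            return float(-(p * np.log2(p)).sum())

        def rank_views(metric_scores, weights):
            # metric_scores: (n_views, n_metrics) matrix of per-view values
            # (entropy, face/hands visibility, region visibility, ...).
            # weights: the tunable parameters of the recommendation.
            # Returns view indices, best view first.
            scores = np.asarray(metric_scores) @ np.asarray(weights)
            return np.argsort(-scores)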

    A Low-Dimensional Representation for Robust Partial Isometric Correspondences Computation

    Intrinsic isometric shape matching has become the standard approach for pose-invariant correspondence estimation among deformable shapes. Most existing approaches assume global consistency, i.e., the metric structure of the whole manifold must not change significantly. While global isometric matching is well understood, only a few heuristic solutions are known for partial matching. Partial matching is particularly important for robustness to topological noise (incomplete data and contacts), which is a common problem in real-world 3D scanner data. In this paper, we introduce a new approach to partial, intrinsic isometric matching. Our method is based on the observation that isometries are fully determined by purely local information: a map of a single point and its tangent space fixes an isometry, for both global and partial maps. From this idea, we develop a new representation for partial isometric maps based on equivalence classes of correspondences between pairs of points and their tangent spaces, and derive a local propagation algorithm that finds such mappings efficiently. In contrast to previous heuristics based on RANSAC or expectation maximization, our method rests on a simple and sound theoretical model and is fully deterministic. We apply our approach to register partial point clouds and compare it to state-of-the-art methods, obtaining significant improvements over global methods on real-world data and stronger guarantees than previous heuristic partial matching algorithms. Comment: 17 pages, 12 figures.
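    To make the propagation idea concrete, here is a toy front-propagation sketch over two point clouds: starting from one seed correspondence, it greedily extends the map to neighbors whose pairwise distances stay consistent. All names and thresholds are invented, and the consistency check uses Euclidean rather than geodesic distances, so this is only a loose illustration of the deterministic propagation the paper develops, not its representation of tangent-space equivalence classes.

        import numpy as np
        from collections import deque
        from scipy.spatial import cKDTree

        def propagate_match(src, dst, seed_src, seed_dst, k=8, tol=0.05):
            # Greedy front propagation of a partial correspondence between
            # point clouds src, dst (both (n, 3)), from a single seed pair.
            # Euclidean distances stand in for the geodesic distances a
            # faithful implementation would use; tol and k are arbitrary.
            tree_src, tree_dst = cKDTree(src), cKDTree(dst)
            match = {seed_src: seed_dst}
            queue = deque([seed_src])
            while queue:
                i = queue.popleft()
                _, nbrs = tree_src.query(src[i], k=k)
                for j in np.atleast_1d(nbrs):
                    if j in match:
                        continue
                    r = np.linalg.norm(src[j] - src[i])
                    cands = tree_dst.query_ball_point(dst[match[i]],
                                                      r * (1 + tol))
                    best, best_err = None, tol
                    for c in cands:
                        # worst distance-consistency error w.r.t. matches
                        err = max(abs(np.linalg.norm(dst[c] - dst[match[m]])
                                      - np.linalg.norm(src[j] - src[m]))
                                  for m in match)
                        if err < best_err:
                            best, best_err = c, err
                    if best is not None:
                        match[int(j)] = best
                        queue.append(int(j))
            return match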

    Cascaded Scene Flow Prediction using Semantic Segmentation

    Given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene. Many existing approaches use superpixels for regularization, but may predict inconsistent shapes and motions inside rigidly moving objects. We instead assume that scenes consist of foreground objects rigidly moving in front of a static background, and use semantic cues to produce pixel-accurate scene flow estimates. Our cascaded classification framework accurately models 3D scenes by iteratively refining semantic segmentation masks, stereo correspondences, 3D rigid motion estimates, and optical flow fields. We evaluate our method on the challenging KITTI autonomous driving benchmark, and show that accounting for the motion of segmented vehicles leads to state-of-the-art performance. Comment: International Conference on 3D Vision (3DV), 2017 (oral presentation).
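    The structural assumption, a static background plus rigidly moving foreground objects, makes flow composition simple once segmentation masks and per-object motions are in hand. The sketch below shows only that composition step; the function name, the label convention, and the dict-of-motions interface are assumptions, not the paper's pipeline.

        import numpy as np

        def rigid_scene_flow(points, labels, motions):
            # Per-point 3D scene flow under the rigid-object assumption.
            # points: (N, 3) scene points; labels: (N,) object id per point
            # (0 = static background, which keeps zero flow);
            # motions: {object id: (R, t)} with R a (3, 3) rotation
            # and t a (3,) translation estimated for that object.
            flow = np.zeros_like(points, dtype=np.float64)
            for obj, (R, t) in motions.items():
                mask = labels == obj
                flow[mask] = points[mask] @ R.T + t - points[mask]
            return flow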