
    GASP : Geometric Association with Surface Patches

    A fundamental challenge to sensory processing tasks in perception and robotics is the problem of obtaining data associations across views. We present a robust solution for ascertaining potentially dense surface patch (superpixel) associations, requiring just range information. Our approach involves decomposition of a view into regularized surface patches. We represent them as sequences expressing geometry invariantly over their superpixel neighborhoods, as uniquely consistent partial orderings. We match these representations through an optimal sequence comparison metric based on the Damerau-Levenshtein distance - enabling robust association with quadratic complexity (in contrast to hitherto employed joint matching formulations, which are NP-complete). The approach is able to perform under wide baselines, heavy rotations, partial overlaps, significant occlusions and sensor noise. The technique does not require any priors -- motion or otherwise -- and does not make restrictive assumptions on scene structure and sensor movement. It does not require appearance, and is hence more widely applicable than appearance-reliant methods and invulnerable to related ambiguities such as textureless or aliased content. We present promising qualitative and quantitative results under diverse settings, along with comparisons with popular approaches based on range as well as RGB-D data.
    Comment: International Conference on 3D Vision, 201
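
    The sequence metric named in the abstract is concrete enough to sketch. Below is a minimal Python implementation of the Damerau-Levenshtein distance (optimal-string-alignment variant) with the quadratic dynamic-programming table the abstract alludes to; the symbolic patch labels in the usage example are invented for illustration and are not the paper's actual geometric encoding.

```python
def damerau_levenshtein(a, b):
    """Optimal-string-alignment variant of the Damerau-Levenshtein distance:
    minimum number of insertions, deletions, substitutions, and adjacent
    transpositions turning sequence a into sequence b.
    Runs in O(len(a) * len(b)) time -- quadratic, as the abstract notes."""
    m, n = len(a), len(b)
    # d[i][j] = distance between prefixes a[:i] and b[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[m][n]

# Hypothetical usage: each surface patch is encoded as a sequence of
# quantized geometric labels over its superpixel neighborhood; two patches
# are associated when their sequences are close under the metric.
seq_view1 = ['a', 'b', 'c', 'd']
seq_view2 = ['a', 'c', 'b', 'd']
print(damerau_levenshtein(seq_view1, seq_view2))  # 1 (one transposition)
```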

    Deep Depth Completion of a Single RGB-D Image

    The goal of our work is to complete the depth channel of an RGB-D image. Commodity-grade depth cameras often fail to sense depth for shiny, bright, transparent, and distant surfaces. To address this problem, we train a deep network that takes an RGB image as input and predicts dense surface normals and occlusion boundaries. Those predictions are then combined with raw depth observations provided by the RGB-D camera to solve for the depths of all pixels, including those missing in the original observation. This method was chosen over others (e.g., inpainting depths directly) as the result of extensive experiments with a new depth completion benchmark dataset, where holes are filled in the training data by rendering surface reconstructions created from multiview RGB-D scans. Experiments with different network inputs, depth representations, loss functions, optimization methods, inpainting methods, and deep depth estimation networks show that our proposed approach provides better depth completions than these alternatives.
    Comment: Accepted by CVPR2018 (Spotlight). Project webpage: http://deepcompletion.cs.princeton.edu/ This version includes supplementary materials which provide more implementation details, quantitative evaluation, and qualitative results. Due to the file size limit, please check the project website for the high-res paper.
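
    To make the "solve for the depths of all pixels" step concrete, here is a minimal least-squares sketch in the spirit of the method: observed depths act as data terms, and neighboring pixels are asked to differ by the gradients implied by the predicted normals, with smoothness terms dropped across predicted occlusion boundaries. The grid size, gradient values, and variable names are assumptions for illustration, not the paper's actual solver.

```python
import numpy as np

H, W = 4, 4
obs = np.full((H, W), np.nan)            # sparse raw depth from the sensor
obs[0, 0], obs[3, 3] = 1.0, 2.0
gx = np.full((H, W - 1), 0.1)            # horizontal gradients implied by normals
gy = np.full((H - 1, W), 0.1)            # vertical gradients implied by normals
boundary = np.zeros((H, W), dtype=bool)  # predicted occlusion boundaries

def idx(r, c):
    return r * W + c                     # flatten pixel (r, c) to unknown index

rows, rhs = [], []
for r in range(H):
    for c in range(W):
        if not np.isnan(obs[r, c]):      # data term: z = observed depth
            eq = np.zeros(H * W)
            eq[idx(r, c)] = 1.0
            rows.append(eq)
            rhs.append(obs[r, c])
        if c + 1 < W and not (boundary[r, c] or boundary[r, c + 1]):
            eq = np.zeros(H * W)         # smoothness: z_right - z = gx
            eq[idx(r, c + 1)], eq[idx(r, c)] = 1.0, -1.0
            rows.append(eq)
            rhs.append(gx[r, c])
        if r + 1 < H and not (boundary[r, c] or boundary[r + 1, c]):
            eq = np.zeros(H * W)         # smoothness: z_down - z = gy
            eq[idx(r + 1, c)], eq[idx(r, c)] = 1.0, -1.0
            rows.append(eq)
            rhs.append(gy[r, c])

z, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
print(z.reshape(H, W))                   # completed dense depth
```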

    Quality Enhancement of 3D Models Reconstructed By RGB-D Camera Systems

    Low-cost RGB-D cameras like Microsoft's Kinect capture RGB data for each vertex while reconstructing 3D models of objects, but their hardware limitations result in poor mesh and texture quality. In this thesis we propose a combined method that geometrically and chromatically enhances 3D models reconstructed by RGB-D camera systems. Our approach utilizes Butterfly Subdivision and Surface Fitting techniques to generate smoother triangle surface meshes, where sharp features can be well preserved or minimized by different Surface Fitting algorithms. Additionally, the global contrast of mesh textures is enhanced by using a modified Histogram Equalization algorithm, in which the new intensity of each vertex is obtained by calculating the accumulated normalized histogram of the texture and applying its cumulative distribution function. A number of experimental results and comparisons demonstrate that our method efficiently and effectively improves the geometric and chromatic quality of 3D models reconstructed from RGB-D cameras.
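
    The CDF-based equalization step described in the abstract is standard enough to sketch. The Python fragment below maps each vertex intensity through the accumulated normalized histogram (the empirical CDF); the 256-level range and the toy intensity values are assumptions for illustration, not the thesis's modified algorithm.

```python
import numpy as np

def equalize_vertex_intensities(intensities, levels=256):
    """Map each vertex intensity through the accumulated normalized
    histogram (the empirical CDF) to stretch global contrast."""
    v = np.asarray(intensities)
    # One unit-width bin per intensity level, so bin k counts value k.
    hist, _ = np.histogram(v, bins=levels, range=(0, levels))
    cdf = hist.cumsum() / v.size          # accumulated normalized histogram
    # New intensity = CDF at the old intensity, rescaled to [0, levels-1].
    return np.round(cdf[v.astype(int)] * (levels - 1)).astype(int)

verts = np.array([50, 52, 55, 60, 200])   # toy low-contrast texture values
print(equalize_vertex_intensities(verts)) # values spread over the full range
```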

    Computer-aided analysis for the Mechanics of Granular Materials (MGM) experiment, part 2

    Computer-vision-based analysis for the MGM experiment is continued and expanded into new areas. Volumetric strains of granular material triaxial test specimens have been measured from digitized images. A computer-assisted procedure is used to identify the edges of the specimen, and the edges are used in a 3-D model to estimate specimen volume. The results of this technique compare favorably to conventional measurements. A simplified model of the magnification caused by refraction of light within the water of the test apparatus was also developed. This model yields good results when the distance between the camera and the test specimen is large compared to the specimen height. An algorithm for a more accurate 3-D magnification correction is also presented. The use of composite and RGB (red-green-blue) color cameras is discussed, and potentially significant benefits from using an RGB camera are presented.
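
    As a rough illustration of the edge-based volume estimate, the sketch below treats the specimen as a stack of circular discs whose radius at each image row is half the left-to-right edge separation, then sums disc volumes over the specimen height. This solid-of-revolution simplification and all numbers are assumptions; the report's actual 3-D model and magnification corrections are more involved.

```python
import numpy as np

def volume_from_edges(left_edge, right_edge, mm_per_pixel):
    """Estimate specimen volume from per-row left/right edge positions
    (in pixels), assuming circular cross-sections at each image row."""
    left = np.asarray(left_edge, dtype=float)
    right = np.asarray(right_edge, dtype=float)
    radii_mm = 0.5 * (right - left) * mm_per_pixel  # radius at each row
    disc_areas = np.pi * radii_mm ** 2              # cross-section, mm^2
    return float(disc_areas.sum() * mm_per_pixel)   # stack discs over height

# Toy edges for a slightly barrelled specimen, one entry per image row.
left = [40, 39, 38, 39, 40]
right = [160, 161, 162, 161, 160]
print(volume_from_edges(left, right, mm_per_pixel=0.5))  # volume in mm^3
```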

    Cross-calibration of Time-of-flight and Colour Cameras

    Time-of-flight cameras provide depth information, which is complementary to the photometric appearance of the scene in ordinary images. It is desirable to merge the depth and colour information in order to obtain a coherent scene representation. However, the individual cameras will have different viewpoints, resolutions and fields of view, which means that they must be mutually calibrated. This paper presents a geometric framework for this multi-view and multi-modal calibration problem. It is shown that three-dimensional projective transformations can be used to align depth and parallax-based representations of the scene, with or without Euclidean reconstruction. A new evaluation procedure is also developed; this allows the reprojection error to be decomposed into calibration and sensor-dependent components. The complete approach is demonstrated on a network of three time-of-flight and six colour cameras. The applications of such a system to a range of automatic scene-interpretation problems are discussed.
    Comment: 18 pages, 12 figures, 3 tables
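
    The central operation, aligning depth and parallax-based reconstructions with a three-dimensional projective transformation, amounts to applying a 4x4 matrix to homogeneous 3-D points. The sketch below shows only that mechanic on made-up data; in practice the matrix H would be estimated from depth/colour correspondences rather than written down.

```python
import numpy as np

def apply_projective(H, points):
    """Map Nx3 points through a 4x4 projective transformation."""
    pts_h = np.hstack([points, np.ones((len(points), 1))])  # to homogeneous
    mapped = pts_h @ H.T
    return mapped[:, :3] / mapped[:, 3:4]                   # dehomogenize

H = np.array([[1.0, 0.0, 0.0, 0.02],   # illustrative, near-rigid transform
              [0.0, 1.0, 0.0, -0.01],
              [0.0, 0.0, 1.0, 0.00],
              [0.0, 0.0, 0.1, 1.00]])  # non-trivial last row: projective,
                                       # not Euclidean
tof_points = np.array([[0.1, 0.2, 1.5],     # points from the depth camera
                       [-0.3, 0.1, 2.0]])
print(apply_projective(H, tof_points))      # in the colour cameras' frame
```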

    Combined Learned and Classical Methods for Real-Time Visual Perception in Autonomous Driving

    Autonomy, robotics, and Artificial Intelligence (AI) are among the main defining themes of next-generation societies. Among the most important applications of these technologies is driving automation, which spans from different Advanced Driver Assistance Systems (ADAS) to full self-driving vehicles. Driving automation promises to reduce accidents, increase safety, and increase access to mobility for more people, such as the elderly and the handicapped. However, one of the main challenges facing autonomous vehicles is robust perception, which can enable safe interaction and decision making. Among the many sensors used to perceive the environment, each with its own capabilities and limitations, vision is by far one of the main sensing modalities; cameras are cheap and can provide rich information about the observed scene. Therefore, this dissertation develops a set of visual perception algorithms with a focus on autonomous driving as the target application area. The dissertation starts by addressing the problem of real-time motion estimation of an agent using only the visual input from a camera attached to it, a problem known as visual odometry. The visual odometry algorithm can achieve low drift rates over long traveled distances, made possible through the innovative local mapping approach used. This visual odometry algorithm was then combined with my multi-object detection and tracking system. The tracking system operates in a tracking-by-detection paradigm, where an object detector based on convolutional neural networks (CNNs) is used. The combined system can therefore detect and track other traffic participants both in the image domain and in the 3D world frame while simultaneously estimating vehicle motion, a necessary requirement for obstacle avoidance and safe navigation. Finally, the operational range of traditional monocular cameras was expanded with the capability to infer depth, and thus replace stereo and RGB-D cameras. This is accomplished through a single-stream convolutional neural network which outputs both depth predictions and semantic segmentation; semantic segmentation is the process of classifying each pixel in an image and is an important step toward scene understanding. A literature survey, algorithm descriptions, and comprehensive evaluations on real-world datasets are presented.
    Ph.D. dissertation, College of Engineering & Computer Science, University of Michigan. https://deepblue.lib.umich.edu/bitstream/2027.42/153989/1/Mohamed Aladem Final Dissertation.pdf
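
    The single-stream network with joint depth and segmentation outputs can be sketched schematically: one shared convolutional trunk feeding two per-pixel heads. The PyTorch toy model below shows only the shape of that design; layer sizes and the class count are placeholders, not the dissertation's architecture.

```python
import torch
import torch.nn as nn

class DepthSegNet(nn.Module):
    """Schematic single-stream CNN: a shared trunk with one head
    predicting per-pixel depth and another predicting class logits."""
    def __init__(self, num_classes=19):
        super().__init__()
        self.trunk = nn.Sequential(             # shared feature extractor
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
        )
        self.depth_head = nn.Conv2d(64, 1, 1)          # per-pixel depth
        self.seg_head = nn.Conv2d(64, num_classes, 1)  # per-pixel logits

    def forward(self, rgb):
        f = self.trunk(rgb)
        return self.depth_head(f), self.seg_head(f)

net = DepthSegNet()
rgb = torch.randn(1, 3, 64, 64)                 # one monocular RGB frame
depth, seg_logits = net(rgb)
print(depth.shape, seg_logits.shape)            # (1,1,64,64), (1,19,64,64)
```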