174 research outputs found

    Real-Time Dense Stereo Matching With ELAS on FPGA Accelerated Embedded Devices

    For many applications in low-power real-time robotics, stereo cameras are the sensors of choice for depth perception as they are typically cheaper and more versatile than their active counterparts. Their biggest drawback, however, is that they do not directly sense depth maps; instead, these must be estimated through data-intensive processes. Therefore, appropriate algorithm selection plays an important role in achieving the desired performance characteristics. Motivated by applications in space and mobile robotics, we implement and evaluate an FPGA-accelerated adaptation of the ELAS algorithm. Despite offering one of the best trade-offs between efficiency and accuracy, ELAS has only been shown to run at 1.5-3 fps on a high-end CPU. Our system preserves all intriguing properties of the original algorithm, such as the slanted plane priors, but can achieve a frame rate of 47 fps whilst consuming under 4 W of power. Unlike previous FPGA-based designs, we take advantage of both components on the CPU/FPGA System-on-Chip to showcase the strategy necessary to accelerate more complex and computationally diverse algorithms for such low-power, real-time systems. Comment: 8 pages, 7 figures, 2 tables

    Staple: Complementary Learners for Real-Time Tracking

    Correlation Filter-based trackers have recently achieved excellent performance, showing great robustness to challenging situations exhibiting motion blur and illumination changes. However, since the model that they learn depends strongly on the spatial layout of the tracked object, they are notoriously sensitive to deformation. Models based on colour statistics have complementary traits: they cope well with variation in shape, but suffer when illumination is not consistent throughout a sequence. Moreover, colour distributions alone can be insufficiently discriminative. In this paper, we show that a simple tracker combining complementary cues in a ridge regression framework can operate faster than 80 FPS and outperform not only all entries in the popular VOT14 competition, but also recent and far more sophisticated trackers according to multiple benchmarks. Comment: To appear in CVPR 201

    ROAM: a Rich Object Appearance Model with Application to Rotoscoping

    Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines. While pixel-wise segmentation techniques can help with this task, professional rotoscoping tools rely on parametric curves that offer artists much better interactive control over the definition, editing and manipulation of the segments of interest. Sticking to this prevalent rotoscoping paradigm, we propose a novel framework to capture and track the visual aspect of an arbitrary object in a scene, given a first closed outline of this object. This model combines a collection of local foreground/background appearance models spread along the outline, a global appearance model of the enclosed object and a set of distinctive foreground landmarks. The structure of this rich appearance model allows simple initialization, efficient iterative optimization with exact minimization at each step, and on-line adaptation in videos. We demonstrate qualitatively and quantitatively the merit of this framework through comparisons with tools based on either dynamic segmentation with a closed curve or pixel-wise binary labelling.

    An evaluation of recent local image descriptors for real-world applications of image matching

    This paper discusses and compares the best and most recent local descriptors, evaluating them on increasingly complex image matching tasks, encompassing planar and non-planar scenarios under severe viewpoint changes. This evaluation, aimed at assessing descriptor suitability for real-world applications, leverages the concept of approximated overlap error as a means to naturally extend to non-planar scenes the standard metric used for planar scenes. According to the evaluation results, most descriptors exhibit a gradual performance degradation in the transition from planar to non-planar scenes. The best descriptors are those capable of capturing well not only the local image context, but also the global scene structure. Data-driven approaches are shown to have reached the matching robustness and accuracy of the best hand-crafted descriptors.

    Study of Saiga Horn Using High-Performance Liquid Chromatography with Mass Spectrometry

    Saiga horns were investigated using modern analytical methods: high-performance liquid chromatography (HPLC) with mass-spectrometric (MS and MS/MS) detection and polyacrylamide gel electrophoresis (PAGE). It can be concluded that the main proteins of saiga horns are keratins and collagen. The main proteins represented in all samples are keratin type I microfibrillar (from sheep), keratin type II microfibrillar (from sheep), collagen type I (α1) (from bovine) and collagen type I (α2) (from bovine). Free amino acids were determined in all samples not treated with enzyme.

    DGPose: Deep Generative Models for Human Body Analysis

    Deep generative modelling for human body analysis is an emerging problem with many interesting applications. However, the latent space learned by such approaches is typically not interpretable, resulting in less flexibility. In this work, we present deep generative models for human body analysis in which the body pose and the visual appearance are disentangled. Such a disentanglement allows independent manipulation of pose and appearance, and hence enables applications such as pose-transfer without specific training for such a task. Our proposed models, the Conditional-DGPose and the Semi-DGPose, have different characteristics. In the first, body pose labels are taken as conditioners, from a fully-supervised training set. In the second, our structured semi-supervised approach allows for pose estimation to be performed by the model itself and relaxes the need for labelled data. Therefore, the Semi-DGPose aims for the joint understanding and generation of people in images. It is not only capable of mapping images to interpretable latent representations but also able to map these representations back to the image space. We compare our models with relevant baselines, the ClothNet-Body and the Pose Guided Person Generation networks, demonstrating their merits on the Human3.6M, ChictopiaPlus and DeepFashion benchmarks. Comment: IJCV 2020 special issue on 'Generating Realistic Visual Data of Human Behavior' preprint. Keywords: deep generative models, semi-supervised learning, human pose estimation, variational autoencoders, generative adversarial network

    Learning to Simulate Realistic LiDARs

    Simulating realistic sensors is a challenging part of data generation for autonomous systems, often involving carefully handcrafted sensor design, scene properties, and physics modeling. To alleviate this, we introduce a pipeline for data-driven simulation of a realistic LiDAR sensor. We propose a model that learns a mapping between RGB images and corresponding LiDAR features such as raydrop or per-point intensities directly from real datasets. We show that our model can learn to encode realistic effects such as dropped points on transparent surfaces or high intensity returns on reflective materials. When applied to naively raycasted point clouds provided by off-the-shelf simulator software, our model enhances the data by predicting intensities and removing points based on the scene's appearance to match a real LiDAR sensor. We use our technique to learn models of two distinct LiDAR sensors and use them to improve simulated LiDAR data accordingly. Through a sample task of vehicle segmentation, we show that enhancing simulated point clouds with our technique improves downstream task performance. Comment: IROS2022 paper