Real-Time Dense Stereo Matching With ELAS on FPGA Accelerated Embedded Devices
For many applications in low-power real-time robotics, stereo cameras are the
sensors of choice for depth perception as they are typically cheaper and more
versatile than their active counterparts. Their biggest drawback, however, is
that they do not directly sense depth maps; instead, these must be estimated
through data-intensive processes. Therefore, appropriate algorithm selection
plays an important role in achieving the desired performance characteristics.
Motivated by applications in space and mobile robotics, we implement and
evaluate an FPGA-accelerated adaptation of the ELAS algorithm. Despite offering
one of the best trade-offs between efficiency and accuracy, ELAS has only been
shown to run at 1.5-3 fps on a high-end CPU. Our system preserves all the
appealing properties of the original algorithm, such as the slanted-plane
priors, but achieves a frame rate of 47 fps whilst consuming under 4 W of
power. Unlike previous FPGA-based designs, we take advantage of both components
of the CPU/FPGA System-on-Chip to showcase the strategy necessary to accelerate
more complex and computationally diverse algorithms for such low-power,
real-time systems.
Comment: 8 pages, 7 figures, 2 tables
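As context for the kind of data-intensive processing involved, here is a minimal dense-stereo sketch in Python. It uses OpenCV's semi-global matcher as a stand-in, since ELAS itself ships as a separate C++ library (libelas) not reproduced here; the input file names are hypothetical.

    # Minimal dense stereo sketch using OpenCV's SGBM as a stand-in for ELAS.
    import cv2

    left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)    # hypothetical inputs
    right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

    matcher = cv2.StereoSGBM_create(
        minDisparity=0,
        numDisparities=128,  # search range; must be divisible by 16
        blockSize=5)
    # compute() returns fixed-point disparities scaled by 16
    disparity = matcher.compute(left, right).astype("float32") / 16.0

    # With a calibrated rig: depth = focal_length * baseline / disparity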
Staple: Complementary Learners for Real-Time Tracking
Correlation Filter-based trackers have recently achieved excellent
performance, showing great robustness to challenging situations exhibiting
motion blur and illumination changes. However, since the model that they learn
depends strongly on the spatial layout of the tracked object, they are
notoriously sensitive to deformation. Models based on colour statistics have
complementary traits: they cope well with variation in shape, but suffer when
illumination is not consistent throughout a sequence. Moreover, colour
distributions alone can be insufficiently discriminative. In this paper, we
show that a simple tracker combining complementary cues in a ridge regression
framework can operate faster than 80 fps and outperform not only all entries in
the popular VOT14 competition, but also recent and far more sophisticated
trackers according to multiple benchmarks.
Comment: To appear in CVPR 2016
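A hedged sketch, in Python with NumPy, of the two ingredients the abstract names: closed-form ridge regression (the learning rule behind correlation-filter trackers) and a late fusion of the filter response with a colour-statistics score map. The function names and the merge weight are illustrative assumptions, not the paper's implementation.

    import numpy as np

    def ridge_regression(X, y, lam=1e-2):
        # Closed-form ridge solution w = (X^T X + lam*I)^{-1} X^T y.
        d = X.shape[1]
        return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

    def fuse_responses(cf_response, colour_response, alpha=0.3):
        # Staple-style late fusion of complementary cues; alpha is an
        # illustrative merge weight, not the paper's tuned value.
        return (1.0 - alpha) * cf_response + alpha * colour_response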
ROAM: a Rich Object Appearance Model with Application to Rotoscoping
Rotoscoping, the detailed delineation of scene elements through a video shot,
is a painstaking task of tremendous importance in professional post-production
pipelines. While pixel-wise segmentation techniques can help for this task,
professional rotoscoping tools rely on parametric curves that offer artists
much better interactive control over the definition, editing and manipulation
of the segments of interest. Sticking to this prevalent rotoscoping paradigm,
we propose a novel framework to capture and track the visual aspect of an
arbitrary object in a scene, given a first closed outline of this object. This
model combines a collection of local foreground/background appearance models
spread along the outline, a global appearance model of the enclosed object and
a set of distinctive foreground landmarks. The structure of this rich
appearance model allows simple initialization, efficient iterative optimization
with exact minimization at each step, and on-line adaptation in videos. We
demonstrate qualitatively and quantitatively the merit of this framework
through comparisons with tools based on either dynamic segmentation with a
closed curve or pixel-wise binary labelling.
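Purely as a reading aid, a hedged Python sketch of how an energy over a closed outline might combine the three cue families listed above; the term definitions and weights are assumptions, not the paper's formulation.

    def outline_energy(local_costs, global_cost, landmark_cost,
                       w_local=1.0, w_global=1.0, w_landmark=1.0):
        # Local foreground/background costs summed along the outline, plus a
        # global appearance cost for the enclosed object and a landmark term.
        return (w_local * sum(local_costs)
                + w_global * global_cost
                + w_landmark * landmark_cost)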
An evaluation of recent local image descriptors for real-world applications of image matching
This paper discusses and compares the best and most recent local descriptors, evaluating them on increasingly complex image matching tasks, encompassing planar and non-planar scenarios under severe viewpoint changes. This evaluation, aimed at assessing descriptor suitability for real-world applications, leverages the concept of approximated overlap error as a means to naturally extend to non-planar scenes the standard metric used for planar scenes. According to the evaluation results, most descriptors exhibit a gradual performance degradation in the transition from planar to non-planar scenes. The best descriptors are those capable of capturing well not only the local image context, but also the global scene structure. Data-driven approaches are shown to have reached the matching robustness and accuracy of the best hand-crafted descriptors.
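For reference, the standard planar metric that the approximated overlap error extends is conventionally defined via intersection-over-union of matched regions; a hedged restatement, with notation assumed rather than taken from the paper:

    \epsilon_{overlap} = 1 - \frac{|R_a \cap H(R_b)|}{|R_a \cup H(R_b)|}

where R_a and R_b are the regions of two matched local features and H is the ground-truth homography between the planar views. The approximated variant generalises this construction to non-planar scenes, where no single homography relates the images.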
Study of Saiga Horn Using High-Performance Liquid Chromatography with Mass Spectrometry
Saiga horns were investigated using modern analytical methods: high-performance liquid chromatography (HPLC) with mass-spectrometric (MS and MS/MS) detection and polyacrylamide gel electrophoresis (PAGE). It could be concluded that the basic proteins of the saiga horns are keratins and collagen. The proteins represented in all samples are keratin type I microfibrillar (from sheep), keratin type II microfibrillar (from sheep), collagen type I (α1) (from bovine) and collagen type I (α2) (from bovine). Free amino acids were determined in all samples not treated by enzyme.
DGPose: Deep Generative Models for Human Body Analysis
Deep generative modelling for human body analysis is an emerging problem with
many interesting applications. However, the latent space learned by such
approaches is typically not interpretable, resulting in less flexibility. In
this work, we present deep generative models for human body analysis in which
the body pose and the visual appearance are disentangled. Such a
disentanglement allows independent manipulation of pose and appearance, and
hence enables applications such as pose-transfer without specific training for
such a task. Our proposed models, the Conditional-DGPose and the Semi-DGPose,
have different characteristics. In the first, body pose labels are taken as
conditioners, from a fully-supervised training set. In the second, our
structured semi-supervised approach allows for pose estimation to be performed
by the model itself and relaxes the need for labelled data. Therefore, the
Semi-DGPose aims for the joint understanding and generation of people in
images. It is not only capable of mapping images to interpretable latent
representations but also able to map these representations back to the image
space. We compare our models with relevant baselines, the ClothNet-Body and the
Pose Guided Person Generation networks, demonstrating their merits on the
Human3.6M, ChictopiaPlus and DeepFashion benchmarks.
Comment: IJCV 2020 special issue on 'Generating Realistic Visual Data of Human Behavior' preprint. Keywords: deep generative models, semi-supervised learning, human pose estimation, variational autoencoders, generative adversarial networks
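A minimal sketch, in PyTorch, of the disentanglement idea: a VAE whose latent code is split into a pose part and an appearance part, so that pose transfer amounts to swapping the pose half between two encodings. The architecture, layer sizes and input resolution (64x64 grayscale) are assumptions for illustration, not the DGPose models themselves.

    import torch
    import torch.nn as nn

    class PoseAppearanceVAE(nn.Module):
        # Toy VAE with a latent code split into pose and appearance halves.
        def __init__(self, z_pose=16, z_app=48):
            super().__init__()
            self.enc = nn.Sequential(nn.Flatten(), nn.Linear(64 * 64, 256), nn.ReLU())
            self.mu = nn.Linear(256, z_pose + z_app)
            self.logvar = nn.Linear(256, z_pose + z_app)
            self.dec = nn.Sequential(nn.Linear(z_pose + z_app, 64 * 64), nn.Sigmoid())
            self.z_pose = z_pose

        def encode(self, x):
            h = self.enc(x)
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
            return z[:, :self.z_pose], z[:, self.z_pose:]  # (pose, appearance)

        def decode(self, pose, appearance):
            return self.dec(torch.cat([pose, appearance], dim=1))

Pose transfer then reads decode(pose_from_image_a, appearance_from_image_b).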
Learning to Simulate Realistic LiDARs
Simulating realistic sensors is a challenging part of data generation for
autonomous systems, often involving carefully handcrafted sensor design, scene
properties, and physics modeling. To alleviate this, we introduce a pipeline
for data-driven simulation of a realistic LiDAR sensor. We propose a model that
learns a mapping between RGB images and corresponding LiDAR features such as
raydrop or per-point intensities directly from real datasets. We show that our
model can learn to encode realistic effects such as dropped points on
transparent surfaces or high intensity returns on reflective materials. When
applied to naively raycasted point clouds provided by off-the-shelf simulator
software, our model enhances the data by predicting intensities and removing
points based on the scene's appearance to match a real LiDAR sensor. We use our
technique to learn models of two distinct LiDAR sensors and use them to improve
simulated LiDAR data accordingly. Through a sample task of vehicle
segmentation, we show that enhancing simulated point clouds with our technique
improves downstream task performance.
Comment: IROS 2022 paper
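A hedged Python sketch of the enhancement step as described: given per-point intensity predictions and raydrop probabilities (produced in the paper by a learned RGB-to-LiDAR model, assumed given here as plain arrays), attach intensities and stochastically drop points from a naively raycasted cloud.

    import numpy as np

    def enhance_simulated_cloud(points, intensity_pred, drop_prob, rng=None):
        # points: (N, 3) xyz from a raycasting simulator
        # intensity_pred: (N,) predicted per-point intensities
        # drop_prob: (N,) predicted probability that a real sensor drops the ray
        rng = rng if rng is not None else np.random.default_rng()
        keep = rng.random(len(points)) >= drop_prob  # stochastic raydrop
        # Result: (N_kept, 4) with columns x, y, z, intensity
        return np.column_stack([points[keep], intensity_pred[keep]])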