Probabilistic RGB-D Odometry based on Points, Lines and Planes Under Depth Uncertainty
This work proposes a robust visual odometry method for structured
environments that combines point features with line and plane segments
extracted from an RGB-D camera. Noisy depth maps are processed by a
probabilistic depth fusion framework based on Mixtures of Gaussians to denoise
and derive the depth uncertainty, which is then propagated throughout the
visual odometry pipeline. Probabilistic 3D plane and line fitting solutions are
used to model the uncertainties of the feature parameters and pose is estimated
by combining the three types of primitives based on their uncertainties.
Performance evaluation on RGB-D sequences collected in this work and two public
RGB-D datasets: TUM and ICL-NUIM show the benefit of using the proposed depth
fusion framework and combining the three feature-types, particularly in scenes
with low-textured surfaces, dynamic objects, and missing depth measurements.
Comment: Major update: more results, depth filter released as open source, 34 pages
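The per-pixel fusion step can be illustrated with a simplified sketch: the paper fuses depth with Mixtures of Gaussians, but the core idea of weighting measurements by their uncertainty is already visible in plain inverse-variance (single-Gaussian) fusion. All names below are illustrative, not the authors' code.

```python
import numpy as np

def fuse_depth(measurements, variances):
    """Fuse independent Gaussian depth readings of one pixel.

    Inverse-variance weighting yields the maximum-likelihood fused
    depth and a reduced uncertainty, which is the quantity that gets
    propagated through the rest of an odometry pipeline.
    """
    z = np.asarray(measurements, dtype=float)
    var = np.asarray(variances, dtype=float)
    precision = 1.0 / var                  # precision = inverse variance
    fused_var = 1.0 / precision.sum()      # uncertainty shrinks with fusion
    fused_depth = fused_var * (precision * z).sum()
    return fused_depth, fused_var

# Two readings of the same pixel: 2.0 m (var 0.04) and 2.2 m (var 0.16);
# the more certain reading dominates the fused estimate.
d, v = fuse_depth([2.0, 2.2], [0.04, 0.16])
```

The fused variance (0.032) is smaller than either input variance, which is what lets later pipeline stages trust fused depths more than raw ones.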
CNN based dense underwater 3D scene reconstruction by transfer learning using bubble database
Dense 3D shape acquisition of swimming humans or live fish is an important
research topic for sports, biological science, and other fields. Active stereo
sensors are usually used for this purpose in air, but they cannot be applied
to underwater environments because of refraction, strong light attenuation,
and severe interference from bubbles. Passive stereo is a simple solution for
capturing dynamic scenes underwater, but it cannot recover shapes with
textureless surfaces or irregular reflections. Recently, a stereo camera pair
with a pattern projector that adds artificial textures to the objects was
proposed. However, to use such a system underwater, several disturbances must
be compensated for, namely water fluctuation and bubbles. A straightforward
solution is to use a convolutional neural network for stereo to cancel the
effects of bubbles and/or water fluctuation. Since it is not easy to train a
CNN on a small database with large variation, we develop a special bubble
generation device to efficiently create a real bubble database with multiple
bubble sizes and densities. In addition, we propose a transfer learning
technique for a multi-scale CNN to effectively remove bubbles and projected
patterns on the object. Further, we develop a real system and capture live
swimming humans, which has not been done before. Experiments are conducted to
show the effectiveness of our method compared with state-of-the-art
techniques.
Comment: IEEE Winter Conference on Applications of Computer Vision. arXiv admin note: text overlap with arXiv:1808.0834
Facial Expressions Tracking and Recognition: Database Protocols for Systems Validation and Evaluation
Each human face is unique. It has its own shape, topology, and distinguishing
features. As such, developing and testing facial tracking systems are
challenging tasks. The existing face recognition and tracking algorithms in
Computer Vision mainly specify concrete situations according to particular
goals and applications, requiring validation methodologies with data that fits
their purposes. However, a database that covers all possible variations of
external and internal factors does not exist, increasing researchers' work in
acquiring their own data or compiling groups of databases.
To address this shortcoming, we propose a methodology for facial data
acquisition through definition of fundamental variables, such as subject
characteristics, acquisition hardware, and performance parameters. Following
this methodology, we also propose two protocols that allow the capturing of
facial behaviors under uncontrolled and real-life situations. As validation, we
executed both protocols, which led to the creation of two sample databases: FdMiee
(Facial database with Multi input, expressions, and environments) and FACIA
(Facial Multimodal database driven by emotional induced acting).
Using different types of hardware, FdMiee captures facial information under
environmental and facial behaviors variations. FACIA is an extension of FdMiee
introducing a pipeline to acquire additional facial behaviors and speech using
an emotion-acting method. Therefore, this work eases the creation of adaptable
databases according to an algorithm's requirements and applications, leading to
simplified validation and testing processes.
Comment: 10 pages, 6 images, Computers & Graphics
Towards Coordinated Bandwidth Adaptations for Hundred-Scale 3D Tele-Immersive Systems
3D tele-immersion improves the state of collaboration among geographically
distributed participants. Unlike traditional 2D video, a 3D tele-immersive
system employs multiple 3D cameras at each physical site to cover a much
larger field of view, generating a very large amount of stream data. One of the
major challenges is how to efficiently transmit these bulky 3D streaming data
to bandwidth-constrained sites. In this paper, we study an adaptive Human
Visual System (HVS) -compliant bandwidth management framework for efficient
delivery of hundred-scale streams produced from distributed 3D tele-immersive
sites to a receiver site with limited bandwidth budget. Our adaptation
framework exploits the semantic link between the HVS and the multiple 3D streams in the 3D
tele-immersive environment. We developed TELEVIS, a visual simulation tool to
showcase an HVS-aware tele-immersive system in realistic cases. Our evaluation
results show that the proposed adaptation can improve the total quality per
unit of bandwidth used to deliver streams in 3D tele-immersive systems.
Comment: Springer Multimedia Systems Journal, 14 pages, March 201
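A minimal sketch of HVS-aware bandwidth adaptation, assuming hypothetical per-stream perceptual weights (the actual framework's HVS semantics are more elaborate): give every stream its minimum rate, then spend the remaining budget greedily on the streams with the highest perceptual value.

```python
def allocate_bandwidth(streams, budget):
    """Greedy HVS-weighted adaptation: each stream dict carries an
    illustrative perceptual 'value' plus 'min'/'max' rates; the budget
    is spent on the most valuable streams first."""
    alloc = {s['id']: s['min'] for s in streams}       # guarantee minimums
    remaining = budget - sum(alloc.values())
    for s in sorted(streams, key=lambda s: s['value'], reverse=True):
        extra = min(s['max'] - s['min'], max(0, remaining))
        alloc[s['id']] += extra                        # top up by priority
        remaining -= extra
    return alloc
```

With a budget of 6 Mbps and two streams of unequal perceptual value, the high-value stream is topped up to its maximum while the other stays at its minimum rate.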
Learning High Dynamic Range from Outdoor Panoramas
Outdoor lighting has extremely high dynamic range. This makes the process of
capturing outdoor environment maps notoriously challenging since special
equipment must be used. In this work, we propose an alternative approach. We
first capture lighting with a regular, LDR omnidirectional camera, and aim to
recover the HDR after the fact via a novel, learning-based inverse tonemapping
method. We propose a deep autoencoder framework which regresses linear, high
dynamic range data from non-linear, saturated, low dynamic range panoramas. We
validate our method through a wide set of experiments on synthetic data, as
well as on a novel dataset of real photographs with ground truth. Our approach
finds applications in a variety of settings, ranging from outdoor light capture
to image matching.
Comment: 8 pages + 2 pages of citations, 10 figures. Accepted as an oral paper at ICCV 201
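For contrast with the learned approach, a naive global inverse tonemapper (the kind of baseline such a network improves on) can be sketched in a few lines; the gamma, boost factor, and saturation threshold below are illustrative assumptions, not the paper's values.

```python
import numpy as np

def naive_inverse_tonemap(ldr, gamma=2.2, boost=4.0, thresh=0.95):
    """Linearize with an assumed display gamma, then scale pixels near
    saturation to stand in for the lost highlight energy."""
    ldr = np.clip(np.asarray(ldr, dtype=float), 0.0, 1.0)
    linear = ldr ** gamma            # undo the display non-linearity
    linear[ldr >= thresh] *= boost   # crude expansion of clipped highlights
    return linear
```

The weakness of this baseline is exactly what the learned method targets: the hard threshold cannot recover structure inside saturated sky or sun regions.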
SNR-based adaptive acquisition method for fast Fourier ptychographic microscopy
Fourier ptychographic microscopy (FPM) is a computational imaging technique
with both high resolution and large field-of-view. However, the effective
numerical aperture (NA) achievable with a typical LED panel is uncertain and
is usually determined by repeated tests with different illumination NAs. The
imaging quality of each raw image is usually judged by visual assessment,
which is subjective and inaccurate, especially for dark-field images.
Moreover, the acquisition process is very time-consuming. In this paper, we
propose an SNR-based adaptive acquisition method for the quantitative
evaluation and adaptive collection of each raw image according to its
signal-to-noise ratio (SNR), to improve the FPM's acquisition efficiency and
automatically obtain the maximum achievable NA, reducing the time spent on
collection, storage, and subsequent computation. The widely used EPRY-FPM
algorithm is applied without adding any algorithmic complexity or
computational burden. The performance is demonstrated on both USAF targets
and biological samples with different imaging sensors, which exhibit either
Poisson or Gaussian noise models. Further combined with a sparse-LED
strategy, the number of collected images can be reduced to around 25 frames,
whereas the former approach needs 361 images, a reduction of over 90%. This
method makes FPM more practical and automatic, and can also be used in
different configurations of FPM.
Comment: 11 pages, 6 figures
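The acquisition loop can be sketched as follows, assuming a simple power-based SNR estimate with a known noise level; the paper's actual criterion and noise model may differ, and all names here are illustrative.

```python
import numpy as np

def snr_db(image, noise_sigma):
    """Rough SNR estimate for a raw image, assuming a known noise std."""
    signal_power = np.mean(np.asarray(image, dtype=float) ** 2)
    return 10.0 * np.log10(signal_power / noise_sigma ** 2)

def adaptive_na(images_by_na, noise_sigma, snr_threshold_db=3.0):
    """Walk through raw images captured at increasing illumination NA
    and stop once the SNR falls below threshold; the last NA kept is
    taken as the maximum usable NA."""
    max_na = None
    for na, img in images_by_na:             # assumed sorted by NA
        if snr_db(img, noise_sigma) < snr_threshold_db:
            break                            # dark-field image too noisy
        max_na = na
    return max_na
```

Stopping at the SNR threshold is what saves collection, storage, and reconstruction time: dark-field images beyond the usable NA are never acquired.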
Fully-automatic inverse tone mapping algorithm based on dynamic mid-level tone mapping
High Dynamic Range (HDR) displays can show images with higher color contrast levels and peak luminosities than common Low Dynamic Range (LDR) displays. However, most existing video content is recorded and/or graded in LDR format. To show LDR content on HDR displays, it needs to be up-scaled using a so-called inverse tone mapping algorithm. Several techniques for inverse tone mapping have been proposed in recent years, ranging from simple approaches based on global and local operators to more advanced algorithms such as neural networks. Some of the drawbacks of existing techniques are the need for human intervention, the high computation time of the more advanced algorithms, the limitation to low peak brightness, and the lack of preservation of artistic intentions. In this paper, we propose a fully-automatic inverse tone mapping operator based on mid-level mapping that is capable of real-time video processing. Our proposed algorithm expands LDR images into HDR images with peak brightness over 1000 nits while preserving the artistic intentions inherent to the HDR domain. We assessed our results using the full-reference objective quality metrics HDR-VDP-2.2 and DRIM, and by carrying out a subjective pair-wise comparison experiment. We compared our results with those obtained with the most recent methods in the literature. Experimental results demonstrate that our proposed method outperforms the current state of the art among simple inverse tone mapping methods, and its performance is similar to that of more complex and time-consuming advanced techniques.
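A toy global expansion anchored at a mid-level value illustrates the general idea (the paper's operator is more sophisticated and content-adaptive); every constant below is hypothetical.

```python
def midlevel_expand(l, peak=1000.0, mid=0.5, mid_nits=100.0):
    """Map normalized LDR luminance l in [0, 1] to display nits so that
    the mid-level anchor keeps its brightness while highlights are
    expanded toward the HDR display peak."""
    if l <= mid:
        return mid_nits * (l / mid)                  # linear below the anchor
    t = (l - mid) / (1.0 - mid)                      # 0..1 above the anchor
    return mid_nits + (peak - mid_nits) * t ** 2     # accelerate into highlights
```

Keeping mid-tones near their LDR appearance while pushing only the highlights toward the peak is one simple way to preserve the graded look while exploiting the HDR range.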
Snapshot Difference Imaging using Time-of-Flight Sensors
Computational photography encompasses a diversity of imaging techniques, but
one of the core operations performed by many of them is to compute image
differences. An intuitive approach to computing such differences is to capture
several images sequentially and then process them jointly. Usually, this
approach leads to artifacts when recording dynamic scenes. In this paper, we
introduce a snapshot difference imaging approach that is directly implemented
in the sensor hardware of emerging time-of-flight cameras. With a variety of
examples, we demonstrate that the proposed snapshot difference imaging
technique is useful for direct-global illumination separation, for direct
imaging of spatial and temporal image gradients, for direct depth edge imaging,
and more.
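The benefit over sequential capture can be sketched as follows: both signals are integrated within the same exposure and differenced at the pixel, so the subtraction is never corrupted by inter-frame motion and incurs only one readout. The simulation below is illustrative, not the sensor's actual circuit model.

```python
import numpy as np

def snapshot_difference(signal_a, signal_b, read_noise_sigma=0.0, seed=0):
    """Model a two-tap ToF pixel: taps A and B integrate during one
    exposure and the sensor reads out their difference directly."""
    diff = (np.asarray(signal_a, dtype=float)
            - np.asarray(signal_b, dtype=float))
    if read_noise_sigma > 0:
        rng = np.random.default_rng(seed)
        diff += rng.normal(0.0, read_noise_sigma, diff.shape)  # one readout
    return diff
```

In a sequential two-shot scheme the scene can change between captures, whereas here the two taps see the same instant of the scene by construction.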
Statistical methods for the quantitative genetic analysis of high-throughput phenotyping data
The advent of plant phenomics, coupled with the wealth of genotypic data
generated by next-generation sequencing technologies, provides exciting new
resources for investigations into and improvement of complex traits. However,
these new technologies also bring new challenges in quantitative genetics,
namely, a need for the development of robust frameworks that can accommodate
these high-dimensional data. In this chapter, we describe methods for the
statistical analysis of high-throughput phenotyping (HTP) data with the goal of
enhancing the prediction accuracy of genomic selection (GS). Following the
Introduction in Section 1, Section 2 discusses field-based HTP, including the
use of unmanned aerial vehicles and light detection and ranging, as well as how
we can achieve increased genetic gain by utilizing image data derived from HTP.
Section 3 considers extending commonly used GS models to integrate HTP data as
covariates associated with the principal trait response, such as yield.
Particular focus is placed on single-trait, multi-trait, and genotype by
environment interaction models. One unique aspect of HTP data is that phenomics
platforms often produce large-scale data with high spatial and temporal
resolution for capturing dynamic growth, development, and stress responses.
Section 4 discusses the utility of a random regression model for performing
longitudinal GS. The chapter concludes with a discussion of some outstanding
issues.
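A random regression model projects each genotype's longitudinal trajectory onto a small polynomial basis; a minimal sketch using a Legendre basis (a common choice in this literature) might look like the following, with the per-genotype least-squares fit standing in for what a full model would treat as random effects.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_basis(times, order=2):
    """Legendre polynomial basis on time standardized to [-1, 1], the
    usual covariate construction for random regression models."""
    t = np.asarray(times, dtype=float)
    t_std = 2.0 * (t - t.min()) / (t.max() - t.min()) - 1.0
    cols = [legendre.legval(t_std, np.eye(order + 1)[k])
            for k in range(order + 1)]            # k-th Legendre polynomial
    return np.stack(cols, axis=1)

def fit_growth_curve(times, phenotypes, order=2):
    """Least-squares fit of one genotype's trajectory coefficients."""
    X = legendre_basis(times, order)
    coef, *_ = np.linalg.lstsq(X, np.asarray(phenotypes, float), rcond=None)
    return coef
```

Because the coefficients describe a continuous curve, genomic predictions can be made at any time point, not just the days on which the HTP platform happened to fly.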
Feature-based visual odometry prior for real-time semi-dense stereo SLAM
Robust and fast motion estimation and mapping is a key prerequisite for
autonomous operation of mobile robots. The goal of performing this task solely
on a stereo pair of video cameras is highly demanding and bears conflicting
objectives: on one hand, the motion has to be tracked fast and reliably, on the
other hand, high-level functions like navigation and obstacle avoidance depend
crucially on a complete and accurate environment representation. In this work,
we propose a two-layer approach for visual odometry and SLAM with stereo
cameras that runs in real-time and combines feature-based matching with
semi-dense direct image alignment. Our method initializes semi-dense depth
estimation, which is computationally expensive, from motion that is tracked by
a fast but robust keypoint-based method. Experiments on public benchmark and
proprietary datasets show that our approach is faster than state-of-the-art
methods without losing accuracy and yields comparable map building
capabilities. Moreover, our approach is shown to handle large inter-frame
motion and illumination changes much more robustly than its direct
counterparts.