Learning to Personalize in Appearance-Based Gaze Tracking
Personal variations severely limit the performance of appearance-based gaze
tracking. Adapting to these variations using standard neural network model
adaptation methods is difficult. The problems range from overfitting, due to
small amounts of training data, to underfitting, due to restrictive model
architectures. We tackle these problems by introducing the SPatial Adaptive
GaZe Estimator (SPAZE). By modeling personal variations as a low-dimensional
latent parameter space, SPAZE provides just enough adaptability to capture the
range of personal variations without being prone to overfitting. Calibrating
SPAZE for a new person reduces to solving a small optimization problem. SPAZE
achieves an error of 2.70 degrees with 9 calibration samples on MPIIGaze,
improving on the state-of-the-art by 14 %. We contribute to gaze tracking
research by empirically showing that personal variations are well-modeled as a
3-dimensional latent parameter space for each eye. We show that this
low-dimensionality is expected by examining model-based approaches to gaze
tracking. We also show that accurate head pose-free gaze tracking is possible
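The calibration idea in the abstract above (fitting a small per-person latent vector while the rest of the model stays fixed) can be illustrated with a toy sketch. This is not the SPAZE network: `base_model`, `correction_features`, the linear correction, and the 3x3 calibration grid are all hypothetical stand-ins, chosen only to show that 9 samples are enough to recover a 3-dimensional parameter by a small least-squares optimization.

```python
LATENT_DIM = 3  # the abstract reports a 3-dimensional latent space per eye


def base_model(x):
    """Hypothetical stand-in for a frozen, person-independent gaze predictor."""
    return 10.0 * x[0] + 5.0 * x[1]


def correction_features(x):
    """Hypothetical features that the personal latent vector multiplies."""
    return (1.0, x[0], x[1])


def predict(x, z):
    """Frozen base prediction plus a linear, person-specific correction."""
    return base_model(x) + sum(zk * pk for zk, pk in zip(z, correction_features(x)))


def calibrate(samples, steps=200, lr=0.3):
    """Fit only the latent vector z by gradient descent on mean squared error."""
    z = [0.0] * LATENT_DIM
    n = len(samples)
    for _ in range(steps):
        grad = [0.0] * LATENT_DIM
        for x, target in samples:
            residual = predict(x, z) - target
            for k, pk in enumerate(correction_features(x)):
                grad[k] += 2.0 * residual * pk / n
        z = [zk - lr * gk for zk, gk in zip(z, grad)]
    return z


# Nine noise-free calibration samples on a 3x3 grid (an assumption for the demo).
true_z = [1.5, -0.7, 0.4]
grid = [(a, b) for a in (-1.0, 0.0, 1.0) for b in (-1.0, 0.0, 1.0)]
samples = [(x, predict(x, true_z)) for x in grid]
estimated_z = calibrate(samples)
```

Because only three numbers are optimized, there is little room to overfit nine samples, which is the trade-off the abstract describes; a real system would optimize the latent vector through a trained network rather than a linear toy model.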
MScMS-II: an innovative IR-based indoor coordinate measuring system for large-scale metrology applications
Driven by the strong current interest in large-scale metrology across many fields of manufacturing, technologies and techniques for dimensional measurement have recently improved substantially. Ease of use, logistic and economic issues, and metrological performance are playing an increasingly important role among system requirements. This paper describes the architecture and working principles of a novel infrared (IR) optical-based system, designed to perform low-cost and easy indoor coordinate measurements of large-size objects. The system consists of a distributed network-based layout, whose modularity allows it to fit working volumes of different sizes and shapes by suitably increasing the number of sensing units. Unlike existing spatially distributed metrological instruments, the remote sensor devices are intended to provide embedded data processing capabilities, in order to share the overall computational load. The overall system functionalities, including distributed layout configuration, network self-calibration, 3D point localization, and measurement data processing, are discussed. A preliminary metrological characterization of system performance, based on experimental testing, is also presente
Autocalibration with the Minimum Number of Cameras with Known Pixel Shape
In 3D reconstruction, the recovery of the calibration parameters of the
cameras is paramount since it provides metric information about the observed
scene, e.g., measures of angles and ratios of distances. Autocalibration
enables the estimation of the camera parameters without using a calibration
device, but by enforcing simple constraints on the camera parameters. In the
absence of information about the internal camera parameters such as the focal
length and the principal point, the knowledge of the camera pixel shape is
usually the only available constraint. Given a projective reconstruction of a
rigid scene, we address the problem of the autocalibration of a minimal set of
cameras with known pixel shape and otherwise arbitrarily varying intrinsic and
extrinsic parameters. We propose an algorithm that only requires 5 cameras (the
theoretical minimum), thus halving the number of cameras required by previous
algorithms based on the same constraint. To this end, we introduce as our
basic geometric tool the six-line conic variety (SLCV), consisting of the set
of planes intersecting six given lines of 3D space in points of a conic. We
show that the set of solutions of the Euclidean upgrading problem for three
cameras with known pixel shape can be parameterized in a computationally
efficient way. This parameterization is then used to solve autocalibration from
five or more cameras, reducing the three-dimensional search space to a
two-dimensional one. We provide experiments with real images showing the good
performance of the technique. Comment: 19 pages, 14 figures, 7 tables, J. Math. Imaging Vi
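As a hedged illustration of the "known pixel shape" constraint named in the abstract above, assume the common special case of square pixels (zero skew, unit aspect ratio). The symbols $f$, $u_0$, $v_0$, $K$ and $\omega$ are notation introduced here, not taken from the abstract. Under that assumption, the image of the absolute conic $\omega$ has a fixed pattern regardless of the unknown focal length and principal point:

```latex
K = \begin{pmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
\omega = K^{-\top} K^{-1}
= \frac{1}{f^{2}}
\begin{pmatrix}
1 & 0 & -u_0 \\
0 & 1 & -v_0 \\
-u_0 & -v_0 & f^{2} + u_0^{2} + v_0^{2}
\end{pmatrix},
\qquad
\omega_{12} = 0, \quad \omega_{11} = \omega_{22}.
```

The two relations on the right hold for any $f$, $u_0$ and $v_0$, so each camera contributes two constraints even when its other intrinsic parameters vary freely.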
Study and Characterization of a Camera-based Distributed System for Large-Volume Dimensional Metrology Applications
Large-Volume Dimensional Metrology (LVDM) deals with the dimensional inspection of large objects, with dimensions on the order of tens to hundreds of meters. Typical LVDM applications concern the assembly and disassembly phases of large objects in industrial engineering. Based on different technologies and measurement principles, a wealth of LVDM systems has been proposed and developed in the literature, including optical systems such as the laser tracker and laser radar, and mechanical systems such as the gantry CMM and the multi-joint artificial arm CMM. The main existing LVDM systems can be divided into two categories, centralized and distributed, according to their hardware configuration. A centralized system is a stand-alone unit that works independently to measure a spatial point, while a distributed system consists of a series of sensors that work cooperatively to measure a spatial point, and usually an individual sensor cannot measure the coordinates on its own. Representative distributed systems in the literature include iGPS and MScMS-II, among others. The current trend in LVDM systems seems to be toward distributed systems, which demonstrate many advantages that distinguish them from conventional centralized systems
Camera Planning and Fusion in a Heterogeneous Camera Network
Wide-area camera networks are becoming more and more common. They have a wide range of commercial and military applications, from video surveillance to smart homes and from traffic monitoring to anti-terrorism. The design of such a camera network is a challenging problem due to the complexity of the environment, self and mutual occlusion of moving objects, diverse sensor properties and a myriad of performance metrics for different applications. In this dissertation, we consider two such challenges: camera planning and camera fusion. Camera planning is to determine the optimal number and placement of cameras for a target cost function. Camera fusion describes the task of combining images collected by heterogeneous cameras in the network to extract information pertinent to a target application.
I tackle the camera planning problem by developing a new unified framework based on binary integer programming (BIP) to relate the network design parameters and the performance goals of a variety of camera network tasks. Most of the BIP formulations are NP-hard problems, and various approximate algorithms have been proposed in the literature. In this dissertation, I develop a comprehensive framework for comparing the entire spectrum of approximation algorithms, from Greedy and Markov Chain Monte Carlo (MCMC) to various relaxation techniques. The key contribution is to provide not only a generic formulation of the camera planning problem but also novel approaches to adapt the formulation to powerful approximation schemes, including Simulated Annealing (SA) and Semi-Definite Programming (SDP). The accuracy, efficiency and scalability of each technique are analyzed and compared in depth. Extensive experimental results are provided to illustrate the strengths and weaknesses of each method.
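To make the planning problem concrete, here is a minimal sketch of one member of the Greedy family mentioned above, under the assumption that the task is to cover a discrete set of target points with as few cameras as possible (a set-cover style BIP). It is not the dissertation's formulation; `greedy_camera_placement`, the candidate names, and the coverage sets are hypothetical.

```python
def greedy_camera_placement(coverage, targets):
    """Pick cameras one at a time, each time taking the candidate that
    covers the most still-uncovered targets (greedy set cover).

    coverage: dict mapping a candidate camera name to the set of targets it sees.
    targets:  iterable of all targets that should be covered.
    Returns (chosen cameras in pick order, targets left uncovered).
    """
    uncovered = set(targets)
    chosen = []
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(coverage), key=lambda cam: len(coverage[cam] & uncovered))
        gained = coverage[best] & uncovered
        if not gained:
            break  # no candidate sees any remaining target
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical candidate positions and the target points each one can see.
coverage = {
    "cam_a": {1, 2, 3},
    "cam_b": {3, 4},
    "cam_c": {4, 5, 6},
    "cam_d": {1, 6},
}
chosen, leftover = greedy_camera_placement(coverage, range(1, 7))
print(chosen, leftover)
```

The greedy rule is cheap and simple but carries no guarantee of finding the smallest camera set, which is one reason the abstract compares it with MCMC and relaxation-based methods.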
The second problem, heterogeneous camera fusion, is very complex. Information can be fused at different levels, from pixels or voxels to semantic objects, with large variation in accuracy, communication and computation costs. My focus is on the geometric transformation of shapes between objects observed at different camera planes. This so-called geometric fusion approach usually provides the most reliable fusion, at the expense of high computation and communication costs. To tackle the complexity, I propose a hierarchy of camera models with different levels of complexity to balance the effectiveness and efficiency of the camera network operation. Different calibration and registration methods are then proposed for each camera model. Finally, I provide two specific examples to demonstrate the effectiveness of the model: 1) a fusion system to improve the segmentation of the human body in a camera network consisting of thermal and regular visible-light cameras, and 2) a view-dependent rendering system that combines information from depth and regular cameras to collect scene information and generate new views in real time
Why Having 10,000 Parameters in Your Camera Model is Better Than Twelve
Camera calibration is an essential first step in setting up 3D Computer
Vision systems. Commonly used parametric camera models are limited to a few
degrees of freedom and thus often do not fit complex real lens distortion
well. In contrast, generic camera models allow for very accurate
calibration due to their flexibility. Despite this, they have seen little use
in practice. In this paper, we argue that this should change. We propose a
calibration pipeline for generic models that is fully automated, easy to use,
and can act as a drop-in replacement for parametric calibration, with a focus
on accuracy. We compare our results to parametric calibrations. Considering
stereo depth estimation and camera pose estimation as examples, we show that
the calibration error acts as a bias on the results. We thus argue that in
contrast to current common practice, generic models should be preferred over
parametric ones whenever possible. To facilitate this, we released our
calibration pipeline at https://github.com/puzzlepaint/camera_calibration,
making both easy-to-use and accurate camera calibration available to everyone. Comment: 15 pages, 12 figures, accepted to CVPR 2020 as an ora