Capturing Hands in Action using Discriminative Salient Points and Physics Simulation
Hand motion capture is a popular research field, recently gaining more
attention due to the ubiquity of RGB-D sensors. However, even the most recent
approaches focus on the case of a single isolated hand. In this work, we focus
on hands that interact with other hands or objects and present a framework that
successfully captures motion in such interaction scenarios for both rigid and
articulated objects. Our framework combines a generative model with
discriminatively trained salient points to achieve a low tracking error and
with collision detection and physics simulation to achieve physically plausible
estimates even in case of occlusions and missing visual data. Since all
components are unified in a single objective function which is almost
everywhere differentiable, it can be optimized with standard optimization
techniques. Our approach works for monocular RGB-D sequences as well as setups
with multiple synchronized RGB cameras. For a qualitative and quantitative
evaluation, we captured 29 sequences with a large variety of interactions and
up to 150 degrees of freedom.
Comment: Accepted for publication by the International Journal of Computer
Vision (IJCV) on 16.02.2016 (submitted on 17.10.14). A combination into a
single framework of an ECCV'12 multicamera-RGB and a monocular-RGBD GCPR'14
hand tracking paper with several extensions, additional experiments and
detail
Learned Vertex Descent: A New Direction for 3D Human Model Fitting
We propose a novel optimization-based paradigm for 3D human model fitting on
images and scans. In contrast to existing approaches that directly regress the
parameters of a low-dimensional statistical body model (e.g. SMPL) from input
images, we train an ensemble of per-vertex neural fields network. The network
predicts, in a distributed manner, the vertex descent direction towards the
ground truth, based on neural features extracted at the current vertex
projection. At inference, we employ this network, dubbed LVD, within a
gradient-descent optimization pipeline until its convergence, which typically
occurs in a fraction of a second even when initializing all vertices into a
single point. An exhaustive evaluation demonstrates that our approach is able
to capture the underlying body of clothed people with very different body
shapes, achieving a significant improvement compared to state-of-the-art. LVD
is also applicable to 3D model fitting of humans and hands, for which we show a
significant improvement to the SOTA with a much simpler and faster method.
Comment: Project page: https://www.iri.upc.edu/people/ecorona/lvd
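The inference loop described in the abstract above (start all vertices at a single point, then repeatedly apply a predicted per-vertex descent step until convergence) can be sketched roughly as follows. This is only an illustration under strong assumptions: `predict_descent` is a hypothetical toy stand-in that reads the target directly, whereas the actual LVD network predicts the step from image or scan features; the names, step size, and stopping rule are all invented here.

```python
# Toy sketch of an iterative per-vertex descent loop (not the LVD implementation).

def predict_descent(vertex, target, step=0.5):
    """Hypothetical stand-in for a learned predictor.

    Returns a displacement that moves `vertex` a fraction `step` of the way
    toward `target`. A real model would not have access to `target`.
    """
    return tuple(step * (t - v) for v, t in zip(vertex, target))


def fit_vertices(targets, init=(0.0, 0.0, 0.0), tol=1e-6, max_iters=100):
    """Move every vertex from the shared point `init` toward its target.

    Stops when the largest coordinate update falls below `tol`, or after
    `max_iters` iterations. Returns the vertices and the iterations used.
    """
    vertices = [init for _ in targets]  # all vertices start at a single point
    iteration = 0
    for iteration in range(1, max_iters + 1):
        largest_move = 0.0
        updated = []
        for v, t in zip(vertices, targets):
            d = predict_descent(v, t)
            updated.append(tuple(vi + di for vi, di in zip(v, d)))
            largest_move = max(largest_move, max(abs(di) for di in d))
        vertices = updated
        if largest_move < tol:
            break
    return vertices, iteration


if __name__ == "__main__":
    fitted, n_iters = fit_vertices([(1.0, 2.0, 3.0), (-1.0, 0.0, 2.0)])
    print(n_iters, fitted)
```

The point of the sketch is the control flow only: each vertex is updated independently from a locally predicted direction, so no low-dimensional body parameters are regressed directly.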
Surface camera (SCAM) Light Field Rendering
In this article we present a new variant of the light field representation that supports improved image reconstruction by accommodating sparse correspondence information. This places our representation somewhere between a pure, two-plane parameterized, light field and a lumigraph representation, with its continuous geometric proxy. Our approach factors the rays of a light field into one of two separate classes. All rays consistent with a given correspondence are implicitly represented using a new auxiliary data structure, which we call a surface camera, or scam. The remaining rays of the light field are represented using a standard two-plane parameterized light field. We present an efficient rendering algorithm that combines ray samples from scams with those from the light field. The resulting image reconstructions are noticeably improved over that of a pure light field.
Engineering and Applied Science
Influence of metallic artifact filtering on MEG signals for source localization during interictal epileptiform activity
Objective. Medically intractable epilepsy is a common condition that affects 40% of epileptic patients, who generally have to undergo resective surgery. Magnetoencephalography (MEG) has been increasingly used to identify the epileptogenic foci through equivalent current dipole (ECD) modeling, one of the most accepted methods to obtain an accurate localization of interictal epileptiform discharges (IEDs). Modeling requires that MEG signals are adequately preprocessed to reduce interference, a task that has been greatly improved by the use of blind source separation (BSS) methods. MEG recordings are highly sensitive to metallic interference originating inside the head from implanted intracranial electrodes, dental prostheses, etc., and also coming from external sources such as pacemakers or vagal stimulators. To reduce these artifacts, a BSS-based fully automatic procedure was recently developed and validated, showing an effective reduction of metallic artifacts in simulated and real signals (Migliorelli et al 2015 J. Neural Eng. 12 046001). The main objective of this study was to evaluate its effects on the detection of IEDs and ECD modeling of patients with focal epilepsy and metallic interference. Approach. A comparison between the resulting positions of ECDs was performed: without removing metallic interference; rejecting only channels with large metallic artifacts; and after BSS-based reduction. Measures of dispersion and distance of ECDs were defined to analyze the results. Main results. The relationship between the artifact-to-signal ratio and ECD fitting showed that higher values of metallic interference produced highly scattered dipoles. Results revealed a significant reduction in dispersion using the BSS-based reduction procedure, yielding feasible locations of ECDs in contrast to the other two approaches. Significance. The automatic BSS-based method can be applied to MEG datasets affected by metallic artifacts as a processing step to improve the localization of epileptic foci.
Postprint (published version
Two-View Geometry Scoring Without Correspondences
Camera pose estimation for two-view geometry traditionally relies on RANSAC.
Normally, a multitude of image correspondences leads to a pool of proposed
hypotheses, which are then scored to find a winning model. The inlier count is
generally regarded as a reliable indicator of "consensus". We examine this
scoring heuristic, and find that it favors disappointing models under certain
circumstances. As a remedy, we propose the Fundamental Scoring Network (FSNet),
which infers a score for a pair of overlapping images and any proposed
fundamental matrix. It does not rely on sparse correspondences, but rather
embodies a two-view geometry model through an epipolar attention mechanism that
predicts the pose error of the two images. FSNet can be incorporated into
traditional RANSAC loops. We evaluate FSNet on fundamental and essential matrix
estimation on indoor and outdoor datasets, and establish that FSNet can
successfully identify good poses for pairs of images with few or unreliable
correspondences. Besides, we show that naively combining FSNet with the MAGSAC++
scoring approach achieves state-of-the-art results.
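For context, the traditional inlier-count scoring that the abstract above questions can be sketched as below. This is a minimal illustration of the baseline heuristic, not of FSNet: the Sampson error as the residual, the threshold of 1.0, and all function names are assumptions chosen for the example, and points are given as homogeneous 3-tuples.

```python
# Minimal sketch of inlier-count scoring for fundamental matrix hypotheses.

def mat_vec(M, x):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]


def transpose(M):
    """Transpose a 3x3 matrix given as a list of rows."""
    return [[M[j][i] for j in range(3)] for i in range(3)]


def sampson_error(F, x1, x2):
    """First-order geometric error of the match (x1, x2) under x2^T F x1 = 0."""
    Fx1 = mat_vec(F, x1)
    Ftx2 = mat_vec(transpose(F), x2)
    residual = sum(a * b for a, b in zip(x2, Fx1))
    denom = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return residual ** 2 / denom if denom > 0 else float("inf")


def inlier_count(F, matches, threshold=1.0):
    """Number of correspondences whose Sampson error is below `threshold`."""
    return sum(1 for x1, x2 in matches if sampson_error(F, x1, x2) < threshold)


def best_hypothesis(hypotheses, matches, threshold=1.0):
    """Pick the hypothesis with the most inliers (the classic consensus score)."""
    return max(hypotheses, key=lambda F: inlier_count(F, matches, threshold))


if __name__ == "__main__":
    # Hypotheses for pure sideways and pure vertical camera translation.
    F_x = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
    F_y = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
    matches = [
        ((0, 0, 1), (2, 0, 1)),
        ((1, 3, 1), (4, 3, 1)),
        ((-2, 1, 1), (0, 1, 1)),
        ((0, 0, 1), (0, 5, 1)),  # inconsistent with sideways motion
    ]
    print(inlier_count(F_x, matches), inlier_count(F_y, matches))
```

Because the score depends entirely on the supplied matches, it degrades when correspondences are few or unreliable, which is the regime the abstract targets.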
Neural network approximated Bayesian inference of edge electron density profiles at JET
A neural network (NN) has been trained on the inference of the edge electron density profiles from
measurements of the JET lithium beam emission spectroscopy (Li-BES) diagnostic. The novelty of the
approach resides in the fact that the network has been trained to be a fast surrogate model of an existing
Bayesian model of the diagnostic implemented within the Minerva framework. Previous work showed
the very first application of this method to an x-ray imaging diagnostic at the W7-X experiment, and it
was argued that the method was general enough that it may be applied to different physics systems.
Here, we try to show that the claim made there is valid. What makes the approach general and versatile
is the common definition of different models within the same framework. The network is tested on data
measured during several different pulses and the predictions compared to the results obtained with the
full model Bayesian inference. The NN analysis requires only tens of microseconds on a GPU,
compared to the tens of minutes taken by the full inference. Finally, in relation to what was presented in the
previous work, we demonstrate an improvement in the method of calculation of the network
uncertainties, achieved by using a state-of-the-art deep learning technique based on a variational
inference interpretation of the network training. The advantage of this calculation resides in the fact that
it relies on fewer assumptions, and no extra computation time is required besides the conventional
network evaluation time. This allows the uncertainties to be estimated in real-time applications as well.
European Atomic Energy Community. EURATOM - 2014-2018 and 2019-2020 - 63305
B-CLEAN-SC: CLEAN-SC for broadband sources
This paper presents B-CLEAN-SC, a variation of CLEAN-SC for broadband
sources. In contrast to CLEAN-SC, which "deconvolves" the beamforming map for
each frequency individually, B-CLEAN-SC processes frequency intervals. Instead
of performing a deconvolution iteration at the location of the maximum level,
B-CLEAN-SC performs it at the location of the over-frequency-averaged maximum
to improve the location estimation. The method is validated and compared to
standard CLEAN-SC on synthetic cases, and real-world experiments, for broad-
and narrowband sources. It improves the source reconstruction at low and high
frequencies and suppresses noise, while it only increases the need for memory
but not computational effort.
Comment: revision
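The difference in peak selection described in the abstract above can be illustrated with a small sketch. It covers only the choice of where an iteration is performed, not the deconvolution itself. The assumptions are that each beamforming map is a flat list of levels over the same grid points, and that the frequency average is a plain arithmetic mean; the paper may weight or normalize the maps differently.

```python
# Sketch of per-frequency versus frequency-averaged peak selection.

def per_frequency_peaks(maps):
    """Index of the maximum level in each frequency's map (CLEAN-SC style)."""
    return [max(range(len(m)), key=m.__getitem__) for m in maps]


def averaged_peak(maps):
    """Index of the maximum of the frequency-averaged map (B-CLEAN-SC style)."""
    n_points = len(maps[0])
    mean_map = [sum(m[i] for m in maps) / len(maps) for i in range(n_points)]
    return max(range(n_points), key=mean_map.__getitem__)


if __name__ == "__main__":
    # Three frequencies, four grid points; the per-frequency maxima disagree.
    maps = [
        [0.2, 0.9, 1.0, 0.1],
        [0.1, 1.0, 0.3, 0.2],
        [0.9, 0.8, 0.1, 0.2],
    ]
    print(per_frequency_peaks(maps))  # one location per frequency
    print(averaged_peak(maps))        # a single shared location
```

In this toy input each frequency alone picks a different grid point, while the averaged map yields one location shared by the whole frequency interval.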