Joint segmentation of color and depth data based on splitting and merging driven by surface fitting
This paper proposes a segmentation scheme based on the joint use of color and depth data together with a 3D surface estimation scheme. First, a set of multi-dimensional vectors is built from color, geometry and surface orientation information. Normalized cuts spectral clustering is then applied to recursively split the scene into two parts, yielding an over-segmentation. This procedure is followed by a recursive merging stage in which close segments belonging to the same object are joined together. At each step of both procedures, a NURBS model is fitted to the computed segments, and the accuracy of the fit is used as a measure of the plausibility that a segment represents a single surface or object. By comparing this accuracy to that of the previous step, it is possible to determine whether each splitting or merging operation leads to a better scene representation, and consequently whether to perform it. Experimental results show that the proposed method provides an accurate and reliable segmentation.
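The merge test described in this abstract (accept an operation only if the surface fit does not get worse) can be illustrated with a minimal sketch. This is not the authors' implementation: a least-squares plane stands in for the NURBS model, and the names `plane_fit_sse`, `should_merge`, `slack` and `abs_tol` are hypothetical choices made for the example.

```python
from math import sqrt


def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def plane_fit_sse(points):
    """Sum of squared residuals of the least-squares plane z = a*x + b*y + c.

    ASSUMPTION: a plane is a simple stand-in for the NURBS surface
    mentioned in the abstract.
    """
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    # Normal equations of the least-squares problem.
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [sxz, syz, sz]
    d = _det3(lhs)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate for a plane fit")
    # Solve the 3x3 system with Cramer's rule.
    coeffs = []
    for col in range(3):
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(_det3(m) / d)
    a, b, c = coeffs
    return sum((a * x + b * y + c - z) ** 2 for x, y, z in points)


def should_merge(seg_a, seg_b, slack=0.1, abs_tol=1e-6):
    """Merge only if one surface explains both segments about as well as two.

    `slack` and `abs_tol` are hypothetical tolerances for this sketch.
    """
    n = len(seg_a) + len(seg_b)
    rmse_separate = sqrt((plane_fit_sse(seg_a) + plane_fit_sse(seg_b)) / n)
    rmse_merged = sqrt(plane_fit_sse(seg_a + seg_b) / n)
    return rmse_merged <= rmse_separate * (1 + slack) + abs_tol
```

A tolerance is needed because a single surface can never fit the union better than two independent surfaces fit the parts; the question is only whether the loss in accuracy is negligible.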
The Application of Preconditioned Alternating Direction Method of Multipliers in Depth from Focal Stack
The post-capture refocusing effect in smartphone cameras is achievable by using
focal stacks. However, the accuracy of this effect depends entirely on the
combination of the depth layers in the stack. The accuracy of the extended
depth of field effect in this application can be improved significantly by
computing an accurate depth map which has been an open issue for decades. To
tackle this issue, in this paper, a framework is proposed based on
Preconditioned Alternating Direction Method of Multipliers (PADMM) for depth
from the focal stack and synthetic defocus application. In addition to its
ability to provide high structural accuracy and occlusion handling, the
optimization function of the proposed method can, in fact, converge faster and
better than state of the art methods. The evaluation has been done on 21 sets
of focal stacks and the optimization function has been compared against 5 other
methods. Preliminary results indicate that the proposed method performs better
in terms of structural accuracy and optimization than the current
state-of-the-art methods.
Comment: 15 pages, 8 figures
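For context, the simplest depth-from-focal-stack baseline picks, for each pixel, the slice in which that pixel looks sharpest. The sketch below shows only this naive baseline, not the PADMM framework proposed in the paper; the function names and the choice of an absolute Laplacian as the focus measure are assumptions made for illustration.

```python
def laplacian_abs(img):
    """Absolute 4-neighbour Laplacian as a simple focus measure.

    `img` is a list of rows of grey values; border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = abs(img[y - 1][x] + img[y + 1][x]
                            + img[y][x - 1] + img[y][x + 1]
                            - 4 * img[y][x])
    return out


def depth_from_focus(stack):
    """Per pixel, return the index of the slice with the highest focus measure.

    Ties (for example on the border, where every measure is 0) resolve to
    the first slice. No regularisation or occlusion handling is performed.
    """
    measures = [laplacian_abs(img) for img in stack]
    h, w = len(stack[0]), len(stack[0][0])
    return [[max(range(len(stack)), key=lambda k: measures[k][y][x])
             for x in range(w)]
            for y in range(h)]
```

A per-pixel argmax like this is noisy in textureless regions, which is the kind of weakness that optimization-based formulations aim to address.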
What Is Around The Camera?
How much does a single image reveal about the environment it was taken in? In
this paper, we investigate how much of that information can be retrieved from a
foreground object, combined with the background (i.e. the visible part of the
environment). Assuming it is not perfectly diffuse, the foreground object acts
as a complexly shaped and far-from-perfect mirror. An additional challenge is
that its appearance confounds the light coming from the environment with the
unknown materials it is made of. We propose a learning-based approach to
predict the environment from multiple reflectance maps that are computed from
approximate surface normals. The proposed method allows us to jointly model the
statistics of environments and material properties. We train our system from
synthesized training data, but demonstrate its applicability to real-world
data. Interestingly, our analysis shows that the information obtained from
objects made of multiple materials is often complementary and leads to
better performance.
Comment: Accepted to ICCV. Project:
http://homes.esat.kuleuven.be/~sgeorgou/multinatillum
LiveCap: Real-time Human Performance Capture from Monocular Video
We present the first real-time human performance capture approach that
reconstructs dense, space-time coherent deforming geometry of entire humans in
general everyday clothing from just a single RGB video. We propose a novel
two-stage analysis-by-synthesis optimization whose formulation and
implementation are designed for high performance. In the first stage, a skinned
template model is jointly fitted to background subtracted input video, 2D and
3D skeleton joint positions found using a deep neural network, and a set of
sparse facial landmark detections. In the second stage, dense non-rigid 3D
deformations of skin and even loose apparel are captured based on a novel
real-time capable algorithm for non-rigid tracking using dense photometric and
silhouette constraints. Our novel energy formulation leverages automatically
identified material regions on the template to model the differing non-rigid
deformation behavior of skin and apparel. The two resulting non-linear
optimization problems per-frame are solved with specially-tailored
data-parallel Gauss-Newton solvers. In order to achieve real-time performance
of over 25Hz, we design a pipelined parallel architecture using the CPU and two
commodity GPUs. Our method is the first real-time monocular approach for
full-body performance capture. Our method yields accuracy comparable to
off-line performance capture techniques, while being orders of magnitude
faster.
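The abstract mentions Gauss-Newton solvers for its per-frame non-linear least-squares problems. As a reminder of what a Gauss-Newton iteration does, here is a minimal sketch on a toy curve-fitting problem. It is unrelated to the LiveCap energy terms and is not data-parallel; the model `y = a * exp(b * x)` and the function name are assumptions chosen for the example.

```python
import math


def gauss_newton_exp_fit(xs, ys, a, b, iters=20):
    """Fit y = a * exp(b * x) by Gauss-Newton, starting from the guess (a, b)."""
    for _ in range(iters):
        # Accumulate J^T J (2x2, symmetric) and J^T r (2-vector).
        jtj00 = jtj01 = jtj11 = 0.0
        jtr0 = jtr1 = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            r = a * e - y        # residual
            ja = e               # d r / d a
            jb = a * x * e       # d r / d b
            jtj00 += ja * ja
            jtj01 += ja * jb
            jtj11 += jb * jb
            jtr0 += ja * r
            jtr1 += jb * r
        det = jtj00 * jtj11 - jtj01 * jtj01
        if abs(det) < 1e-15:
            break  # normal equations are singular; stop with current estimate
        # Solve (J^T J) * delta = -J^T r for the 2x2 case.
        da = (-jtr0 * jtj11 + jtr1 * jtj01) / det
        db = (-jtr1 * jtj00 + jtr0 * jtj01) / det
        a += da
        b += db
    return a, b
```

Each iteration linearises the residuals around the current estimate and solves the resulting normal equations, which is why the method suits problems where this linear solve can be made fast.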
Spatio-temporal Video Parsing for Abnormality Detection
Abnormality detection in video poses particular challenges due to the
infinite size of the class of all irregular objects and behaviors. Thus no (or
by far not enough) abnormal training samples are available and we need to find
abnormalities in test data without actually knowing what they are.
Nevertheless, the prevailing concept of the field is to directly search for
individual abnormal local patches or image regions independently of one another. To
address this problem, we propose a method for joint detection of abnormalities
in videos by spatio-temporal video parsing. The goal of video parsing is to
find a set of indispensable normal spatio-temporal object hypotheses that
jointly explain all the foreground of a video, while, at the same time, being
supported by normal training samples. Consequently, we avoid a direct detection
of abnormalities and discover them indirectly as those hypotheses which are
needed for covering the foreground without finding an explanation for
themselves by normal samples. Abnormalities are localized by MAP inference in a
graphical model and we solve it efficiently by formulating it as a convex
optimization problem. We experimentally evaluate our approach on several
challenging benchmark sets, improving over the state-of-the-art on all standard
benchmarks both in terms of abnormality classification and localization.
Comment: 15 pages, 12 figures, 3 tables