1,776 research outputs found

    A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

    Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, the performance has mainly been assessed qualitatively by visually assessing the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model-free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model-free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic. Comment: E. Antonakos and P. Snape contributed equally and have joint second authorship

    Collision Detection and Merging of Deformable B-Spline Surfaces in Virtual Reality Environment

    This thesis presents a computational framework for representing, manipulating and merging rigid and deformable freeform objects in a virtual reality (VR) environment. The core algorithms for collision detection, merging, and physics-based modeling used within this framework assume that all 3D deformable objects are B-spline surfaces. The interactive design tool can be represented as a B-spline surface, an implicit surface or a point, to allow the user a variety of rigid or deformable tools. The collision detection system utilizes the fact that the blending matrices used to discretize the B-spline surface are independent of the position of the control points and, therefore, can be pre-calculated. Complex B-spline surfaces can be generated by merging various B-spline surface patches using the B-spline surface patch merging algorithm presented in this thesis. Finally, the physics-based modeling system uses the mass-spring representation to determine the deformation and the reaction force values provided to the user. This helps to simulate realistic material behaviour of the model and assists the user in validating the design before performing extensive product detailing or finite element analysis using commercially available CAD software. The novelty of the proposed method stems from the pre-calculated blending matrices used to generate the points for graphical rendering, collision detection, merging of B-spline patches, and nodes for the mass-spring system. This approach reduces computational time by avoiding the need to solve complex equations for blending functions of B-splines and to invert large matrices. This alternative approach to mechanical concept design will also help to do away with the need to build prototypes for conceptualization and preliminary validation of the idea, thereby reducing the time and cost of the concept design phase and the wastage of resources.
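
    The pre-calculation idea in the abstract above can be illustrated with a minimal sketch. It assumes a single uniform cubic B-spline patch with a 4x4 control grid; the thesis itself may use other degrees or knot vectors, and the names `blending_matrix` and `evaluate_patch` are hypothetical, chosen only for this illustration. The basis weights depend on the sampled parameter values alone, so they are computed once and reused whenever control points move.

```python
# Uniform cubic B-spline basis matrix (assumed here; not taken from the thesis).
M = [
    [-1 / 6, 3 / 6, -3 / 6, 1 / 6],
    [3 / 6, -6 / 6, 3 / 6, 0.0],
    [-3 / 6, 0.0, 3 / 6, 0.0],
    [1 / 6, 4 / 6, 1 / 6, 0.0],
]


def blending_matrix(samples):
    """Row k holds the four basis weights at parameter u = k / (samples - 1).

    The result depends only on the sampling, not on any control point,
    so it can be computed once up front. Requires samples >= 2.
    """
    rows = []
    for k in range(samples):
        u = k / (samples - 1)
        powers = [u ** 3, u ** 2, u, 1.0]
        rows.append([sum(powers[i] * M[i][j] for i in range(4)) for j in range(4)])
    return rows


def evaluate_patch(bu, bv, ctrl):
    """Evaluate surface points from precomputed weights and a 4x4 control grid.

    ctrl[a][b] is an (x, y, z) tuple; the result is a grid of (x, y, z) tuples.
    """
    return [
        [
            tuple(
                sum(wu[a] * wv[b] * ctrl[a][b][c] for a in range(4) for b in range(4))
                for c in range(3)
            )
            for wv in bv
        ]
        for wu in bu
    ]


# Precompute once, then reuse after every control-point edit.
B = blending_matrix(5)
flat = [[(float(a), float(b), 0.0) for b in range(4)] for a in range(4)]
points = evaluate_patch(B, B, flat)

# Deform the surface by lifting one control point; B is unchanged.
lifted = [row[:] for row in flat]
lifted[1][1] = (1.0, 1.0, 1.0)
deformed = evaluate_patch(B, B, lifted)
```

    Only `evaluate_patch` runs again after a deformation, which is the saving the abstract attributes to pre-calculated blending matrices.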

    NASA: Neural Articulated Shape Approximation

    Efficient representation of articulated objects such as human bodies is an important problem in computer vision and graphics. To efficiently simulate deformation, existing approaches represent 3D objects using polygonal meshes and deform them using skinning techniques. This paper introduces neural articulated shape approximation (NASA), an alternative framework that enables efficient representation of articulated deformable objects using neural indicator functions that are conditioned on pose. Occupancy testing using NASA is straightforward, circumventing the complexity of meshes and the issue of water-tightness. We demonstrate the effectiveness of NASA for 3D tracking applications, and discuss other potential extensions. Comment: ECCV 202

    Literature Review of Bounding Volume Hierarchies Focused on Collision Detection

    A bounding volume is a common method to simplify object representation by using the composition of geometrical shapes that enclose the object; it encapsulates complex objects by means of simple volumes and is widely used in collision detection and in ray tracing for rendering algorithms. Bounding volumes are popular in computer graphics and computational geometry. The most popular are spheres, Oriented Bounding Boxes (OBBs), and Axis-Aligned Bounding Boxes (AABBs); the literature review also covers ellipsoids, cylinders, sphere packing, sphere shells, k-DOPs, convex hulls, point clouds, and minimal bounding boxes, among others. A Bounding Volume Hierarchy is usually a tree in which the object is represented with a tighter fit at every level of the hierarchy. Additionally, each bounding volume has costs associated with construction, update, and interference tests. For instance, spheres are invariant to rotation and translation, so they do not need to be updated, whereas AABBs are not invariant to rotation; sphere construction and interference tests are also more straightforward than those of OBBs; however, spheres fit less tightly than other bounding volumes. Finally, three comparisons between two polyhedra are presented; seven different algorithms were used, of which five are public libraries for collision detection.
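
    The trade-off between cheap interference tests and tightness described in the abstract above can be sketched as follows. This is an illustrative example only, not code from the reviewed libraries: the `Sphere` and `AABB` classes and the helper names are hypothetical, and the bounding sphere is centred on the centroid for simplicity, so it is not a minimal enclosing sphere.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class Sphere:
    center: Point
    radius: float


@dataclass
class AABB:
    lo: Point
    hi: Point


def bounding_sphere(points: List[Point]) -> Sphere:
    """Centroid-centred enclosing sphere (simple to build, not minimal)."""
    n = len(points)
    center = tuple(sum(p[i] for p in points) / n for i in range(3))
    radius = max(math.dist(center, p) for p in points)
    return Sphere(center, radius)


def bounding_aabb(points: List[Point]) -> AABB:
    """Axis-aligned box spanning the per-axis minimum and maximum."""
    lo = tuple(min(p[i] for p in points) for i in range(3))
    hi = tuple(max(p[i] for p in points) for i in range(3))
    return AABB(lo, hi)


def spheres_overlap(a: Sphere, b: Sphere) -> bool:
    """One distance comparison: centres closer than the sum of the radii."""
    return math.dist(a.center, b.center) <= a.radius + b.radius


def aabbs_overlap(a: AABB, b: AABB) -> bool:
    """Interval overlap on each of the three axes."""
    return all(a.lo[i] <= b.hi[i] and b.lo[i] <= a.hi[i] for i in range(3))


def unit_cube(offset_x: float) -> List[Point]:
    """Corners of a unit cube shifted along the x axis."""
    return [
        (offset_x + x, float(y), float(z))
        for x in (0.0, 1.0)
        for y in (0, 1)
        for z in (0, 1)
    ]


cube_a = unit_cube(0.0)
cube_b = unit_cube(1.5)  # separated from cube_a by a gap of 0.5 along x

# The looser spheres report a possible collision; the tighter boxes do not.
sphere_hit = spheres_overlap(bounding_sphere(cube_a), bounding_sphere(cube_b))
box_hit = aabbs_overlap(bounding_aabb(cube_a), bounding_aabb(cube_b))
```

    For these two separated cubes the sphere test gives a false positive while the AABB test correctly rejects the pair, which is why a hierarchy typically refines a cheap, loose test with tighter volumes at deeper levels.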

    MonoPerfCap: Human Performance Capture from Monocular Video

    We present the first marker-less approach for temporally coherent 3D performance capture of a human with general clothing from monocular video. Our approach reconstructs articulated human skeleton motion as well as medium-scale non-rigid surface deformations in general scenes. Human performance capture is a challenging problem due to the large range of articulation, potentially fast motion, and considerable non-rigid deformations, even from multi-view data. Reconstruction from monocular video alone is drastically more challenging, since strong occlusions and the inherent depth ambiguity lead to a highly ill-posed reconstruction problem. We tackle these challenges by a novel approach that employs sparse 2D and 3D human pose detections from a convolutional neural network using a batch-based pose estimation strategy. Joint recovery of per-batch motion allows us to resolve the ambiguities of the monocular reconstruction problem based on a low-dimensional trajectory subspace. In addition, we propose refinement of the surface geometry based on fully automatically extracted silhouettes to enable medium-scale non-rigid alignment. We demonstrate state-of-the-art performance capture results that enable exciting applications such as video editing and free viewpoint video, previously infeasible from monocular video. Our qualitative and quantitative evaluation demonstrates that our approach significantly outperforms previous monocular methods in terms of accuracy, robustness and scene complexity that can be handled. Comment: Accepted to ACM TOG 2018, to be presented at SIGGRAPH 201