237 research outputs found

    Development of the components of a low cost, distributed facial virtual conferencing system

    Get PDF
    This thesis investigates the development of a low cost, component based facial virtual conferencing system. The design is decomposed into an encoding phase and a decoding phase, which communicate with each other via a network connection. The encoding phase is composed of three components: model acquisition (which handles avatar generation), pose estimation and expression analysis. Audio is not considered part of the encoding and decoding process, and as such is not evaluated. The model acquisition component is implemented using a visual hull reconstruction algorithm that is able to reconstruct real-world objects using only sets of images of the object as input. The object to be reconstructed is assumed to lie in a bounding volume of voxels. The reconstruction process involves the following stages: - Space carving for basic shape extraction; - Isosurface extraction to remove voxels not part of the surface of the reconstruction; - Mesh connection to generate a closed, connected polyhedral mesh; - Texture generation. Texturing is achieved by Gouraud shading the reconstruction with a vertex colour map; - Mesh decimation to simplify the object. The original algorithm has complexity O(n), but suffers from an inability to reconstruct concave surfaces that do not form part of the visual hull of the object. A novel extension to this algorithm based on Normalised Cross Correlation (NCC) is proposed to overcome this problem. An extension to speed up traditional NCC evaluations is proposed which reduces the NCC search space from a 2D search problem down to a single evaluation. Pose estimation and expression analysis are performed by tracking six fiducial points on the face of a subject. A tracking algorithm is developed that uses Normalised Cross Correlation to facilitate robust tracking that is invariant to changing lighting conditions, rotations and scaling. Pose estimation involves the recovery of the head position and orientation through the tracking of the triangle formed by the subject's eyebrows and nose tip. A rule-based evaluation of points that are tracked around the subject's mouth forms the basis of the expression analysis. A user assisted feedback loop and caching mechanism is used to overcome tracking errors due to fast motion or occlusions. The NCC tracker is shown to achieve a tracking performance of 10 fps when tracking the six fiducial points. The decoding phase is divided into 3 tasks, namely: avatar movement, expression generation and expression management. Avatar movement is implemented using the base VR system. Expression generation is facilitated using a Vertex Interpolation Deformation method. A weighting system is proposed for expression management. Its function is to gradually transform from one expression to the next. The use of the vertex interpolation method allows real-time deformations of the avatar representation, achieving 16 fps when applied to a model consisting of 7500 vertices. An Expression Parameter Lookup Table (EPLT) facilitates an independent mapping between the two phases. It defines a list of generic expressions that are known to the system and associates an Expression ID with each one. For each generic expression, it relates the expression analysis rules for any subject with the expression generation parameters for any avatar model. The result is that facial expression replication between any subject and avatar combination can be performed by transferring only the Expression ID from the encoder application to the decoder application. The ideas developed in the thesis are demonstrated in an implementation using the CoRgi Virtual Reality system. It is shown that the virtual-conferencing application based on this design requires only a bandwidth of 2 Kbps.Adobe Acrobat Pro 9.4.6Adobe Acrobat 9.46 Paper Capture Plug-i

    Development of the components of a low cost, distributed facial virtual conferencing system

    Get PDF
    This thesis investigates the development of a low cost, component based facial virtual conferencing system. The design is decomposed into an encoding phase and a decoding phase, which communicate with each other via a network connection. The encoding phase is composed of three components: model acquisition (which handles avatar generation), pose estimation and expression analysis. Audio is not considered part of the encoding and decoding process, and as such is not evaluated. The model acquisition component is implemented using a visual hull reconstruction algorithm that is able to reconstruct real-world objects using only sets of images of the object as input. The object to be reconstructed is assumed to lie in a bounding volume of voxels. The reconstruction process involves the following stages: - Space carving for basic shape extraction; - Isosurface extraction to remove voxels not part of the surface of the reconstruction; - Mesh connection to generate a closed, connected polyhedral mesh; - Texture generation. Texturing is achieved by Gouraud shading the reconstruction with a vertex colour map; - Mesh decimation to simplify the object. The original algorithm has complexity O(n), but suffers from an inability to reconstruct concave surfaces that do not form part of the visual hull of the object. A novel extension to this algorithm based on Normalised Cross Correlation (NCC) is proposed to overcome this problem. An extension to speed up traditional NCC evaluations is proposed which reduces the NCC search space from a 2D search problem down to a single evaluation. Pose estimation and expression analysis are performed by tracking six fiducial points on the face of a subject. A tracking algorithm is developed that uses Normalised Cross Correlation to facilitate robust tracking that is invariant to changing lighting conditions, rotations and scaling. Pose estimation involves the recovery of the head position and orientation through the tracking of the triangle formed by the subject's eyebrows and nose tip. A rule-based evaluation of points that are tracked around the subject's mouth forms the basis of the expression analysis. A user assisted feedback loop and caching mechanism is used to overcome tracking errors due to fast motion or occlusions. The NCC tracker is shown to achieve a tracking performance of 10 fps when tracking the six fiducial points. The decoding phase is divided into 3 tasks, namely: avatar movement, expression generation and expression management. Avatar movement is implemented using the base VR system. Expression generation is facilitated using a Vertex Interpolation Deformation method. A weighting system is proposed for expression management. Its function is to gradually transform from one expression to the next. The use of the vertex interpolation method allows real-time deformations of the avatar representation, achieving 16 fps when applied to a model consisting of 7500 vertices. An Expression Parameter Lookup Table (EPLT) facilitates an independent mapping between the two phases. It defines a list of generic expressions that are known to the system and associates an Expression ID with each one. For each generic expression, it relates the expression analysis rules for any subject with the expression generation parameters for any avatar model. The result is that facial expression replication between any subject and avatar combination can be performed by transferring only the Expression ID from the encoder application to the decoder application. The ideas developed in the thesis are demonstrated in an implementation using the CoRgi Virtual Reality system. It is shown that the virtual-conferencing application based on this design requires only a bandwidth of 2 Kbps.Adobe Acrobat Pro 9.4.6Adobe Acrobat 9.46 Paper Capture Plug-i

    Exploitation of time-of-flight (ToF) cameras

    Get PDF
    This technical report reviews the state-of-the art in the field of ToF cameras, their advantages, their limitations, and their present-day applications sometimes in combination with other sensors. Even though ToF cameras provide neither higher resolution nor larger ambiguity-free range compared to other range map estimation systems, advantages such as registered depth and intensity data at a high frame rate, compact design, low weight and reduced power consumption have motivated their use in numerous areas of research. In robotics, these areas range from mobile robot navigation and map building to vision-based human motion capture and gesture recognition, showing particularly a great potential in object modeling and recognition.Preprin

    Geometric modeling of non-rigid 3D shapes : theory and application to object recognition.

    Get PDF
    One of the major goals of computer vision is the development of flexible and efficient methods for shape representation. This is true, especially for non-rigid 3D shapes where a great variety of shapes are produced as a result of deformations of a non-rigid object. Modeling these non-rigid shapes is a very challenging problem. Being able to analyze the properties of such shapes and describe their behavior is the key issue in research. Also, considering photometric features can play an important role in many shape analysis applications, such as shape matching and correspondence because it contains rich information about the visual appearance of real objects. This new information (contained in photometric features) and its important applications add another, new dimension to the problem\u27s difficulty. Two main approaches have been adopted in the literature for shape modeling for the matching and retrieval problem, local and global approaches. Local matching is performed between sparse points or regions of the shape, while the global shape approaches similarity is measured among entire models. These methods have an underlying assumption that shapes are rigidly transformed. And Most descriptors proposed so far are confined to shape, that is, they analyze only geometric and/or topological properties of 3D models. A shape descriptor or model should be isometry invariant, scale invariant, be able to capture the fine details of the shape, computationally efficient, and have many other good properties. A shape descriptor or model is needed. This shape descriptor should be: able to deal with the non-rigid shape deformation, able to handle the scale variation problem with less sensitivity to noise, able to match shapes related to the same class even if these shapes have missing parts, and able to encode both the photometric, and geometric information in one descriptor. This dissertation will address the problem of 3D non-rigid shape representation and textured 3D non-rigid shapes based on local features. Two approaches will be proposed for non-rigid shape matching and retrieval based on Heat Kernel (HK), and Scale-Invariant Heat Kernel (SI-HK) and one approach for modeling textured 3D non-rigid shapes based on scale-invariant Weighted Heat Kernel Signature (WHKS). For the first approach, the Laplace-Beltrami eigenfunctions is used to detect a small number of critical points on the shape surface. Then a shape descriptor is formed based on the heat kernels at the detected critical points for different scales. Sparse representation is used to reduce the dimensionality of the calculated descriptor. The proposed descriptor is used for classification via the Collaborative Representation-based Classification with a Regularized Least Square (CRC-RLS) algorithm. The experimental results have shown that the proposed descriptor can achieve state-of-the-art results on two benchmark data sets. For the second approach, an improved method to introduce scale-invariance has been also proposed to avoid noise-sensitive operations in the original transformation method. Then a new 3D shape descriptor is formed based on the histograms of the scale-invariant HK for a number of critical points on the shape at different time scales. A Collaborative Classification (CC) scheme is then employed for object classification. The experimental results have shown that the proposed descriptor can achieve high performance on the two benchmark data sets. An important observation from the experiments is that the proposed approach is more able to handle data under several distortion scenarios (noise, shot-noise, scale, and under missing parts) than the well-known approaches. For modeling textured 3D non-rigid shapes, this dissertation introduces, for the first time, a mathematical framework for the diffusion geometry on textured shapes. This dissertation presents an approach for shape matching and retrieval based on a weighted heat kernel signature. It shows how to include photometric information as a weight over the shape manifold, and it also propose a novel formulation for heat diffusion over weighted manifolds. Then this dissertation presents a new discretization method for the weighted heat kernel induced by the linear FEM weights. Finally, the weighted heat kernel signature is used as a shape descriptor. The proposed descriptor encodes both the photometric, and geometric information based on the solution of one equation. Finally, this dissertation proposes an approach for 3D face recognition based on the front contours of heat propagation over the face surface. The front contours are extracted automatically as heat is propagating starting from a detected set of landmarks. The propagation contours are used to successfully discriminate the various faces. The proposed approach is evaluated on the largest publicly available database of 3D facial images and successfully compared to the state-of-the-art approaches in the literature. This work can be extended to the problem of dense correspondence between non-rigid shapes. The proposed approaches with the properties of the Laplace-Beltrami eigenfunction can be utilized for 3D mesh segmentation. Another possible application of the proposed approach is the view point selection for 3D objects by selecting the most informative views that collectively provide the most descriptive presentation of the surface

    Groupwise non-rigid registration for automatic construction of appearance models of the human craniofacial complex for analysis, synthesis and simulation

    Get PDF
    Finally, a novel application of 3D appearance modelling is proposed: a faster than real-time algorithm for statistically constrained quasi-mechanical simulation. Experiments demonstrate superior realism, achieved in the proposed method by employing statistical appearance models to drive the simulation, in comparison with the comparable state-of-the-art quasi-mechanical approaches.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    THREE DIMENSIONAL MODELING AND ANIMATION OF FACIAL EXPRESSIONS

    Get PDF
    Facial expression and animation are important aspects of the 3D environment featuring human characters. These animations are frequently used in many kinds of applications and there have been many efforts to increase the realism. Three aspects are still stimulating active research: the detailed subtle facial expressions, the process of rigging a face, and the transfer of an expression from one person to another. This dissertation focuses on the above three aspects. A system for freely designing and creating detailed, dynamic, and animated facial expressions is developed. The presented pattern functions produce detailed and animated facial expressions. The system produces realistic results with fast performance, and allows users to directly manipulate it and see immediate results. Two unique methods for generating real-time, vivid, and animated tears have been developed and implemented. One method is for generating a teardrop that continually changes its shape as the tear drips down the face. The other is for generating a shedding tear, which is a kind of tear that seamlessly connects with the skin as it flows along the surface of the face, but remains an individual object. The methods both broaden CG and increase the realism of facial expressions. A new method to automatically set the bones on facial/head models to speed up the rigging process of a human face is also developed. To accomplish this, vertices that describe the face/head as well as relationships between each part of the face/head are grouped. The average distance between pairs of vertices is used to place the head bones. To set the bones in the face with multi-density, the mean value of the vertices in a group is measured. The time saved with this method is significant. A novel method to produce realistic expressions and animations by transferring an existing expression to a new facial model is developed. The approach is to transform the source model into the target model, which then has the same topology as the source model. The displacement vectors are calculated. Each vertex in the source model is mapped to the target model. The spatial relationships of each mapped vertex are constrained

    Image processing techniques for mixed reality and biometry

    Get PDF
    2013 - 2014This thesis work is focused on two applicative fields of image processing research, which, for different reasons, have become particularly active in the last decade: Mixed Reality and Biometry. Though the image processing techniques involved in these two research areas are often different, they share the key objective of recognizing salient features typically captured through imaging devices. Enabling technologies for augmented/mixed reality have been improved and refined throughout the last years and more recently they seems to have finally passed the demo stage to becoming ready for practical industrial and commercial applications. To this regard, a crucial role will likely be played by the new generation of smartphones and tablets, equipped with an arsenal of sensors connections and enough processing power for becoming the most portable and affordable AR platform ever. Within this context, techniques like gesture recognition by means of simple, light and robust capturing hardware and advanced computer vision techniques may play an important role in providing a natural and robust way to control software applications and to enhance onthe- field operational capabilities. The research described in this thesis is targeted toward advanced visualization and interaction strategies aimed to improve the operative range and robustness of mixed reality applications, particularly for demanding industrial environments... [edited by Author]XIII n.s

    Active modelling of virtual humans

    Get PDF
    This thesis provides a complete framework that enables the creation of photorealistic 3D human models in real-world environments. The approach allows a non-expert user to use any digital capture device to obtain four images of an individual and create a personalised 3D model, for multimedia applications. To achieve this, it is necessary that the system is automatic and that the reconstruction process is flexible to account for information that is not available or incorrectly captured. In this approach the individual is automatically extracted from the environment using constrained active B-spline templates that are scaled and automatically initialised using only image information. These templates incorporate the energy minimising framework for Active Contour Models, providing a suitable and flexible method to deal with the adjustments in pose an individual can adopt. The final states of the templates describe the individual’s shape. The contours in each view are combined to form a 3D B-spline surface that characterises an individual’s maximal silhouette equivalent. The surface provides a mould that contains sufficient information to allow for the active deformation of an underlying generic human model. This modelling approach is performed using a novel technique that evolves active-meshes to 3D for deforming the underlying human model, while adaptively constraining it to preserve its existing structure. The active-mesh approach incorporates internal constraints that maintain the structural relationship of the vertices of the human model, while external forces deform the model congruous to the 3D surface mould. The strength of the internal constraints can be reduced to allow the model to adopt the exact shape of the bounding volume or strengthened to preserve the internal structure, particularly in areas of high detail. This novel implementation provides a uniform framework that can be simply and automatically applied to the entire human model

    Groupwise non-rigid registration for automatic construction of appearance models of the human craniofacial complex for analysis, synthesis and simulation

    Get PDF
    Finally, a novel application of 3D appearance modelling is proposed: a faster than real-time algorithm for statistically constrained quasi-mechanical simulation. Experiments demonstrate superior realism, achieved in the proposed method by employing statistical appearance models to drive the simulation, in comparison with the comparable state-of-the-art quasi-mechanical approaches
    corecore