
    Editing faces in videos

    Editing faces in movies is of interest in the special effects industry. We aim to produce effects such as adding accessories that interact correctly with the face, or replacing the face of a stuntman with the face of the main actor. The system introduced in this thesis is based on a 3D generative face model. Using a 3D model makes it possible to edit the face in the semantic space of pose, expression, and identity instead of pixel space, and its 3D nature allows the interaction of light with the face to be modelled. In our system we first reconstruct, in all frames of a monocular input video, the 3D face (which deforms because of expressions and speech), the lighting, and the camera. The face is then edited by substituting expressions or identities with those of another video sequence or by adding virtual objects into the scene. The manipulated 3D scene is rendered back into the original video, correctly simulating the interaction of the light with the deformed face and virtual objects. We describe all steps necessary to build and apply the system. This includes registration of training faces to learn a generative face model, semi-automatic annotation of the input video, fitting of the face model to the input video, editing of the fit, and rendering of the resulting scene. While describing the application we introduce a host of new methods, each of which is of interest on its own. We start with a new method to register 3D face scans to use as training data for the face model. For video preprocessing, a new interest point tracking and 2D Active Appearance Model fitting technique is proposed. For robust fitting we introduce background modelling, model-based stereo techniques, and a more accurate light model.

    Kinetic depth effect and identification of shape.


    Modelling and tracking objects with a topology preserving self-organising neural network

    Human gestures form an integral part of our everyday communication. We use gestures not only to reinforce meaning, but also to describe the shape of objects, to play games, and to communicate in noisy environments. Vision systems that exploit gestures are often limited by inaccuracies inherent in handcrafted models. These models are generated from a collection of training examples, which requires segmentation and alignment. Segmentation in gesture recognition typically involves manual intervention, a time-consuming process that is feasible only for a limited set of gestures. Ideally, gesture models should be automatically acquired via a learning scheme that enables the acquisition of detailed behavioural knowledge only from topological and temporal observation. The research described in this thesis is motivated by a desire to provide a framework for the unsupervised acquisition and tracking of gesture models. In any learning framework, the initialisation of the shapes is crucial. Hence, it would be beneficial to have a robust model, not prone to noise, that can automatically establish correspondences across the set of shapes. In the first part of this thesis, we develop a framework for building statistical 2D shape models by extracting, labelling and corresponding landmark points using only topological relations derived from competitive Hebbian learning. The method is based on the assumption that correspondences can be addressed as an unsupervised classification problem where landmark points are the cluster centres (nodes) in a high-dimensional vector space. The approach is novel in that the network can be used in cases where the topological structure of the input pattern is not known a priori; thus, no topology of fixed dimensionality is imposed onto the network.
In the second part, we propose an approach to minimise user intervention in the adaptation process, which otherwise requires specifying a priori the number of nodes needed to represent an object, by utilising an automatic criterion for maximum node growth. Furthermore, this model is used to represent motion in image sequences by initialising a suitable segmentation that separates the object of interest from the background. The segmentation system assumes some tolerance to illumination changes, input images from ordinary cameras and webcams, backgrounds with low to medium clutter (extremely cluttered backgrounds are avoided), and objects at close range to the camera. In the final part, we extend the framework for the automatic modelling and unsupervised tracking of 2D hand gestures in a sequence of k frames. The aim is to use the tracked frames as training examples in order to build the model and maintain correspondences. To do that we add an active step to the Growing Neural Gas (GNG) network, which we call Active Growing Neural Gas (A-GNG), that takes into consideration not only the geometrical position of the nodes, but also the underlying local feature structure of the image, and the distance vector between successive images. The quality of our model is measured through the calculation of the topographic product. The topographic product is our topology-preserving measure, which quantifies the neighbourhood preservation. In our system we have applied specific restrictions on the velocity and the appearance of the gestures to reduce the difficulty of the motion analysis in the gesture representation. The proposed framework has been validated on applications related to sign language. The work has great potential in Virtual Reality (VR) applications where the learning and the representation of gestures become natural without the need for expensive wearable sensors.
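The competitive Hebbian learning step mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name, the toy node positions, and the samples are all hypothetical, and the sketch shows only the generic rule (connect the nearest and second-nearest node for each input sample), without the node growth or the active step of A-GNG.

```python
import math


def competitive_hebbian_edges(nodes, samples):
    """Return the edges induced by competitive Hebbian learning.

    For every sample, the nearest and second-nearest nodes are found
    and an undirected edge between them is recorded.
    """
    edges = set()
    for sample in samples:
        # Rank node indices by Euclidean distance to the sample.
        ranked = sorted(range(len(nodes)),
                        key=lambda i: math.dist(nodes[i], sample))
        winner, runner_up = ranked[0], ranked[1]
        # Store each edge with the smaller index first so it is undirected.
        edges.add((min(winner, runner_up), max(winner, runner_up)))
    return edges


# Hypothetical toy data: four nodes on a line, samples between neighbours.
nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
samples = [(0.4, 0.1), (1.6, -0.1), (2.4, 0.0)]
edges = competitive_hebbian_edges(nodes, samples)
```

With these toy inputs the recovered edges link only neighbouring nodes, which is the sense in which the learned graph reflects the topology of the input rather than one imposed in advance.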

    Robust Temporally Coherent Laplacian Protrusion Segmentation of 3D Articulated Bodies

    In motion analysis and understanding it is important to be able to fit a suitable model or structure to the temporal series of observed data, in order to describe motion patterns in a compact way, and to discriminate between them. In an unsupervised context, i.e., when no prior model of the moving object(s) is available, such a structure has to be learned from the data in a bottom-up fashion. In recent times, volumetric approaches, in which the motion is captured from a number of cameras and a voxel-set representation of the body is built from the camera views, have gained ground due to attractive features such as inherent view-invariance and robustness to occlusions. Automatic, unsupervised segmentation of moving bodies along entire sequences, in a temporally coherent and robust way, has the potential to provide a means of constructing a bottom-up model of the moving body, and of tracking motion cues that may later be exploited for motion classification. Spectral methods such as locally linear embedding (LLE) can be useful in this context, as they preserve "protrusions", i.e., high-curvature regions of the 3D volume, of articulated shapes, while improving their separation in a lower-dimensional space, making them in this way easier to cluster. In this paper we therefore propose a spectral approach to unsupervised and temporally coherent body-protrusion segmentation along time sequences. Volumetric shapes are clustered in an embedding space, clusters are propagated in time to ensure coherence, and merged or split to accommodate changes in the body's topology. Experiments on both synthetic and real sequences of dense voxel-set data are shown. These support the ability of the proposed method to cluster body parts consistently over time in a totally unsupervised fashion, its robustness to sampling density and shape quality, and its potential for bottom-up model construction. Comment: 31 pages, 26 figures

    Markerless deformation capture of hoverfly wings using multiple calibrated cameras

    This thesis introduces an algorithm for the automated deformation capture of hoverfly wings from multiple camera image sequences. The algorithm is capable of extracting dense surface measurements, without the aid of fiducial markers, over an arbitrary number of wingbeats of hovering flight and requires limited manual initialisation. A novel motion prediction method, called the ‘normalised stroke model’, makes use of the similarity of adjacent wing strokes to predict wing keypoint locations, which are then iteratively refined in a stereo image registration procedure. Outlier removal, wing fitting and further refinement using independently reconstructed boundary points complete the algorithm. It was tested on two hovering data sets, as well as a challenging flight manoeuvre. By comparing the 3-d positions of keypoints extracted from these surfaces with those resulting from manual identification, the accuracy of the algorithm is shown to approach that of a fully manual approach. In particular, half of the algorithm-extracted keypoints were within 0.17mm of manually identified keypoints, approximately equal to the error of the manual identification process. This algorithm is unique among purely image based flapping flight studies in the level of automation it achieves, and its generality would make it applicable to wing tracking of other insects
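The statement that half of the algorithm-extracted keypoints lie within 0.17mm of the manual ones is a statement about the median 3-d distance between paired keypoints. The sketch below shows one way such a figure could be computed. It is an illustration only: the function name and the coordinates are hypothetical and are not data from the thesis.

```python
import math
import statistics


def median_keypoint_error(auto_pts, manual_pts):
    """Median Euclidean distance between paired 3-d keypoints."""
    distances = [math.dist(a, m) for a, m in zip(auto_pts, manual_pts)]
    return statistics.median(distances)


# Hypothetical paired keypoints in millimetres (not data from the thesis).
auto_pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
manual_pts = [(0.1, 0.0, 0.0), (1.0, 0.2, 0.0), (0.0, 2.0, 0.3)]
err = median_keypoint_error(auto_pts, manual_pts)
```

By definition of the median, half of the paired distances are at or below the returned value, which is how the 0.17mm figure in the abstract can be read.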

    A java framework for object detection and tracking, 2007

    Object detection and tracking is an important problem in the automated analysis of video. There have been numerous approaches and technological advances for object detection and tracking in video analysis. Because this is one of the most challenging and active research areas, more algorithms will be proposed in the future. As a consequence, there will be demand for a system that can effectively collect, organize, group, document and implement these approaches. The purpose of this thesis is to develop one uniform object detection and tracking framework, capable of detecting and tracking multiple objects in the presence of occlusion. The object detection and tracking algorithms are classified into different categories and incorporated into the framework, which is implemented in Java. The framework can adapt to different types and different application domains, and is easy and convenient for developers to reuse. It also provides comprehensive descriptions of representative methods in each category and some examples, with the aim of giving developers or users who require a tracker for a certain application the ability to select the most suitable tracking algorithm for their particular needs.

    Analysis of contrast-enhanced medical images.

    Early detection of human organ diseases is of great importance for accurate diagnosis and the institution of appropriate therapies. This can potentially prevent progression to end-stage disease by detecting precursors that evaluate organ functionality. In addition, it assists clinicians in therapy evaluation, tracking disease progression, and surgical operations. Advances in functional and contrast-enhanced (CE) medical images have enabled accurate noninvasive evaluation of organ functionality due to their ability to provide superior anatomical and functional information about the tissue-of-interest. The main objective of this dissertation is to develop a computer-aided diagnostic (CAD) system for analyzing complex data from CE magnetic resonance imaging (MRI). The developed CAD system has been tested in three case studies: (i) early detection of acute renal transplant rejection; (ii) evaluation of myocardial perfusion in patients with ischemic heart disease after heart attack; and (iii) early detection of prostate cancer. However, developing a noninvasive CAD system for the analysis of CE medical images is subject to multiple challenges, including, but not limited to, image noise and inhomogeneity, nonlinear signal intensity changes of the images over the time course of data acquisition, appearance and shape changes (deformations) of the organ-of-interest during data acquisition, and determination of the best features (indexes) that describe the perfusion of a contrast agent (CA) into the tissue.
To address these challenges, this dissertation focuses on building new mathematical models and learning techniques that facilitate accurate analysis of CA perfusion in living organs. These include: (i) accurate mathematical models for the segmentation of the object-of-interest, which integrate object shape and appearance features in terms of pixel/voxel-wise image intensities and their spatial interactions; (ii) motion correction techniques that combine both global and local models, which exploit geometric features, rather than image intensities, to avoid problems associated with nonlinear intensity variations of the CE images; and (iii) fusion of multiple features using the genetic algorithm. The proposed techniques have been integrated into CAD systems that have been tested in several clinical studies, including the following three. First, a noninvasive CAD system is proposed for the early and accurate diagnosis of acute renal transplant rejection using dynamic contrast-enhanced MRI (DCE-MRI). Acute rejection, the immunological response of the human immune system to a foreign kidney, is the most severe cause of renal dysfunction among other diagnostic possibilities, including acute tubular necrosis and immune drug toxicity. In the U.S., approximately 17,736 renal transplants are performed annually, and given the limited number of donors, transplanted kidney salvage is an important medical concern. Thus far, biopsy remains the gold standard for the assessment of renal transplant dysfunction, but only as the last resort because of its invasive nature, high cost, and potential morbidity rates. The diagnostic results of the proposed CAD system, based on the analysis of 50 independent in-vivo cases, were 96% with a 95% confidence interval. These results clearly demonstrate the promise of the proposed image-based diagnostic CAD system as a supplement to the current technologies, such as nuclear imaging and ultrasonography, to determine the type of kidney dysfunction.
Second, a comprehensive CAD system is developed for the characterization of myocardial perfusion and clinical status in heart failure and novel myoregeneration therapy using cardiac first-pass MRI (FP-MRI). Heart failure is considered the most important cause of morbidity and mortality in cardiovascular disease, which affects approximately 6 million U.S. patients annually. Ischemic heart disease is considered the most common underlying cause of heart failure. Therefore, the detection of heart failure in its earliest forms is essential to prevent its relentless progression to premature death. While current medical studies focus on detecting pathological tissue and assessing contractile function of the diseased heart, this dissertation addresses the key issue of the effects of the myoregeneration therapy on the associated blood nutrient supply. Quantitative and qualitative assessment in a cohort of 24 perfusion data sets demonstrated the ability of the proposed framework to reveal regional perfusion improvements with therapy, and transmural perfusion differences across the myocardial wall; thus, it can aid in follow-up on treatment for patients undergoing the myoregeneration therapy. Finally, an image-based CAD system for early detection of prostate cancer using DCE-MRI is introduced. Prostate cancer is the most frequently diagnosed malignancy among men and remains the second leading cause of cancer-related death in the USA, with more than 238,000 new cases and a mortality rate of about 30,000 in 2013. Therefore, early diagnosis of prostate cancer can improve the effectiveness of treatment and increase the patient’s chance of survival. Currently, needle biopsy is the gold standard for the diagnosis of prostate cancer. However, it is an invasive procedure with high costs and potential morbidity rates. Additionally, it has a higher possibility of producing false positive diagnosis due to relatively small needle biopsy samples.
Application of the proposed CAD system yields promising results in a cohort of 30 patients and would, in the near future, represent a supplement to the current technologies to determine prostate cancer type. The developed techniques have been compared to the state-of-the-art methods and demonstrated higher accuracy, as shown in this dissertation. The proposed models (higher-order spatial interaction models, shape models, motion correction models, and perfusion analysis models) can be used in many of today’s CAD applications for early detection of a variety of diseases and medical conditions, and are expected to notably improve the accuracy of CAD decisions based on the automated analysis of CE images.