282 research outputs found
Editing faces in videos
Editing faces in movies is of interest in the special effects industry. We aim at
producing effects such as adding accessories that interact correctly with
the face, or replacing the face of a stuntman with the face of the main actor.
The system introduced in this thesis is based on a 3D generative face model.
Using a 3D model makes it possible to edit the face in the semantic space of pose,
expression, and identity instead of pixel space, and its 3D nature allows
the light interaction to be modelled. In our system we first reconstruct, in all
frames of a monocular input video, the 3D face (which deforms because of
expressions and speech), the lighting, and the camera. The face is then edited by
substituting expressions or identities with those of another video sequence or by
adding virtual objects into the scene. The manipulated 3D scene is rendered back
into the original video, correctly simulating the interaction of the light with the
deformed face and virtual objects.
We describe all steps necessary to build and apply the system. This includes
registration of training faces to learn a generative face model, semi-automatic
annotation of the input video, fitting of the face model to the input video, editing
of the fit, and rendering of the resulting scene.
While describing the application we introduce a host of new methods, each
of which is of interest on its own. We start with a new method to register 3D
face scans to use as training data for the face model. For video preprocessing a
new interest point tracking and 2D Active Appearance Model fitting technique
is proposed. For robust fitting we introduce background modelling, model-based
stereo techniques, and a more accurate light model.
Modelling and tracking objects with a topology preserving self-organising neural network
Human gestures form an integral part of our everyday communication. We use
gestures not only to reinforce meaning, but also to describe the shape of objects,
to play games, and to communicate in noisy environments. Vision systems that
exploit gestures are often limited by inaccuracies inherent in handcrafted models.
These models are generated from a collection of training examples, which requires
segmentation and alignment. Segmentation in gesture recognition typically involves
manual intervention, a time-consuming process that is feasible only for a
limited set of gestures. Ideally, gesture models should be acquired automatically
via a learning scheme that enables the acquisition of detailed behavioural
knowledge from topological and temporal observation alone.
The research described in this thesis is motivated by a desire to provide a
framework for the unsupervised acquisition and tracking of gesture models. In any
learning framework, the initialisation of the shapes is crucial. Hence, it would
be beneficial to have a robust, noise-tolerant model that can automatically
establish correspondences across the set of shapes. In the first part of this thesis,
we develop a framework for building statistical 2D shape models by extracting,
labelling and corresponding landmark points using only topological relations
derived from competitive Hebbian learning. The method is based on the assumption
that correspondences can be addressed as an unsupervised classification problem where
landmark points are the cluster centres (nodes) in a high-dimensional vector space.
The approach is novel in that the network can be used in cases where the topological
structure of the input pattern is not known a priori; thus, no topology of fixed
dimensionality is imposed onto the network.
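The competitive Hebbian learning rule mentioned above can be sketched in a few lines. This is a minimal illustration of the generic rule (for each input sample, connect the nearest and second-nearest nodes), not the thesis's implementation; the function name `competitive_hebbian_edges` and the toy coordinates are assumptions made for the example.

```python
import math

def competitive_hebbian_edges(nodes, samples):
    """Return the set of edges induced by competitive Hebbian learning.

    For every sample, the nearest and second-nearest nodes are connected.
    Node positions are held fixed here; a Growing Neural Gas would also
    move, insert, and age nodes, which this sketch deliberately omits.
    """
    edges = set()
    for sample in samples:
        # Rank node indices by Euclidean distance to the sample.
        ranked = sorted(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        first, second = ranked[0], ranked[1]
        # Store each undirected edge once, as (smaller index, larger index).
        edges.add((min(first, second), max(first, second)))
    return edges

# Hypothetical toy data: three nodes along a line and one far-away node.
nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 5.0)]
samples = [(0.4, 0.1), (1.6, -0.1), (0.9, 0.2)]
edges = competitive_hebbian_edges(nodes, samples)
print(sorted(edges))
```

Because edges arise only where samples fall, the resulting graph follows the structure of the data, and the far-away node stays unconnected. No topology has to be fixed in advance.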
In the second part, we propose an approach to minimise user intervention
in the adaptation process, which otherwise requires the number of nodes
needed to represent an object to be specified a priori, by utilising an automatic
criterion for maximum node growth. Furthermore, this model is used to represent
motion in image sequences by initialising a suitable segmentation that separates
the object of interest from the background. The segmentation system assumes some
tolerance to illumination changes, input images from ordinary cameras and webcams,
backgrounds with low to medium clutter (extremely cluttered backgrounds are avoided),
and objects at close range from the camera.
In the final part, we extend the framework for the automatic modelling and
unsupervised tracking of 2D hand gestures in a sequence of k frames. The aim
is to use the tracked frames as training examples in order to build the model and
maintain correspondences. To do that, we add an active step to the Growing Neural
Gas (GNG) network, which we call Active Growing Neural Gas (A-GNG), that
takes into consideration not only the geometrical position of the nodes, but also the
underlying local feature structure of the image and the distance vector between
successive images. The quality of our model is measured by calculating
the topographic product, our topology-preserving measure, which quantifies
neighbourhood preservation.
In our system we have applied specific restrictions on the velocity and the
appearance of the gestures to reduce the difficulty of the motion analysis in the
gesture representation. The proposed framework has been validated on applications
related to sign language. The work has great potential in Virtual Reality (VR)
applications, where the learning and representation of gestures becomes natural
without the need for expensive wearable sensors.
Automatic Spatiotemporal Analysis of Cardiac Image Series
SUMMARY
To this day, cardiovascular disease remains the leading cause of
death in North America. In adults and in ever younger populations,
the so-called obesity epidemic, driven by lifestyle habits such as poor
diet, lack of exercise and smoking, has serious consequences for the people
affected and also for the healthcare system. The main cause of morbidity and
mortality in these patients is atherosclerosis, an accumulation of plaque inside
high-pressure blood vessels such as the coronary arteries. Atherosclerotic lesions
can cause ischemia by blocking blood circulation and/or by triggering
a thrombosis. This often leads to serious consequences such as an infarction. Beyond the
problems related to stenosis, the arterial walls of plaque-ridden regions show increased
vascular wall stiffness, which may worsen the patient's condition. In the pediatric
population, the most frequent acquired cardiovascular pathology is Kawasaki
disease. It is an acute vasculitis that can affect the structural integrity of the
coronary artery walls and lead to the formation of aneurysms. In some cases, these hinder
arterial hemodynamics by causing insufficient myocardial perfusion and by activating
the formation of thromboses.
These two coronary diseases are traditionally diagnosed using
fluoroscopic angiography. During these paraclinical examinations, several hundred
radiographic projections are acquired in series following the arterial infusion of a
contrast agent. These images reveal the lumen of the blood vessels and the presence of
potentially pathological lesions, if any. Because the acquired series contain highly
dynamic information on voluntary and involuntary patient movement (e.g.
heartbeat, respiration and organ displacement), the clinician generally bases
the interpretation on a single angiographic image on which geometrical measurements are
performed manually or semi-automatically by a radiology technician. Although
fluoroscopic angiography is frequently used around the world and often
considered the "gold-standard" diagnostic tool for many vascular diseases,
the two-dimensional nature of this imaging modality is unfortunately very
limiting in terms of specifying the geometry of the various pathological regions. Indeed,
the three-dimensional structure of stenoses and aneurysms cannot be fully
appreciated in 2D because the observed features vary with the angular configuration of
the imager. Moreover, the presence of lesions affecting the coronary arteries may not reflect
the true health of the myocardium, because natural compensatory mechanisms (e.g. vessels …
ABSTRACT
Cardiovascular disease continues to be the leading cause of death in North America. In adult
and, alarmingly, ever younger populations, the so-called obesity epidemic largely driven by
lifestyle factors that include poor diet, lack of exercise and smoking, incurs enormous stresses
on the healthcare system. The primary cause of serious morbidity and mortality for these
patients is atherosclerosis, the build up of plaque inside high pressure vessels like the coronary
arteries. These lesions can lead to ischemic disease and may progress to precarious blood
flow blockage or thrombosis, often with infarction or other severe consequences. Besides
the stenosis-related outcomes, the arterial walls of plaque-ridden regions manifest increased
stiffness, which may exacerbate negative patient prognosis. In pediatric populations, the
most prevalent acquired cardiovascular pathology is Kawasaki disease. This acute vasculitis
may affect the structural integrity of coronary artery walls and progress to aneurysmal lesions.
These can hinder the blood flow’s hemodynamics, leading to inadequate downstream
perfusion, and may activate thrombus formation which may lead to precarious prognosis.
Diagnosing these two prominent coronary artery diseases is traditionally performed using
fluoroscopic angiography. Several hundred serial x-ray projections are acquired during selective
arterial infusion of a radiodense contrast agent, which reveals the vessels’ luminal
area and possible pathological lesions. The acquired series contain highly dynamic information
on voluntary and involuntary patient movement: respiration, organ displacement and
heartbeat, for example. Current clinical analysis is largely limited to a single angiographic
image where geometrical measures are performed manually or semi-automatically by a
radiological technician. Although widely used around the world and generally considered
the gold-standard diagnosis tool for many vascular diseases, the two-dimensional nature of
this imaging modality is limiting in terms of specifying the geometry of various pathological
regions. Indeed, the 3D structures of stenotic or aneurysmal lesions may not be fully appreciated
in 2D because their observable features are dependent on the angular configuration of
the imaging gantry. Furthermore, the presence of lesions in the coronary arteries may not
reflect the true health of the myocardium, as natural compensatory mechanisms may obviate
the need for further intervention. In light of this, cardiac magnetic resonance perfusion
imaging is increasingly gaining attention and clinical implementation, as it offers a direct
assessment of myocardial tissue viability following infarction or suspected coronary artery
disease. This type of modality is plagued, however, by motion similar to that present in fluoroscopic
imaging. This issue predisposes clinicians to laborious manual intervention in order
to align anatomical structures in sequential perfusion frames, thus hindering automation o
HIGH QUALITY HUMAN 3D BODY MODELING, TRACKING AND APPLICATION
Geometric reconstruction of dynamic objects is a fundamental task in computer vision and graphics, and high-fidelity modeling of the human body is considered a core part of this problem. Traditional human shape and motion capture techniques require an array of surrounding cameras or require subjects to wear reflective markers, which limits working space and portability. In this dissertation, a complete process is designed, from geometric modeling of a detailed 3D human full body and capturing shape dynamics over time using a flexible setup, to guiding clothes/person re-targeting with such data-driven models. The mechanical movement of the human body can be considered an articulated motion, which makes it easy to drive the skin animation but difficult to solve the reverse problem of finding parameters from images without manual intervention. We therefore present a novel parametric model, GMM-BlendSCAPE, which jointly takes into consideration both the linear skinning model and the prior art of BlendSCAPE (Blend Shape Completion and Animation for PEople), and develop a Gaussian Mixture Model (GMM) to infer both body shape and pose from incomplete observations. We show that our model increases the accuracy of joint and skin surface estimation compared to skeleton-based motion tracking. To model the detailed body, we start by capturing high-quality partial 3D scans using a single-view commercial depth camera. Based on GMM-BlendSCAPE, we can then reconstruct multiple complete static models with large pose differences via our novel non-rigid registration algorithm. With vertex correspondences established, these models can be further converted into a personalized drivable template and used for robust pose tracking in a similar GMM framework. Moreover, we design a general-purpose real-time non-rigid deformation algorithm to accelerate this registration.
Last but not least, we demonstrate a novel virtual clothes try-on application based on our personalized model, utilizing both image and depth cues to synthesize and re-target clothes for single-view videos of different people.
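The "linear skinning model" referred to in this abstract is assumed here to be standard linear blend skinning, in which each skin vertex is a weighted average of its positions under several bone transforms. The 2D sketch below illustrates only that generic idea; it is not GMM-BlendSCAPE, and the helper names and example values are hypothetical.

```python
import math

def rigid_transform(angle_rad, tx=0.0, ty=0.0):
    """Build a 2x3 rigid transform (rotation about the origin, then translation)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((c, -s, tx), (s, c, ty))

def apply_transform(transform, point):
    """Apply a 2x3 transform to a 2D point."""
    (a, b, tx), (c, d, ty) = transform
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)

def linear_blend_skinning(vertex, weights, bone_transforms):
    """Blend the vertex positions produced by each bone, using the skinning weights."""
    x = y = 0.0
    for weight, transform in zip(weights, bone_transforms):
        px, py = apply_transform(transform, vertex)
        x += weight * px
        y += weight * py
    return (x, y)

# Hypothetical example: a vertex influenced equally by a resting bone
# and a bone rotated by 90 degrees about the origin.
bones = [rigid_transform(0.0), rigid_transform(math.pi / 2)]
skinned = linear_blend_skinning((1.0, 0.0), [0.5, 0.5], bones)
print(skinned)
```

Going forward (pose to skin) is a direct evaluation, as the abstract notes. Recovering the bone transforms from observed points is the harder inverse problem that the dissertation addresses.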
The Probabilistic Active Shape Model: From Model Construction to Flexible Medical Image Segmentation
Automatic processing of three-dimensional image data acquired with computed tomography or magnetic resonance imaging plays an increasingly important role in medicine. For example, the automatic
segmentation of anatomical structures in tomographic images makes it possible to generate three-dimensional visualizations of a patient’s anatomy and thereby supports surgeons during the planning of various kinds of
surgeries.
Because organs in medical images often exhibit a low contrast to adjacent structures, and because the image quality may be hampered by noise or other image acquisition artifacts, the development of segmentation algorithms that are both robust and accurate is very challenging. In order to increase the robustness, the use of model-based algorithms is mandatory, for example algorithms that incorporate prior knowledge about an organ’s shape into the segmentation process. Recent research has shown that Statistical Shape Models are especially appropriate for robust medical image segmentation. In these models, the typical shape of an organ is learned from a set of training examples. However, Statistical Shape Models have two major disadvantages: the construction of the models is relatively difficult, and the models are often used too restrictively, such that the resulting segmentation does not delineate the organ exactly.
This thesis addresses both problems: The first part of the thesis introduces new methods for establishing correspondence between training shapes, which is a necessary prerequisite for shape model learning. The developed methods include consistent parameterization algorithms for organs with spherical and genus 1 topology, as well as a nonrigid mesh registration algorithm for shapes with arbitrary topology. The second part of the thesis presents a new shape model-based segmentation algorithm that allows for an accurate delineation of organs. In contrast to existing approaches, it is possible to integrate not only linear shape models into the algorithm, but also nonlinear shape models, which allow for a more specific description of an organ’s shape variation.
The proposed segmentation algorithm is evaluated in three applications to medical image data: liver and vertebra segmentation in contrast-enhanced computed tomography scans, and prostate segmentation in magnetic resonance images.
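To make the remark about overly restrictive use concrete, the sketch below shows a linear statistical shape model in its common textbook form: a shape is the mean plus a weighted sum of variation modes, and each weight is hard-clamped to a multiple of that mode's standard deviation. This is an illustrative assumption about how such models are typically constrained, not the probabilistic algorithm proposed in the thesis; the function name, the `limit` default of 3 and the toy numbers are hypothetical.

```python
import math

def synthesize_shape(mean_shape, modes, variances, b, limit=3.0):
    """Generate a shape x = mean + sum_i b_i * mode_i from a linear shape model.

    Each parameter b_i is clamped to [-limit * sqrt(variance_i), +limit * sqrt(variance_i)],
    so shapes far from the training distribution cannot be represented.
    Returns the shape vector and the clamped parameters.
    """
    clamped = []
    for b_i, variance_i in zip(b, variances):
        bound = limit * math.sqrt(variance_i)
        clamped.append(max(-bound, min(bound, b_i)))
    shape = list(mean_shape)
    for b_i, mode in zip(clamped, modes):
        for k, component in enumerate(mode):
            shape[k] += b_i * component
    return shape, clamped

# Hypothetical model: two 2D landmarks flattened as [x0, y0, x1, y1],
# with one mode that moves the first landmark vertically.
mean_shape = [0.0, 0.0, 1.0, 0.0]
modes = [[0.0, 1.0, 0.0, 0.0]]
variances = [4.0]
shape, clamped = synthesize_shape(mean_shape, modes, variances, b=[10.0])
print(shape, clamped)
```

The requested parameter of 10 is cut back to 6, so an organ boundary lying beyond that limit could not be delineated exactly by this model alone.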
Fast Elastic Registration of Soft Tissues under Large Deformations
A fast and accurate fusion of intra-operative images with pre-operative data is a key component of computer-aided interventions, which aim at improving the outcomes of the intervention while reducing the patient's discomfort. In this paper, we focus on the problem of intra-operative navigation during abdominal surgery, which requires an accurate registration of tissues undergoing large deformations. Such a scenario occurs in the case of partial hepatectomy: to facilitate access to the pathology, e.g. a tumor located in the posterior part of the right lobe, the surgery is performed on a patient in lateral position. Due to the change in the patient's position, the resection plan based on the pre-operative CT scan acquired in the supine position must be updated to account for the deformations. We suppose that an imaging modality, such as cone-beam CT, provides information about the intra-operative shape of an organ; however, due to the reduced radiation dose and contrast, the actual locations of the internal structures necessary to update the planning are not available. To this end, we propose a method allowing for fast registration of the pre-operative data, represented by a detailed 3D model of the liver and its internal structure, to the actual configuration given by the organ surface extracted from the intra-operative image. The algorithm behind the method combines the iterative closest point technique with a biomechanical model based on a co-rotational formulation of linear elasticity, which accounts for large deformations of the tissue. The performance, robustness and accuracy of the method are quantitatively assessed on a control semi-synthetic dataset with known ground truth and a real dataset composed of nine pairs of abdominal CT scans acquired in supine and flank positions.
It is shown that the proposed surface-matching method is capable of reducing the target registration error evaluated on the internal structures of the organ from more than 40 mm to less than 10 mm. Moreover, the control data is used to demonstrate the compatibility of the method with the intra-operative clinical scenario, while the real datasets are utilized to study the impact of parametrization on the accuracy of the method. The method is also compared to a state-of-the-art intensity-based registration technique in terms of accuracy and performance.
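The target registration error quoted above is commonly computed as the distance between corresponding target points (here, internal structures) after registration and their known true positions. The sketch below assumes the mean Euclidean distance as the aggregate; the abstract does not state whether the mean, maximum, or another statistic is used, and the coordinates are made-up values, not data from the paper.

```python
import math

def target_registration_error(registered_targets, true_targets):
    """Mean Euclidean distance between corresponding target points.

    Both arguments are equal-length sequences of 3D points; the result is
    in the same unit as the input coordinates (e.g. millimetres).
    """
    if len(registered_targets) != len(true_targets) or not true_targets:
        raise ValueError("point lists must be non-empty and of equal length")
    distances = [math.dist(p, q) for p, q in zip(registered_targets, true_targets)]
    return sum(distances) / len(distances)

# Hypothetical target positions in millimetres.
true_targets = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
registered_targets = [(3.0, 4.0, 0.0), (10.0, 0.0, 12.0)]
tre = target_registration_error(registered_targets, true_targets)
print(tre)
```

Because the targets are internal points that the surface-matching step never sees, this measure indicates how well the deformation model carries the surface fit into the interior of the organ.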
Doctor of Philosophy
Shape analysis is a well-established tool for processing surfaces. It is often a first step in performing tasks such as segmentation, symmetry detection, and finding correspondences between shapes. Shape analysis is traditionally employed on well-sampled surfaces where the geometry and topology are precisely known. When the surface takes the form of a point cloud containing nonuniform sampling, noise, and incomplete measurements, traditional shape analysis methods perform poorly. Although one may first perform reconstruction on such a point cloud prior to performing shape analysis, if the reconstructed geometry and topology are far from the true surface, this can have an adverse impact on the subsequent analysis. Furthermore, for triangulated surfaces containing noise, thin sheets, and poorly shaped triangles, existing shape analysis methods can be highly unstable. This thesis explores methods of shape analysis applied directly to such defect-laden shapes. We first study the problem of surface reconstruction, in order to obtain a better understanding of the types of point clouds for which reconstruction methods encounter difficulties. To this end, we have devised a benchmark for surface reconstruction, establishing a standard for measuring error in reconstruction. We then develop a new method for consistently orienting normals of such challenging point clouds by using a collection of harmonic functions, intrinsically defined on the point cloud. Next, we develop a new shape analysis tool which is tolerant to imperfections, by constructing distances directly on the point cloud, defined as the likelihood of two points belonging to a mutually common medial ball, and apply this for segmentation and reconstruction. We extend this distance measure to define a diffusion process on the point cloud, tolerant to missing data, which is used for the purposes of matching incomplete shapes undergoing a nonrigid deformation.
Lastly, we have developed an intrinsic method for multiresolution remeshing of a poor-quality triangulated surface via spectral bisection.