1,246 research outputs found

    Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion

    Selecting a Small Set of Optimal Gestures from an Extensive Lexicon

    Finding the best set of gestures to use for a given computer recognition problem is an essential part of optimizing the recognition performance while being mindful of those who may articulate the gestures. An objective function, called the ellipsoidal distance ratio metric (EDRM), for determining the best gestures from a larger lexicon library is presented, along with a numerical method for incorporating subjective preferences. In particular, we demonstrate an efficient algorithm that chooses the best $n$ gestures from a lexicon of $m$ gestures, where typically $n \ll m$, using a weighting of both subjective and objective measures. Comment: 27 pages, 7 figures

    A machine learning approach to statistical shape models with applications to medical image analysis

    Statistical shape models have become an indispensable tool for image analysis. The use of shape models is especially popular in computer vision and medical image analysis, where they were incorporated as a prior into a wide range of different algorithms. In spite of their big success, the study of statistical shape models has not received much attention in recent years. Shape models are often seen as an isolated technique, which merely consists of applying Principal Component Analysis to a set of example data sets. In this thesis we revisit statistical shape models and discuss their construction and applications from the perspective of machine learning and kernel methods. The shapes that belong to an object class are modeled as a Gaussian Process whose parameters are estimated from example data. This formulation puts statistical shape models in a much wider context and makes the powerful inference tools from learning theory applicable to shape modeling. Furthermore, the formulation is continuous and thus helps to avoid discretization issues, which often arise with discrete models. An important step in building statistical shape models is to establish surface correspondence. We discuss an approach which is based on kernel methods. This formulation allows us to integrate the statistical shape model as an additional prior. It thus unifies the methods of registration and shape model fitting. Using Gaussian Process regression we can integrate shape constraints in our model. These constraints can be used to enforce landmark matching in the fitting or correspondence problem. The same technique also leads directly to a new solution for shape reconstruction from partial data. In addition to experiments on synthetic 2D data sets, we show the applicability of our methods on real 3D medical data of the human head. In particular, we build a 3D model of the human skull, and present its applications for the planning of cranio-facial surgeries
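As a rough illustration of how landmark constraints can enter a Gaussian Process model through regression, the sketch below conditions a zero-mean GP over a 1D deformation field on a few observed landmark displacements. It is a toy under stated assumptions, not the thesis's implementation: the squared-exponential kernel, the 1D domain, and the names `rbf`, `solve`, and `gp_posterior_mean` are all hypothetical choices made for this example.

```python
import math

def rbf(x, y, length=1.0, var=1.0):
    # Squared-exponential kernel (an assumed choice for this sketch).
    return var * math.exp(-0.5 * ((x - y) / length) ** 2)

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def gp_posterior_mean(xs_obs, ys_obs, xs_query, noise=1e-6):
    # Posterior mean of a zero-mean GP: k(q, X) (K + noise * I)^{-1} y.
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs_obs)]
         for i, a in enumerate(xs_obs)]
    alpha = solve(K, ys_obs)
    return [sum(rbf(q, a) * w for a, w in zip(xs_obs, alpha))
            for q in xs_query]

# Hypothetical landmark positions and their observed displacements.
landmarks = [0.0, 1.0, 2.5]
displacements = [0.2, -0.1, 0.4]
at_landmarks = gp_posterior_mean(landmarks, displacements, landmarks)
far_away = gp_posterior_mean(landmarks, displacements, [100.0])
```

With a small noise term the posterior mean passes almost exactly through the landmark displacements, and far from any landmark it falls back to the prior mean of zero.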

    Reconstruction of three-dimensional facial geometric features related to fetal alcohol syndrome using adult surrogates

    Fetal alcohol syndrome (FAS) is a condition caused by prenatal alcohol exposure. The diagnosis of FAS is based on the presence of central nervous system impairments, evidence of growth abnormalities and abnormal facial features. Direct anthropometry has traditionally been used to obtain facial data to assess the FAS facial features. Research efforts have focused on indirect anthropometry such as 3D surface imaging systems to collect facial data for facial analysis. However, 3D surface imaging systems are costly. As an alternative, approaches for 3D reconstruction from a single 2D image of the face using a 3D morphable model (3DMM) were explored in this research study. The research project was accomplished in several steps. 3D facial data were obtained from the publicly available BU-3DFE database, developed by the State University of New York. The 3D face scans in the training set were landmarked by different observers. The reliability and precision in selecting 3D landmarks were evaluated. The intraclass correlation coefficients for intra- and inter-observer reliability were greater than 0.95. The average intra-observer error was 0.26 mm and the average inter-observer error was 0.89 mm. A rigid registration was performed on the 3D face scans in the training set. Following rigid registration, a dense point-to-point correspondence across a set of aligned face scans was computed using the Gaussian process model fitting approach. A 3DMM of the face was constructed from the fully registered 3D face scans. The constructed 3DMM of the face was evaluated based on generalization, specificity, and compactness. The quantitative evaluations show that the constructed 3DMM achieves reliable results. 3D face reconstructions from single 2D images were estimated based on the 3DMM. The Metropolis-Hastings algorithm was used to fit the 3DMM features to 2D image features to generate the 3D face reconstruction. 
Finally, the geometric accuracy of the reconstructed 3D faces was evaluated based on ground-truth 3D face scans. The average root mean square error for the surface-to-surface comparisons between the reconstructed faces and the ground-truth face scans was 2.99 mm. In conclusion, a framework to estimate 3D face reconstructions from single 2D facial images was developed and the reconstruction errors were evaluated. The geometric accuracy of the 3D face reconstructions was comparable to that found in the literature. However, future work should consider minimizing reconstruction errors to acceptable clinical standards in order for the framework to be useful for 3D-from-2D reconstruction in general, and also for developing FAS applications. Finally, future work should consider estimating a 3D face using multi-view 2D images to increase the information available for 3D-from-2D reconstruction
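For readers unfamiliar with the Metropolis-Hastings algorithm named in this abstract, here is a minimal random-walk sketch on a 1D toy target. It does not reproduce the 3DMM fitting pipeline: the standard normal target, the Gaussian proposal, the step size, and the function names are assumptions chosen only to show the accept/reject step.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = x0
    logp = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        logp_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)).
        # 1.0 - rng.random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < logp_new - logp:
            x, logp = proposal, logp_new
        samples.append(x)
    return samples

def log_standard_normal(x):
    # Unnormalized log density of N(0, 1); stands in for a real posterior.
    return -0.5 * x * x

samples = metropolis_hastings(log_standard_normal, x0=3.0, n_samples=20000)
kept = samples[2000:]  # discard burn-in
mean = sum(kept) / len(kept)
var = sum((s - mean) ** 2 for s in kept) / len(kept)
```

In a model-fitting setting the state would be the vector of model parameters and `log_target` would score how well the rendered model matches the image features. The chain only needs the target up to a normalizing constant.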

    Biometrics

    Biometrics - Unique and Diverse Applications in Nature, Science, and Technology provides a unique sampling of the diverse ways in which biometrics is integrated into our lives and our technology. From time immemorial, we as humans have been intrigued by, perplexed by, and entertained by observing and analyzing ourselves and the natural world around us. Science and technology have evolved to a point where we can empirically record a measure of a biological or behavioral feature and use it for recognizing patterns, trends, and/or discrete phenomena, such as individuals, and this is what biometrics is all about. Understanding some of the ways in which we use biometrics, and for what specific purposes, is what this book is all about

    Convex modeling with priors

    Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2006. Includes bibliographical references (leaves 159-169). As the study of complex interconnected networks becomes widespread across disciplines, modeling the large-scale behavior of these systems becomes both increasingly important and increasingly difficult. In particular, it is of paramount importance to utilize available prior information about the system's structure when building data-driven models of complex behavior. This thesis provides a framework for building models that incorporate domain specific knowledge and glean information from unlabeled data points. I present a methodology to augment standard methods in statistical regression with priors. These priors might include how the output series should behave or the specifics of the functional form relating inputs to outputs. My approach is optimization driven: by formulating a concise set of goals and constraints, approximate models may be systematically derived. The resulting approximations are convex and thus have only global minima and can be solved efficiently. The functional relationships amongst data are given as sums of nonlinear kernels that are expressive enough to approximate any mapping. Depending on the specifics of the prior, different estimation algorithms can be derived, and relationships between various types of data can be discovered using surprisingly few examples. The utility of this approach is demonstrated through three exemplary embodiments. When the output is constrained to be discrete, a powerful set of algorithms for semi-supervised classification and segmentation result. When the output is constrained to follow Markovian dynamics, techniques for nonlinear dimensionality reduction and system identification are derived. 
Finally, when the output is constrained to be zero on a given set and non-zero everywhere else, a new algorithm for learning latent constraints in high-dimensional data is recovered. I apply the algorithms derived from this framework to a varied set of domains. The dissertation provides a new interpretation of the so-called Spectral Clustering algorithms for data segmentation and suggests how they may be improved. I demonstrate the tasks of tracking RFID tags from signal strength measurements, recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences. Lastly, I discuss empirical methods to detect conserved quantities and learn constraints defining data sets. By Benjamin Recht. Ph.D.

    Extending procrustes analysis : building multi-view 2-D models from 3-D human shape samples

    This dissertation formalizes the construction of multi-view 2D shape models from 3D data. We propose several extensions of the well-known Procrustes Analysis (PA) algorithm that allow modeling rigid and non-rigid transformations in an efficient manner. The proposed strategies are successfully tested on faces and human bodies datasets. In human perception applications one can set physical restrictions, such as defining faces and human skeletons as sets of anatomical landmarks or articulated bodies. However, the high variation of facial expressions and human postures from different viewpoints makes problems like face tracking or human pose estimation extremely challenging. The common approach to handle large viewpoint variations is training the models with several labeled images from different viewpoints. However, this approach has several important drawbacks: (1) it is not clear how much it is necessary to enhance the dataset with images from different viewpoints in order to build unbiased 2D models; (2) extending the training set without this evaluation would unnecessarily increase memory and computation requirements to train the models; and (3) obtaining new labeled images from different viewpoints can be a difficult task because of the expensive labeling cost; finally, (4) a non-uniform coverage of the different viewpoints of a person leads to biased 2D models. In this dissertation we propose successive extensions of PA to address these issues. First of all, we introduce Projected Procrustes Analysis (PPA) as a formalization for building multi-view 2D rigid models from 3D datasets. PPA rotates and projects every 3D training shape and builds a multi-view 2D model from this enhanced training set. We also introduce common parameterizations of rotations, as well as mechanisms to uniformly sample the rotation space. 
We show that uniformly distributed rotations generate unbiased 2D models, while non-uniform rotations lead to models representing some viewpoints better than others. Although PPA has been successful in building multi-view 2D models, it requires an enhanced dataset that increases the computational requirements in space and time. In order to address these PA and PPA drawbacks, we propose Continuous Procrustes Analysis (CPA). CPA extends PPA within a functional analysis framework and constructs multi-view 2D rigid models in an efficient way through integrating all possible rotations in a given domain. We show that CPA models are inherently unbiased because of their integral formulation. However, CPA is not able to capture non-rigid deformations from the dataset. Next, in order to efficiently compute multi-view 2D deformable models from 3D data, we introduce Subspace Procrustes Analysis (SPA). By adding a subspace in the PA formulation, SPA is able to model non-rigid deformations, as well as rigid 3D transformations of the training set. We developed a discrete (DSPA) and continuous (CSPA) formulation to provide a better understanding of the problem, where DSPA samples and CSPA integrates the 3D rotation space. Finally, we illustrate the benefits of our multi-view 2D deformable models in the task of human pose estimation. We first reformulate the problem as feature selection by subspace matching, and propose an efficient approach for this task. Our method is much more efficient than the state-of-the-art feature selection by subspace matching approaches, and it is able to handle larger number of outliers. Next, we show that our multi-view 2D deformable models, combined with the subspace matching method, outperform state-of-the-art methods of human pose estimation. Our approach is more accurate in the joint positions and limb lengths because we use unbiased 2D models trained on 3D Motion Capture datasets. 
Our models are not biased to any particular point of view and they can successfully reconstruct different non-rigid deformations and viewpoints. Moreover, they are efficient in both learning and test times.
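As background for the Procrustes Analysis (PA) that this dissertation extends, the following sketch aligns one 2D landmark set to another with a similarity transform (translation, rotation, uniform scale). It covers only the classical pairwise step, not PPA, CPA, or SPA. The closed-form 2D rotation, the function names, and the example shape are assumptions made for illustration.

```python
import math

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def procrustes_align_2d(source, target):
    """Least-squares similarity alignment of source onto target.

    Both inputs are lists of (x, y) landmarks in corresponding order.
    Returns the aligned source points, the rotation angle, and the scale.
    """
    cs, ct = centroid(source), centroid(target)
    a = [(x - cs[0], y - cs[1]) for x, y in source]
    b = [(x - ct[0], y - ct[1]) for x, y in target]
    # Sums of dot and cross products give the optimal 2D rotation in closed form.
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    cross = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a, b))
    theta = math.atan2(cross, dot)
    scale = math.hypot(dot, cross) / sum(ax * ax + ay * ay for ax, ay in a)
    c, s = math.cos(theta), math.sin(theta)
    aligned = [(scale * (c * ax - s * ay) + ct[0],
                scale * (s * ax + c * ay) + ct[1]) for ax, ay in a]
    return aligned, theta, scale

# Hypothetical shape and a rotated, scaled, translated copy of it.
shape = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
angle, factor, shift = math.radians(30.0), 2.0, (5.0, -1.0)
target = [(factor * (math.cos(angle) * x - math.sin(angle) * y) + shift[0],
           factor * (math.sin(angle) * x + math.cos(angle) * y) + shift[1])
          for x, y in shape]
aligned, theta, scale = procrustes_align_2d(shape, target)
```

Generalized PA repeats this pairwise step against a running mean shape until the mean stops changing.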

    Data-driven shape analysis and processing

    Data-driven methods serve an increasingly important role in discovering geometric, structural, and semantic relationships between shapes. In contrast to traditional approaches that process shapes in isolation of each other, data-driven methods aggregate information from 3D model collections to improve the analysis, modeling and editing of shapes. Through reviewing the literature, we provide an overview of the main concepts and components of these methods, as well as discuss their application to classification, segmentation, matching, reconstruction, modeling and exploration, as well as scene analysis and synthesis. We conclude our report with ideas that can inspire future research in data-driven shape analysis and processing