99 research outputs found

    Computational Modeling of Facial Response for Detecting Differential Traits in Autism Spectrum Disorders

    This dissertation proposes novel computational modeling and computer vision methods for the analysis and discovery of differential traits in subjects with Autism Spectrum Disorders (ASD) using video and three-dimensional (3D) images of face and facial expressions. ASD is a neurodevelopmental disorder that impairs an individual’s nonverbal communication skills. This work studies ASD from the pathophysiology of facial expressions which may manifest atypical responses in the face. State-of-the-art psychophysical studies mostly employ naïve human raters to visually score atypical facial responses of individuals with ASD, which may be subjective, tedious, and error-prone. A few quantitative studies use intrusive sensors on the face of the subjects with ASD, which in turn may inhibit or bias the natural facial responses of these subjects. This dissertation proposes non-intrusive computer vision methods to alleviate these limitations in the investigation for differential traits from the spontaneous facial responses of individuals with ASD. Two IRB-approved psychophysical studies are performed involving two groups of age-matched subjects: one for subjects diagnosed with ASD and the other for subjects who are typically-developing (TD). The facial responses of the subjects are computed from their facial images using the proposed computational models and then statistically analyzed to infer the differential traits for the group with ASD. A novel computational model is proposed to represent the large volume of 3D facial data in a small pose-invariant Frenet frame-based feature space. The inherent pose-invariant property of the proposed features alleviates the need for an expensive 3D face registration in the pre-processing step. The proposed modeling framework is not only computationally efficient but also offers competitive performance in 3D face and facial expression recognition tasks when compared with that of the state-of-the-art methods.
This computational model is applied in the first experiment to quantify subtle facial muscle response from the geometry of 3D facial data. Results show a statistically significant asymmetry in a specific pair of facial muscle activations (p < 0.05) for the group with ASD, which suggests the presence of a psychophysical trait (also known as an 'oddity') in the facial expressions. For the first time in the ASD literature, the facial action coding system (FACS) is employed to classify the spontaneous facial responses based on facial action units (FAUs). Statistical analyses reveal a significantly (p < 0.01) higher prevalence of smile expression (FAU 12) for the ASD group when compared with the TD group. The high prevalence of smile co-occurred with significantly averted gaze (p < 0.05) in the group with ASD, which is indicative of an impaired reciprocal communication. The metric associated with incongruent facial and visual responses suggests a behavioral biomarker for ASD. The second experiment shows a higher prevalence of mouth frown (FAU 15) and significantly lower correlations between the activation of several FAU pairs (p < 0.05) in the group with ASD when compared with the TD group. The proposed computational modeling in this dissertation offers promising biomarkers, which may aid in early detection of subtle ASD-related traits, and thus enable an effective intervention strategy in the future.

    On the formulation and uses of SVD-based generalized curvatures

    2016 Summer. Includes bibliographical references. In this dissertation we consider the problem of computing generalized curvature values from noisy, discrete data and applications of the provided algorithms. We first establish a connection between the Frenet-Serret Frame, typically defined on an analytical curve, and the vectors from the local Singular Value Decomposition (SVD) of a discretized time-series. Next, we expand upon this connection to relate generalized curvature values, or curvatures, to a scaled ratio of singular values. Initially, the local singular value decomposition is centered on a point of the discretized time-series. This provides for an efficient computation of curvatures when the underlying curve is known. However, when the structure of the curve is not known, for example, when noise is present in the tabulated data, we propose two modifications. The first modification computes the local singular value decomposition on the mean-centered data of a windowed selection of the time-series. We observe that the mean-centered version increases the stability of the curvature estimations in the presence of signal noise. The second modification is an adaptive method for selecting the size of the window, or local ball, to use for the singular value decomposition. This allows us to use a large window size when curvatures are small, which reduces the effects of noise thanks to the use of a large number of points in the SVD, and to use a small window size when curvatures are large, thereby best capturing the local curvature. Overall we observe that adapting the window size to the data enhances the estimates of generalized curvatures. The combination of these two modifications produces a tool for computing generalized curvatures with reasonable precision and accuracy. Finally, we compare our algorithm, with and without modifications, to existing numerical curvature techniques on different types of data such as that from the Microsoft Kinect 2 sensor.
To address the topic of action segmentation and recognition, a popular topic within the field of computer vision, we created a new dataset from this sensor showcasing a pose space skeletonized representation of individuals performing continuous human actions as defined by the MSRC-12 challenge. When this data is optimally projected onto a low-dimensional space, we observe that each human motion lies on a distinguished line, plane, hyperplane, etc. During transitions between motions, either the dimension of the optimal subspace changes significantly, or the trajectory of the curve through pose space nearly reverses. We use our methods of computing generalized curvature values to identify these locations, categorized as either high curvatures or changing curvatures. The geometric characterization of the time-series allows us to segment individual, or geometrically distinct, motions. Finally, using these segments, we construct a methodology for selecting motions to conjoin for the task of action classification.
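
The mean-centered, windowed SVD described in this abstract can be illustrated with a minimal sketch. It covers only the first (ordinary) curvature with a fixed window, not the adaptive window or the higher generalized curvatures. The function name `local_svd_curvature`, the parameter `half_width`, and the scaling constant sqrt(15) are assumptions made for this illustration: the constant comes from a second-order Taylor expansion of an arc-length-parametrized curve over a symmetric window and is not taken from the dissertation.

```python
import numpy as np

def local_svd_curvature(points, half_width):
    """Estimate the first curvature at each interior sample of a discretized curve.

    points: (n, d) array of samples along the curve, d >= 2.
    half_width: number of neighbours taken on each side (hypothetical parameter).
    Samples too close to either end are left as NaN.
    """
    points = np.asarray(points, dtype=float)
    n = len(points)
    kappa = np.full(n, np.nan)
    for i in range(half_width, n - half_width):
        window = points[i - half_width : i + half_width + 1]
        # Mean-centering the window, as in the first modification of the abstract.
        centered = window - window.mean(axis=0)
        # Singular values in descending order; s[0] follows the tangent
        # direction and s[1] the principal normal direction.
        s = np.linalg.svd(centered, compute_uv=False)
        # Half the polyline length of the window approximates the half arc length h.
        h = 0.5 * np.linalg.norm(np.diff(window, axis=0), axis=1).sum()
        # Assumed leading-order scaling: sigma_2 / sigma_1 ~ kappa * h / sqrt(15).
        kappa[i] = np.sqrt(15.0) * s[1] / (s[0] * h)
    return kappa
```

On a circle of radius R sampled finely, the estimate should come out close to 1/R. A larger `half_width` averages over more points and so should dampen noise, at the cost of blurring sharp bends, which is the trade-off the adaptive window in the abstract is meant to address.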

    Computer-aided Visualization of Colonoscopy

    Colonoscopy is the most widely used medical technique to examine the human large intestine (colon) and eliminate precancerous or malignant lesions, i.e., polyps. It uses a high-definition camera to examine the inner surface of the colon. Very often, a portion of the colon surface is not visualized during the procedure. Unsurveyed portions of the colon can harbor polyps that then progress to colorectal cancer. Unfortunately, it is hard for the endoscopist to realize there is unsurveyed surface from the video as it is formed. A system to alert endoscopists to missed surface area could thus more fully protect patients from colorectal cancer following colonoscopy. In this dissertation, computer-aided visualization techniques were developed to solve this problem:
    1. A novel Simultaneous Localization and Mapping (SLAM) algorithm called RNNSLAM was proposed to address the difficulties of applying a traditional SLAM system on colonic images. I improved a standard SLAM system with a previously proposed Recurrent Neural Network for Depth and Pose Estimation (RNN-DP). The combination of SLAM’s optimization mechanism and RNN-DP’s prior knowledge achieved state-of-the-art performance on colonoscopy, especially addressing the drift problem in both SLAM and RNN-DP. A fusion module was added to this system to generate a dense 3D surface.
    2. I conducted exploratory research on recognizing colonic places that have been visited based on video frames. This technique, called image relocalization or retrieval, is needed for helping the endoscopist to fully survey the previously unsurveyed regions. A benchmark testing dataset was created for colon image retrieval. Deep neural networks were successfully trained using Structure from Motion results on colonoscopy and achieved promising results.
    3. To visualize highly curved portions of a colon or the whole colon, a generalized cylinder deformation algorithm was proposed to semi-flatten the geometry of the colon model for more succinct and global visualization.
    Doctor of Philosophy

    To Draw or Not to Draw: Recognizing Stroke-Hover Intent in Gesture-Free Bare-Hand Mid-Air Drawing Tasks

    Over the past several decades, technological advancements have introduced new modes of communication with computers, bringing a shift away from traditional mouse and keyboard interfaces. While touch-based interactions are abundantly used today, the latest developments in computer vision, body tracking stereo cameras, and augmented and virtual reality have now enabled communicating with computers using spatial input in the physical 3D space. These techniques are now being integrated into several design-critical tasks like sketching, modeling, etc. through sophisticated methodologies and the use of specialized instrumented devices. One of the prime challenges in design research is to make this spatial interaction with the computer as intuitive as possible for the users. Drawing curves in mid-air with fingers is a fundamental task with applications to 3D sketching, geometric modeling, handwriting recognition, and authentication. Sketching in general is a crucial mode for effective idea communication between designers. Mid-air curve input is typically accomplished through instrumented controllers, specific hand postures, or pre-defined hand gestures, in the presence of depth and motion sensing cameras. The user may use any of these modalities to express the intention to start or stop sketching. However, apart from suffering from issues like lack of robustness, the use of such gestures, specific postures, or the necessity of instrumented controllers for design-specific tasks further results in an additional cognitive load on the user. To address the problems associated with different mid-air curve input modalities, the presented research discusses the design, development, and evaluation of data-driven models for intent recognition in non-instrumented, gesture-free, bare-hand mid-air drawing tasks.
The research is motivated by a behavioral study that demonstrates the need for such an approach due to the lack of robustness and intuitiveness while using hand postures and instrumented devices. The main objective is to study how users move during mid-air sketching, develop qualitative insights regarding such movements, and consequently implement a computational approach to determine when the user intends to draw in mid-air without the use of an explicit mechanism (such as an instrumented controller or a specified hand posture). The idea is to record the user’s hand trajectory and classify each point on it as either hover or stroke. Drawing inspiration from the way users sketch in mid-air, this research first specifies the necessity for an alternate approach for processing bare-hand mid-air curves in a continuous fashion. Further, this research presents a novel drawing intent recognition workflow for every recorded drawing point, using three different approaches. We begin with recording mid-air drawing data and developing a classification model based on the extracted geometric properties of the recorded data. The main goal behind developing this model is to identify drawing intent from critical geometric and temporal features. In the second approach, we explore the variations in prediction quality of the model by improving the dimensionality of data used as mid-air curve input. In the third approach, we seek to understand the drawing intention from mid-air curves using sophisticated dimensionality reduction neural networks such as autoencoders. Finally, the broad-level implications of this research are discussed, with potential development areas in the design and research of mid-air interactions.

    Application of gauge symmetry and soliton theory to folded proteins (Application de la symétrie de jauge et de la théorie des solitons aux protéines repliées)

    The purpose of this thesis is to investigate protein folding by means of the general concepts of gauge invariance and universality. The gauge structure emerges in the Frenet equation, which is used to describe the shape of the protein backbone. The gauge invariance principle leads us to an effective energy functional for a protein, which has been found to capture the universal properties of folded proteins in their collapse phase, characterized by the scaling law of the gyration radius on the tertiary level of protein structure. In this thesis, the existence of wide universality on the secondary level of protein structure is investigated. The synthesis of the gauge-invariant energy functional with the discrete Frenet equation leads to a soliton solution, which is identified as the helix-loop-helix motif in proteins.
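
For reference, the Frenet equation that this abstract builds on can be written in its standard continuous (Frenet-Serret) textbook form. The symbols below are conventional notation and are not taken from the thesis: t, n, b are the unit tangent, normal and binormal vectors, s is arc length, κ is curvature and τ is torsion. The thesis works with a discrete version adapted to the protein backbone, which is not reproduced here.

```latex
\begin{aligned}
\frac{d\mathbf{t}}{ds} &= \kappa\,\mathbf{n}, \\
\frac{d\mathbf{n}}{ds} &= -\kappa\,\mathbf{t} + \tau\,\mathbf{b}, \\
\frac{d\mathbf{b}}{ds} &= -\tau\,\mathbf{n}.
\end{aligned}
```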

    Deformation Based Curved Shape Representation

    Representation and modelling of an object's shape is critical in object recognition, synthesis, tracking and many other applications in computer vision. As a result, there is a wide range of approaches in formulating representation space and quantifying the notion of similarity between shapes. A similarity metric between shapes is a basic building block in modelling shape categories, optimizing shape-valued functionals, and designing a classifier. Consequently, any subsequent shape-based computation is fundamentally dependent on the computational efficiency, robustness, and invariance to shape-preserving transformations of the defined similarity metric. In this thesis, we propose a novel finite-dimensional shape representation framework that leads to a computationally efficient, closed-form, and noise-tolerant similarity distance function. Several important characteristics of the proposed curved shape representation approach are discussed in relation to earlier works. Subsequently, two different solutions are proposed for optimal parameter estimation of curved shapes, hence providing two possible solutions for the point correspondence estimation problem between two curved shapes. Later in the thesis, we show that several statistical models can readily be adapted to the proposed shape representation framework for object category modelling. The thesis concludes by exploring potential applications of the proposed curved shape representation in 3D facial surface and facial expression representation and modelling.

    Intrinsic dimensionality in vision: Nonlinear filter design and applications

    Biological vision and computer vision cannot be treated independently anymore. The digital revolution and the emergence of more and more sophisticated technical applications caused a symbiosis between the two communities. Competitive technical devices challenging the human performance rely increasingly on algorithms motivated by the human vision system. On the other hand, computational methods can be used to gain a richer understanding of neural behavior, e.g. the behavior of populations of multiple processing units. The relations between computational approaches and biological findings range from low-level vision to cortical areas responsible for higher cognitive abilities. In early stages of the visual cortex, cells have been recorded whose responses could not be explained by the standard approach of orientation- and frequency-selective linear filters. These cells did not respond to straight lines or simple gratings but they fired whenever a more complicated stimulus, like a corner or an end-stopped line, was presented within the receptive field. Using the concept of intrinsic dimensionality, these cells can be classified as intrinsic-two-dimensional systems. The intrinsic dimensionality determines the number of degrees of freedom in the domain which is required to completely determine a signal. A constant image has dimension zero, straight lines and trigonometric functions in one direction have dimension one, and the remaining signals, which require the full number of degrees of freedom, have dimension two. In these terms, the reported cells respond to two-dimensional signals only. Motivated by the classical approach, which can be realized by orientation- and frequency-selective Gabor-filter functions, a generalized Gabor framework is developed in the context of second-order Volterra systems.
The generalized Gabor approach is then used to design intrinsic-two-dimensional systems which have the same selectivity properties as the reported cells in early visual cortex. Numerical cognition is commonly assumed to be a higher cognitive ability of humans. The estimation of the number of things from the environment requires a high degree of abstraction. Several studies showed that humans and other species have access to this abstract information. But it is still unclear how this information can be extracted by neural hardware. If one wants to deal with this issue, one has to think about the immense invariance property of number. One can apply a large number of operations to objects without changing their number. In this work, this problem is considered from a topological perspective. Well-known relations between differential geometry and topology are used to develop a computational model. Surprisingly, the resulting operators providing the features which are integrated in the system are intrinsic-two-dimensional operators. This model is used to conduct standard number estimation experiments. The results are then compared to reported human behavior. The last topic of this work is active object recognition. The ability to move the information gathering device, like humans can move their eyes, provides the opportunity to choose the next action. Studies of human saccade behavior suggest that this is not done in a random manner. In order to decrease the time an active object recognition system needs to reach a certain level of performance, several action selection strategies are investigated. The strategies considered within this work are based on information-theoretical and probabilistic concepts. These strategies are finally compared to a strategy based on an intrinsic-two-dimensional operator. All three topics are investigated with respect to their relation to the concept of intrinsic dimensionality from a mathematical point of view.
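
The three classes of intrinsic dimensionality defined in this abstract (constant signals, signals varying in one direction, all remaining signals) can be illustrated with a small sketch. It does not use the generalized Gabor or Volterra framework of the thesis; it swaps in a gradient structure tensor, a different and simpler technique, purely to make the definition concrete. The function name `intrinsic_dimension` and both tolerances are assumptions chosen for this example.

```python
import numpy as np

def intrinsic_dimension(patch, abs_tol=1e-12, rel_tol=1e-3):
    """Classify a 2D image patch as intrinsically 0-, 1- or 2-dimensional.

    Uses the eigenvalues of the patch-averaged gradient structure tensor
    (an illustrative stand-in, not the method of the thesis).
    """
    patch = np.asarray(patch, dtype=float)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    gy, gx = np.gradient(patch)
    jxx = np.mean(gx * gx)
    jxy = np.mean(gx * gy)
    jyy = np.mean(gy * gy)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    lam_small, lam_large = np.linalg.eigvalsh(np.array([[jxx, jxy], [jxy, jyy]]))
    if lam_large <= abs_tol:
        return 0  # no variation in any direction: constant patch
    if lam_small <= rel_tol * lam_large:
        return 1  # variation along a single direction, e.g. a straight grating
    return 2      # variation in both directions, e.g. corners or blobs

# Example patches: constant, grating along x, separable 2D pattern.
x = np.linspace(0.0, 4.0 * np.pi, 64)
xx, yy = np.meshgrid(x, x)
constant_patch = np.ones((64, 64))
grating_patch = np.sin(xx)
checker_patch = np.sin(xx) * np.sin(yy)
```

Averaging the tensor over the whole patch keeps the sketch short. A per-pixel classification would need a local smoothing window, and the tolerances would have to be tuned for noisy images.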

    3D corrective nose reconstruction from a single image

    There is a steadily growing range of applications that can benefit from facial reconstruction techniques, leading to an increasing demand for reconstruction of high-quality 3D face models. While it is an important expressive part of the human face, the nose has received less attention than other expressive regions in the face reconstruction literature. When applying existing reconstruction methods to facial images, the reconstructed nose models are often inconsistent with the desired shape and expression. In this paper, we propose a coarse-to-fine 3D nose reconstruction and correction pipeline to build a nose model from a single image, where 3D and 2D nose curve correspondences are adaptively updated and refined. We first correct the reconstruction result coarsely using constraints of 3D-2D sparse landmark correspondences, and then heuristically update a dense 3D-2D curve correspondence based on the coarsely corrected result. A final refinement step is performed to correct the shape based on the updated 3D-2D dense curve constraints. Experimental results show the advantages of our method for 3D nose reconstruction over existing methods.

    Matching hierarchical structures for shape recognition

    In this thesis we aim to develop a framework for clustering trees and representing and learning a generative model of graph structures from a set of training samples. The approach is applied to the problem of the recognition and classification of shape abstracted in terms of its morphological skeleton. We make five contributions. The first is an algorithm to approximate tree edit-distance using relaxation labeling. The second is the introduction of the tree union, a representation capable of representing the modes of structural variation present in a set of trees. The third is an information-theoretic approach to learning a generative model of tree structures from a training set. While the skeletal abstraction of shape was chosen mainly as an experimental vehicle, we nonetheless make some contributions to the fields of skeleton extraction and its graph representation. In particular, our fourth contribution is the development of a skeletonization method that corrects curvature effects in the Hamilton-Jacobi framework, improving its localization and noise sensitivity. Finally, we propose a shape-measure capable of characterizing shapes abstracted in terms of their skeleton. This measure has a number of interesting properties. In particular, it varies smoothly as the shape is deformed and can be easily computed using the presented skeleton extraction algorithm. Each chapter presents an experimental analysis of the proposed approaches applied to shape recognition problems.