671 research outputs found

    Grassmann Learning for Recognition and Classification

    Computational performance associated with high-dimensional data is a common challenge for real-world classification and recognition systems. Subspace learning has received considerable attention as a means of finding an efficient low-dimensional representation that leads to better classification and efficient processing. A Grassmann manifold is a space that promotes smooth surfaces, where points represent subspaces and the relationship between points is defined by a mapping of an orthogonal matrix. Grassmann learning involves embedding high dimensional subspaces and kernelizing the embedding onto a projection space where distance computations can be effectively performed. In this dissertation, Grassmann learning and its benefits towards action classification and face recognition in terms of accuracy and performance are investigated and evaluated. Grassmannian Sparse Representation (GSR) and Grassmannian Spectral Regression (GRASP) are proposed as Grassmann inspired subspace learning algorithms. GSR is a novel subspace learning algorithm that combines the benefits of Grassmann manifolds with sparse representations using least squares loss ℓ1-norm minimization for improved classification. GRASP is a novel subspace learning algorithm that leverages the benefits of Grassmann manifolds and Spectral Regression in a framework that supports high discrimination between classes and achieves computational benefits by using manifold modeling and avoiding eigen-decomposition. The effectiveness of GSR and GRASP is demonstrated for computationally intensive classification problems: (a) multi-view action classification using the IXMAS Multi-View dataset, the i3DPost Multi-View dataset, and the WVU Multi-View dataset, (b) 3D action classification using the MSRAction3D dataset and MSRGesture3D dataset, and (c) face recognition using the ATT Face Database, Labeled Faces in the Wild (LFW), and the Extended Yale Face Database B (YALE).
Additional contributions include the definition of Motion History Surfaces (MHS) and Motion Depth Surfaces (MDS) as descriptors suitable for activity representations in video sequences and 3D depth sequences. An in-depth analysis of Grassmann metrics is applied on high dimensional data with different levels of noise and data distributions which reveals that standardized Grassmann kernels are favorable over geodesic metrics on a Grassmann manifold. Finally, an extensive performance analysis is made that supports Grassmann subspace learning as an effective approach for classification and recognition.
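
The abstract above contrasts Grassmann kernels with geodesic metrics but gives no formulas. As a rough illustration only, the sketch below computes two textbook quantities between subspaces: the geodesic distance from principal angles and the projection kernel. It assumes NumPy, and the function names (`orthonormal_basis`, `principal_angles`, `geodesic_distance`, `projection_kernel`) and the random test data are hypothetical; it is not the GSR or GRASP algorithm from the dissertation.

```python
import numpy as np

def orthonormal_basis(X):
    """Return an orthonormal basis (as columns) for the column span of X, via QR."""
    Q, _ = np.linalg.qr(X)
    return Q

def principal_angles(U, V):
    """Principal angles between the subspaces spanned by orthonormal U and V."""
    # Singular values of U^T V are the cosines of the principal angles.
    s = np.linalg.svd(U.T @ V, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

def geodesic_distance(U, V):
    """Geodesic (arc-length) distance: the 2-norm of the principal angles."""
    return float(np.linalg.norm(principal_angles(U, V)))

def projection_kernel(U, V):
    """Projection kernel k(U, V) = ||U^T V||_F^2, a common Grassmann kernel."""
    return float(np.linalg.norm(U.T @ V, "fro") ** 2)

# Hypothetical data: two random 3-dimensional subspaces of R^20.
rng = np.random.default_rng(0)
A = orthonormal_basis(rng.standard_normal((20, 3)))
B = orthonormal_basis(rng.standard_normal((20, 3)))

print("geodesic distance:", geodesic_distance(A, B))
print("projection kernel:", projection_kernel(A, B))
```

The kernel value equals the subspace dimension when the two subspaces coincide and shrinks toward zero as they become orthogonal, which is why it can be plugged into kernel classifiers without computing angles explicitly.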

    Support matrix machine: A review

    The support vector machine (SVM) is one of the most studied paradigms in machine learning for classification and regression problems. It relies on vectorized input data. However, a significant portion of real-world data exists in matrix format and is fed to an SVM by reshaping the matrices into vectors. Reshaping disrupts the spatial correlations inherent in the matrix data. Converting matrices into vectors also yields high-dimensional input data, which introduces significant computational complexity. To overcome these issues in classifying matrix input data, the support matrix machine (SMM) was proposed. It is one of the emerging methodologies tailored for handling matrix input data. The SMM method preserves the structural information of the matrix data by using the spectral elastic net property, which is a combination of the nuclear norm and the Frobenius norm. This article provides the first in-depth analysis of the development of the SMM model, which can serve as a thorough summary for both novices and experts. We discuss numerous SMM variants, such as robust, sparse, class imbalance, and multi-class classification models. We also analyze the applications of the SMM model and conclude the article by outlining potential future research avenues that may motivate academics to advance the SMM algorithm.
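
The abstract describes the spectral elastic net only as a combination of the nuclear norm and the Frobenius norm. A minimal sketch of such a penalty follows, assuming the commonly used form 0.5·‖W‖_F² + τ·‖W‖_*; the weighting, the function name `spectral_elastic_net`, and the parameter `tau` are assumptions for illustration, not details taken from the review.

```python
import numpy as np

def spectral_elastic_net(W, tau):
    """Assumed penalty: 0.5 * ||W||_F^2 + tau * ||W||_* for a weight matrix W."""
    frobenius_sq = np.sum(W ** 2)          # squared Frobenius norm
    nuclear = np.linalg.norm(W, "nuc")     # nuclear norm: sum of singular values
    return float(0.5 * frobenius_sq + tau * nuclear)

# Hypothetical weight matrix with singular values 3 and 4.
W = np.diag([3.0, 4.0])
penalty = spectral_elastic_net(W, tau=2.0)
print(penalty)  # 0.5 * 25 + 2 * 7 = 26.5
```

Because the nuclear norm acts on singular values, the penalty depends on the matrix structure of `W` rather than on its flattened entries alone, which is the property the abstract attributes to SMM.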

    A New Sparse Representation Algorithm for 3D Human Pose Estimation

    This paper addresses the problem of recovering 3D human pose from single 2D images using Sparse Representation. While recent Sparse Representation (SR) based 3D human pose estimation methods have attained promising results estimating human poses from single images, their performance depends on the availability of large labeled datasets. However, in many real-world applications, access to sufficient labeled data may be expensive and/or time consuming, whereas it is relatively easy to acquire a large amount of unlabeled data. Moreover, all SR based 3D pose estimation methods consider only the information of the input feature space and cannot utilize the information of the pose space. In this paper, we propose a new framework based on sparse representation for 3D human pose estimation which uses both labeled and unlabeled data. Furthermore, the proposed method can exploit the information of the pose space to improve the pose estimation accuracy. Experimental results show that the performance of the proposed method is significantly better than that of state-of-the-art 3D human pose estimation methods.


    We present four contributions to visual surveillance: (a) an action recognition method based on the characteristics of human motion in image space; (b) a study of the strengths of five regression techniques for monocular pose estimation that highlights the advantages of kernel PLS; (c) a learning-based method for detecting objects carried by humans requiring minimal annotation; (d) an interactive video segmentation system that reduces supervision by using occlusion and long term spatio-temporal structure information. We propose a representation for human actions that is based solely on motion information and that leverages the characteristics of human movement in the image space. The representation is best suited to visual surveillance settings in which the actions of interest are highly constrained, but also works on more general problems if the actions are ballistic in nature. Our computationally efficient representation achieves good recognition performance on both a commonly used action recognition dataset and on a dataset we collected to simulate a checkout counter. We study discriminative methods for 3D human pose estimation from single images, which build a map from image features to pose. The main difficulty with these methods is the insufficiency of training data due to the high dimensionality of the pose space. However, real datasets can be augmented with data from character animation software, so the scalability of existing approaches becomes important. We argue that Kernel Partial Least Squares approximates Gaussian Process regression robustly, enabling the use of larger datasets, and we show in experiments that kPLS outperforms two state-of-the-art methods based on GP. The high variability in the appearance of carried objects suggests using their relation to the human silhouette to detect them. 
We adopt a generate-and-test approach that produces candidate regions from protrusion, color contrast and occlusion boundary cues and then filters them with a kernel SVM classifier on context features. Our method exceeds state-of-the-art accuracy and has good generalization capability. We also propose a Multiple Instance Learning framework for the classifier that reduces annotation effort by two orders of magnitude while maintaining comparable accuracy. Finally, we present an interactive video segmentation system that trades off a small amount of segmentation quality for significantly less supervision than is necessary in systems in the literature. While applications like video editing could not directly use the output of our system, reasoning about the trajectories of objects in a scene or learning coarse appearance models is still possible. The unsupervised segmentation component at the base of our system effectively employs occlusion boundary cues and achieves competitive results on an unsupervised segmentation dataset. On videos used to evaluate interactive methods, our system requires less interaction time than others, does not rely on appearance information and can extract multiple objects at the same time.

    Investigating Brain Functional Networks in a Riemannian Framework

    The brain is a complex system of interconnected components that can be categorized at different spatio-temporal levels by evaluating their physical connections and the corresponding functionalities. For studying brain connectivity at the macroscale, Magnetic Resonance Imaging (MRI), in all its modalities, has proven to be an important tool. In particular, functional MRI (fMRI) makes it possible to record brain activity either at rest or under different cognitive task conditions, and assists in mapping the functional connectivity of the brain. The brain functional connectivity information extracted from fMRI images can be defined using a graph representation, i.e. a mathematical object consisting of nodes, the brain regions, and edges, the links between regions. This representation has produced novel insights into brain connectivity and provided evidence that brain networks are not randomly linked. Indeed, the brain network has a small-world structure, with several different properties of segregation and integration that account for specific functions and mental conditions. Moreover, network analysis makes it possible to recognize and analyze patterns of brain functional connectivity that characterize a group of subjects. In recent decades, many advances have been made in understanding the functioning of the human brain, yet many issues, from both the biological and the methodological perspective, still need to be addressed. For example, sub-modular brain organization is still under debate, since it is necessary to understand how the brain is functionally organized. At the same time, a comprehensive organization of functional connectivity is mostly unknown, and the dynamical reorganization of functional connectivity is emerging as a new frontier for analyzing brain dynamics.
Moreover, recognizing functional connectivity patterns in patients affected by mental disorders is still a challenging task, which motivates the development of new tools to address it. In this dissertation, we propose novel methodological approaches to answer some of these biological and neuroscientific questions. We investigate methods for analyzing and detecting heritability in twins' task-induced functional connectivity profiles. In this approach, we propose a geodesic metric-based method for estimating the similarity between functional connectivity profiles, taking into account the manifold-related properties of symmetric positive definite matrices. We also propose a computational framework for classifying and discriminating brain connectivity graphs of healthy subjects and of pathological subjects affected by mental disorder, using geodesic metric-based clustering of brain graphs on the manifold space. Within the same framework, we propose an approach based on dictionary learning to encode the high-dimensional connectivity data into a vectorial representation that is useful for classification and for determining the regions of the brain graphs responsible for this segregation. Finally, we propose an effective way to analyze dynamical functional connectivity, building a similarity representation of fMRI dynamic functional connectivity states by exploiting modular properties of graph Laplacians, geodesic clustering, and manifold learning.
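
The abstract does not say which geodesic metric on symmetric positive definite (SPD) matrices is used. As one plausible illustration, the sketch below computes the affine-invariant Riemannian distance, d(A, B) = ‖log(A^{-1/2} B A^{-1/2})‖_F. The choice of metric, the function names `spd_inv_sqrt` and `airm_distance`, and the toy matrices are assumptions, not details from the dissertation.

```python
import numpy as np

def spd_inv_sqrt(A):
    """Inverse square root of an SPD matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    # V diag(1/sqrt(w)) V^T; the division scales each eigenvector column.
    return (V / np.sqrt(w)) @ V.T

def airm_distance(A, B):
    """Affine-invariant Riemannian distance between SPD matrices A and B."""
    S = spd_inv_sqrt(A)
    M = S @ B @ S
    M = 0.5 * (M + M.T)  # symmetrize to suppress round-off asymmetry
    w = np.linalg.eigvalsh(M)
    # Frobenius norm of log(M) equals the 2-norm of the log-eigenvalues.
    return float(np.sqrt(np.sum(np.log(w) ** 2)))

# Toy SPD matrices standing in for connectivity (e.g. covariance) matrices.
A = np.eye(2)
B = np.diag([np.e, np.e ** 2])
d_ab = airm_distance(A, B)
print(d_ab)  # sqrt(1^2 + 2^2) = sqrt(5)
```

Unlike the Euclidean distance between matrix entries, this distance is unchanged when both matrices undergo the same invertible linear transformation, which is one reason manifold-aware metrics are used for connectivity matrices.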


    Empowering engineering with data, machine learning and artificial intelligence: a short introductive review

    Simulation-based engineering has been a major protagonist of the technology of the last century. However, models based on well-established physics sometimes fail to describe the observed reality. They often exhibit noticeable differences between physics-based model predictions and measurements. This difference is due to several reasons: practical (uncertainty and variability of the parameters involved in the models) and epistemic (the models themselves are in many cases a crude approximation of a rich reality). On the other hand, approaching reality from experimental data is a valuable approach because of its generality. However, this approach involves many difficulties: model and experimental variability; the need for a large number of measurements to accurately represent rich solutions (extremely nonlinear or fluctuating), together with the associated cost and the technical difficulties of performing them; and finally, the difficulty of explaining and certifying, both of which are key aspects in most engineering applications. This work overviews some of the most remarkable progress in the field in recent years.

    Model Order Reduction

    The increasing complexity of models used to predict real-world systems leads to the need for algorithms that replace complex models with far simpler ones while preserving the accuracy of the predictions. This three-volume handbook covers methods as well as applications. This third volume focuses on applications in engineering, biomedical engineering, computational physics and computer science.