    Expanding the Family of Grassmannian Kernels: An Embedding Perspective

    Modeling videos and image-sets as linear subspaces has proven beneficial for many visual recognition tasks. However, it also incurs challenges arising from the fact that linear subspaces do not obey Euclidean geometry, but lie on a special type of Riemannian manifold known as a Grassmannian. To leverage the techniques developed for Euclidean spaces (e.g., support vector machines) with subspaces, several recent studies have proposed to embed the Grassmannian into a Hilbert space by making use of a positive definite kernel. Unfortunately, only two Grassmannian kernels are known, neither of which, as we will show, is universal, which limits their ability to approximate a target function arbitrarily well. Here, we introduce several positive definite Grassmannian kernels, including universal ones, and demonstrate their superiority over previously known kernels in various tasks, such as classification, clustering, sparse coding and hashing.
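As an illustration of what a positive definite kernel between subspaces can look like, the following is a minimal sketch of the projection kernel, a standard Grassmannian kernel from the literature. The abstract does not name the kernels it refers to, so this choice, the function names, and the random test data are all assumptions made for the example; it is not the paper's method.

```python
import numpy as np

def orthonormal_basis(A):
    """Return an orthonormal basis (n x p) for the column span of A via reduced QR."""
    Q, _ = np.linalg.qr(A)
    return Q

def projection_kernel(X, Y):
    """Projection kernel: squared Frobenius norm of X^T Y, for orthonormal X and Y."""
    return np.linalg.norm(X.T @ Y, "fro") ** 2

# Hypothetical data: two 3-dimensional subspaces of R^10.
rng = np.random.default_rng(0)
X = orthonormal_basis(rng.standard_normal((10, 3)))
Y = orthonormal_basis(rng.standard_normal((10, 3)))

# A different orthonormal basis for the same subspace as X.
R = orthonormal_basis(rng.standard_normal((3, 3)))

k_xy = projection_kernel(X, Y)
k_xx = projection_kernel(X, X)       # equals the subspace dimension
k_rot = projection_kernel(X @ R, Y)  # unchanged by the change of basis
```

The value depends only on the subspaces, not on the bases chosen to represent them, which is the property that makes such a function well defined on the Grassmannian.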

    Recognition of generalized network matrices

    In this PhD thesis, we deal with binet matrices, an extension of network matrices. The main result of this thesis is the following. A rational matrix A of size n times m can be tested for being binet in time O(n^6 m). If A is binet, our algorithm outputs a nonsingular matrix B and a matrix N such that [B N] is the node-edge incidence matrix of a bidirected graph (of full row rank) and A=B^{-1} N. Furthermore, we provide some results about Camion bases. For a matrix M of size n times m', we present a new characterization of Camion bases of M, whenever M is the node-edge incidence matrix of a connected digraph (with one row removed). Then, a general characterization of Camion bases as well as a recognition procedure which runs in O(n^2m') are given. An algorithm which finds a Camion basis is also presented. For totally unimodular matrices, it is proven to run in time O((nm)^2) where m=m'-n. The last result concerns specific network matrices. We give a characterization of nonnegative {r,s}-noncorelated network matrices, where r and s are two given row indices. This also yields a polynomial recognition algorithm for these matrices. Comment: 183 pages.
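The factorization A = B^{-1} N can be made concrete in the simpler directed-graph case, where A is an ordinary network matrix. The sketch below is an illustration only: the three-node graph, the sign convention (tail -1, head +1), and the variable names are assumptions, and it does not implement the recognition algorithm of the thesis or handle bidirected graphs.

```python
import numpy as np

# Hypothetical digraph on nodes 1, 2, 3 with edges
#   e1: 1 -> 2, e2: 2 -> 3 (spanning tree), e3: 1 -> 3 (non-tree edge).
# Node-edge incidence matrix with tail = -1, head = +1; the row of node 3
# is removed so that the matrix has full row rank.
B = np.array([[-1.0,  0.0],   # node 1: columns e1, e2
              [ 1.0, -1.0]])  # node 2
N = np.array([[-1.0],         # node 1: column e3
              [ 0.0]])        # node 2

# A = B^{-1} N, computed by solving B A = N instead of forming the inverse.
A = np.linalg.solve(B, N)
print(A)
```

Each column of A expresses a non-tree edge as a combination of tree edges: here e3 corresponds to traversing e1 and then e2, both in the forward direction.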

    Learning human actions by combining global dynamics and local appearance

    In this paper, we address the problem of human action recognition through combining global temporal dynamics and local visual spatio-temporal appearance features. For this purpose, in the global temporal dimension, we propose to model the motion dynamics with robust linear dynamical systems (LDSs) and use the model parameters as motion descriptors. Since LDSs live in a non-Euclidean space and the descriptors are in non-vector form, we propose a shift-invariant distance based on subspace angles to measure the similarity between LDSs. In the local visual dimension, we construct curved spatio-temporal cuboids along the trajectories of densely sampled feature points and describe them using histograms of oriented gradients (HOG). The distance between motion sequences is computed with the Chi-Squared histogram distance in the bag-of-words framework. Finally, we perform classification using the maximum margin distance learning method by combining the global dynamic distances and the local visual distances. We evaluate our approach for action recognition on five short-clip data sets, namely Weizmann, KTH, UCF sports, Hollywood2 and UCF50, as well as three long continuous data sets, namely VIRAT, ADL and CRIM13. We show competitive results as compared with current state-of-the-art methods.
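The Chi-Squared histogram distance mentioned above can be sketched as follows. Conventions differ on the leading factor; the 1/2 used here, the function name, and the toy histograms are assumptions made for the example, not details taken from the paper.

```python
import numpy as np

def chi2_distance(h1, h2):
    """Chi-squared distance: 0.5 * sum((h1 - h2)^2 / (h1 + h2)), skipping empty bins."""
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    denom = h1 + h2
    mask = denom > 0  # bins that are empty in both histograms contribute nothing
    return 0.5 * np.sum((h1[mask] - h2[mask]) ** 2 / denom[mask])

# Hypothetical normalized bag-of-words histograms over a 4-word vocabulary.
a = [0.5, 0.5, 0.0, 0.0]
b = [0.0, 0.0, 0.5, 0.5]
d_same = chi2_distance(a, a)   # identical histograms
d_disj = chi2_distance(a, b)   # histograms with disjoint support
```

For histograms that each sum to one, this version of the distance lies between 0 (identical) and 1 (disjoint support).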

    Human Interaction Recognition with Audio and Visual Cues

    The automated recognition of human activities from video is a fundamental problem with applications in several areas, ranging from video surveillance and robotics to smart healthcare and multimedia indexing and retrieval, just to mention a few. However, the pervasive diffusion of cameras capable of recording audio also makes available to those applications a complementary modality. Despite the sizable progress made in the area of modeling and recognizing group activities, and actions performed by people in isolation from video, the availability of audio cues has rarely been leveraged. This is even more so in the area of modeling and recognizing binary interactions between humans, where even the use of video has been limited. This thesis introduces a modeling framework for binary human interactions based on audio and visual cues. The main idea is to describe an interaction with a spatio-temporal trajectory modeling the visual motion cues, and a temporal trajectory modeling the audio cues. This poses the problem of how to fuse temporal trajectories from multiple modalities for the purpose of recognition. We propose a solution whereby trajectories are modeled as the output of kernel state space models. We then developed kernel-based methods for the audio-visual fusion that act at the feature level, as well as at the kernel level, by exploiting multiple kernel learning techniques. The approaches have been extensively tested and evaluated with a dataset made of videos obtained from TV shows and Hollywood movies, containing five different interactions. The results show the promise of this approach by producing a significant improvement of the recognition rate when audio cues are exploited, clearly setting the state-of-the-art in this particular application.

    Online Geometric Human Interaction Segmentation and Recognition

    The goal of this work is the temporal localization and recognition of binary people interactions in video. Human-human interaction detection is one of the core problems in video analysis. It has many applications such as in video surveillance, video search and retrieval, human-computer interaction, and behavior analysis for safety and security. Despite the sizeable literature in the area of activity and action modeling and recognition, the vast majority of the approaches make the assumption that the beginning and the end of the video portion containing the action or the activity of interest are known. In other words, while a significant effort has been placed on the recognition, the spatial and temporal localization of activities, i.e. the detection problem, has received considerably less attention. Even more so, if the detection has to be made in an online fashion, as opposed to offline. Nearly all state-of-the-art methods assume the offline setting, which makes them intrinsically unsuited for real-time processing. In this thesis, the problem of event localization and recognition is addressed in an online fashion. The main assumption is that an interaction, or an activity, is modeled by a temporal sequence. One of the main challenges is the development of a modeling framework able to capture the complex variability of activities, described by high dimensional features. This is addressed by the combination of linear models with kernel methods. In particular, the parity space theory for detection, based on Euclidean geometry, is augmented to be able to work with kernels, through the use of geometric operators in Hilbert space. While this approach is general, here it is applied to the detection of human interactions. It is tested on a publicly available dataset and on a large and challenging, newly collected dataset. Extensive testing of the approach indicates that it sets a new state-of-the-art under several performance measures, and that it holds the promise to become an effective building block for the real-time analysis of human behavior from video.

    Modeling and Recognizing Binary Human Interactions

    Recognizing human activities from video is an important step forward towards the long-term goal of performing scene understanding fully automatically. Applications in this domain include, but are not limited to, the automated analysis of video surveillance footage for public and private monitoring, remote patient and elderly home monitoring, video archiving, search and retrieval, human-computer interaction, and robotics. While recent years have seen a concentration of works focusing on modeling and recognizing either group activities, or actions performed by people in isolation, modeling and recognizing binary human-human interactions is a fundamental building block that only recently has started to catalyze the attention of researchers. This thesis introduces a new modeling framework for binary human-human interactions. The main idea is to describe interactions with spatio-temporal trajectories. Interaction trajectories can then be modeled as the output of dynamical systems, and recognizing interactions entails designing a suitable way for comparing them. This poses several challenges, starting from the type of information that should be captured by the trajectories, which defines the geometric structure of the output space of the systems. In addition, decision functions performing the recognition should account for the fact that the people interacting do not have a predefined ordering. This work addresses those challenges by carefully designing a kernel-based approach that combines non-linear dynamical system modeling with kernel PCA. Experimental results computed on three recently published datasets clearly show the promise of this approach, where the classification accuracy and the retrieval precision are comparable to or better than the state-of-the-art.

    Gerontological Intelligence Test

    The current study was designed as a preliminary analysis to design an alternative intelligence scale for older adults ages 65 plus. This study was predominantly administered to White participants, with females being the predominant gender (30 females, 14 males). The 44 participants were administered the four subtests Analogies, Matrices, Geometric Shapes and Information. The Block Design and Vocabulary subtests from the Wechsler Adult Intelligence Scale were administered to assess the validity of the current study. By creating a more tailored intelligence test for older adults, problems such as fatigue, administrator bias and physical limitations can be addressed. With the population of older adults increasing, there is more of a demand for age-specific intelligence tests. The results section of this study was able to identify item difficulty and eliminate items that did not provide adequate representation of that particular subtest.

    Average characteristic polynomials for multiple orthogonal polynomial ensembles

    Multiple orthogonal polynomials (MOP) are a non-definite version of matrix orthogonal polynomials. They are described by a Riemann-Hilbert matrix Y consisting of four blocks Y_{1,1}, Y_{1,2}, Y_{2,1} and Y_{2,2}. In this paper, we show that det Y_{1,1} (det Y_{2,2}) equals the average characteristic polynomial (average inverse characteristic polynomial, respectively) over the probabilistic ensemble that is associated to the MOP. In this way we generalize classical results for orthogonal polynomials, and also some recent results for MOP of type I and type II. We then extend our results to arbitrary products and ratios of characteristic polynomials. In the latter case an important role is played by a matrix-valued version of the Christoffel-Darboux kernel. Our proofs use determinantal identities involving Schur complements, and adaptations of the classical results by Heine, Christoffel and Uvarov. Comment: 32 pages.
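For orientation, the classical result being generalized can be written out in the scalar case. The notation below (weight w, points x_j, normalization Z_n) is an assumption chosen for this illustration and does not come from the abstract. In that setting the block Y_{1,1} reduces to the monic orthogonal polynomial P_n of degree n, and Heine's formula states that it is the average characteristic polynomial of the associated orthogonal polynomial ensemble:

```latex
P_n(z) \;=\; \mathbb{E}\Bigl[\,\prod_{j=1}^{n} (z - x_j)\Bigr],
\qquad
(x_1,\dots,x_n) \;\sim\; \frac{1}{Z_n}
\prod_{1 \le i < j \le n} (x_i - x_j)^2 \,\prod_{j=1}^{n} w(x_j)\,\mathrm{d}x_j .
```

Here Z_n is the constant that normalizes the density to total mass one.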