232 research outputs found

    Improving the Accuracy of Action Classification Using View-Dependent Context Information

    Proceedings of: 6th International Conference, HAIS 2011, Wroclaw, Poland, May 23-25, 2011. This paper presents a human action recognition system that decomposes the task into two subtasks. First, a view-independent classifier, shared across the multiple views to be analyzed, is applied to obtain an initial guess of the posterior distribution of the performed action. Then, this posterior distribution is combined with view-based knowledge to improve the action classification. This allows the view-independent component to be reused when a new view has to be analyzed, so that only the view-dependent knowledge needs to be specified. An example of the application of the system to a smart home domain is discussed. This work was supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.

    From Traditional to Modern: Domain Adaptation for Action Classification in Short Social Video Clips

    Short internet video clips like vines present a significantly wild distribution compared to traditional video datasets. In this paper, we focus on the problem of unsupervised action classification in wild vines using traditional labeled datasets. To this end, we use a simple domain adaptation strategy based on data augmentation. We utilise the semantic word2vec space as a common subspace in which to embed video features from both the labeled source domain and the unlabelled target domain. Our method incrementally augments the labeled source with target samples and iteratively modifies the embedding function to bring the source and target distributions together. Additionally, we utilise a multi-modal representation that incorporates the noisy semantic information available in the form of hash-tags. We show the effectiveness of this simple adaptation technique on a test set of vines and achieve notable improvements in performance. Comment: 9 pages, GCPR, 201

    Fusion of Single View Soft k-NN Classifiers for Multicamera Human Action Recognition

    Proceedings of: 5th International Conference on Hybrid Artificial Intelligence Systems (HAIS 2010). San Sebastián, Spain, June 23-25, 2010. This paper presents two different classifier fusion algorithms applied in the domain of Human Action Recognition from video. A set of cameras observes a person performing an action from a predefined set. For each camera view a 2D descriptor is computed and a posterior on the performed activity is obtained using a soft classifier. These posteriors are combined using voting and a Bayesian network to obtain a single belief measure to use for the final decision on the performed action. Experiments are conducted with different low-level frame descriptors on the IXMAS dataset, achieving results comparable to state-of-the-art 3D proposals while performing only 2D processing. This work was supported in part by Projects CICYT TIN2008-06742-C02-02/TSI, CICYT TEC2008-06732-C02-02/TEC, CAM CONTEXTS (S2009/TIC-1485) and DPS2008-07029-C02-02.
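
The voting step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `soft_vote` and `majority_vote`, the averaging (sum) rule, and the tie-breaking rule are assumptions made for illustration, and the Bayesian network variant is not shown.

```python
def soft_vote(posteriors):
    """Fuse per-camera posteriors by averaging them (sum rule, an assumption).

    posteriors: list of per-camera lists, each a distribution over the same classes.
    Returns (index of the winning class, fused distribution).
    """
    n_classes = len(posteriors[0])
    fused = [sum(p[c] for p in posteriors) / len(posteriors)
             for c in range(n_classes)]
    best = max(range(n_classes), key=fused.__getitem__)
    return best, fused


def majority_vote(posteriors):
    """Each camera votes for its most probable class; ties go to the lowest index."""
    n_classes = len(posteriors[0])
    votes = [0] * n_classes
    for p in posteriors:
        votes[max(range(n_classes), key=p.__getitem__)] += 1
    return max(range(n_classes), key=votes.__getitem__)


# Hypothetical posteriors from three cameras over three actions.
cameras = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.5, 0.4, 0.1],
]
label, fused = soft_vote(cameras)
print(label, fused)
print(majority_vote(cameras))
```

Averaging keeps the fused vector a valid distribution, so it can serve directly as the single belief measure the abstract mentions; hard majority voting discards each camera's confidence and is included only for comparison.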

    Comparative Evaluation of Action Recognition Methods via Riemannian Manifolds, Fisher Vectors and GMMs: Ideal and Challenging Conditions

    We present a comparative evaluation of various techniques for action recognition while keeping as many variables as possible controlled. We employ two categories of Riemannian manifolds: symmetric positive definite matrices and linear subspaces. For both categories we use their corresponding nearest neighbour classifiers, kernels, and recent kernelised sparse representations. We compare against traditional action recognition techniques based on Gaussian mixture models and Fisher vectors (FVs). We evaluate these action recognition techniques under ideal conditions, as well as their sensitivity in more challenging conditions (variations in scale and translation). Despite recent advancements for handling manifolds, manifold-based techniques obtain the lowest performance and their kernel representations are more unstable in the presence of challenging conditions. The FV approach obtains the highest accuracy under ideal conditions. Moreover, FV deals best with moderate scale and translation changes.

    Survey on Vision-based Path Prediction

    Path prediction is a fundamental task for estimating how pedestrians or vehicles are going to move in a scene. Because path prediction as a computer vision task uses video as input, the various kinds of information used for prediction, such as the environment surrounding the target and the internal state of the target, need to be estimated from the video in addition to predicting paths. Many prediction approaches that include understanding the environment and the internal state have been proposed. In this survey, we systematically summarize methods of path prediction that take video as input and extract features from the video. Moreover, we introduce datasets used to evaluate path prediction methods quantitatively. Comment: DAPI 201

    Multicamera Action Recognition with Canonical Correlation Analysis and Discriminative Sequence Classification

    Proceedings of: 4th International Work-Conference on the Interplay Between Natural and Artificial Computation, IWINAC 2011, La Palma, Canary Islands, Spain, May 30 - June 3, 2011. This paper presents a feature fusion approach to the recognition of human actions from multiple cameras that avoids the computation of the 3D visual hull. Action descriptors are extracted for each of the available camera views and projected into a common subspace that maximizes the correlation between the components of the projections. That common subspace is learned using Probabilistic Canonical Correlation Analysis. The action classification is performed in that subspace using a discriminative classifier. Results of the proposed method are shown for the classification of the IXMAS dataset.

    Designing a topological algorithm for 3D activity recognition

    Voxel carving is a non-invasive and low-cost technique used to reconstruct a 3D volume from images captured by a set of cameras placed around the object of interest. In this paper we propose a method to topologically analyze a video sequence of 3D reconstructions representing a tennis player performing different forehand and backhand strokes, with the aim of providing an approach that could be useful in other sport activities.

    Human Action Recognition Based on Temporal Pyramid of Key Poses Using RGB-D Sensors

    Human action recognition is a hot research topic in computer vision, mainly due to the high number of related applications, such as surveillance, human-computer interaction, or assisted living. Low-cost RGB-D sensors have been extensively used in this field. They can provide skeleton joints, which are a compact and effective representation of the human posture. This work proposes an algorithm for human action recognition where the features are computed from skeleton joints. A sequence of skeleton features is represented as a set of key poses, from which histograms are extracted. The temporal structure of the sequence is kept using a temporal pyramid of key poses. Finally, a multi-class SVM performs the classification task. Optimizing the algorithm through evolutionary computation allows it to reach results comparable to the state of the art on the MSR Action3D dataset. This work was supported by a STSM Grant from COST Action IC1303 AAPELE - Architectures, Algorithms and Platforms for Enhanced Living Environments.
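
As a rough illustration of the temporal pyramid idea in the abstract above, the sketch below builds a feature vector from a sequence of key-pose labels. It assumes each frame has already been assigned to a key pose (for example by clustering), that level l splits the sequence into 2^l equal segments, and that histograms are normalized per segment; none of these details are confirmed by the abstract, and `temporal_pyramid` is a hypothetical name.

```python
def temporal_pyramid(key_pose_ids, n_key_poses, levels):
    """Concatenate key-pose histograms over a temporal pyramid.

    key_pose_ids: per-frame key-pose index in [0, n_key_poses).
    levels: number of pyramid levels; level l uses 2**l segments (assumed).
    """
    features = []
    n = len(key_pose_ids)
    for level in range(levels):
        n_segments = 2 ** level
        for s in range(n_segments):
            # Integer boundaries that partition the sequence without gaps.
            start = s * n // n_segments
            end = (s + 1) * n // n_segments
            hist = [0.0] * n_key_poses
            for k in key_pose_ids[start:end]:
                hist[k] += 1.0
            length = end - start
            if length:  # leave an all-zero histogram for empty segments
                hist = [h / length for h in hist]
            features.extend(hist)
    return features


# Hypothetical 4-frame sequence over 3 key poses, 2 pyramid levels.
print(temporal_pyramid([0, 0, 1, 2], n_key_poses=3, levels=2))
```

The resulting vector has n_key_poses * (2**levels - 1) entries and could be passed to any multi-class classifier; the coarse level captures which poses occur, while finer levels retain when they occur.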

    A 3D Human Posture Approach for Activity Recognition Based on Depth Camera

    Human activity recognition plays an important role in the context of Ambient Assisted Living (AAL), providing useful tools to improve people's quality of life. This work presents an activity recognition algorithm based on the extraction of skeleton joints from a depth camera. The system describes an activity using a small set of basic postures extracted by means of the X-means clustering algorithm. A multi-class Support Vector Machine, trained with Sequential Minimal Optimization, is employed to perform the classification. The system is evaluated on two public datasets for activity recognition which have different skeleton models: CAD-60 with 15 joints and TST with 25 joints. The proposed approach achieves precision/recall performances of 99.8 % on CAD-60 and 97.2 %/91.7 % on TST. The results are promising for applied use in the context of AAL.

    Wave Functions, Quantum Diffusion, and Scaling Exponents in Golden-Mean Quasiperiodic Tilings

    We study the properties of wave functions and the wave-packet dynamics in quasiperiodic tight-binding models in one, two, and three dimensions. The atoms in the one-dimensional quasiperiodic chains are coupled by weak and strong bonds aligned according to the Fibonacci sequence. The associated d-dimensional quasiperiodic tilings are constructed from the direct product of d such chains, which yields either the hypercubic tiling or the labyrinth tiling. This approach allows us to consider rather large systems numerically. We show that the wave functions of the system are multifractal and that their properties can be related to the structure of the system in the regime of strong quasiperiodic modulation by a renormalization group (RG) approach. We also study the dynamics of wave packets to get information about the electronic transport properties. In particular, we investigate the scaling behaviour of the return probability of the wave packet with time. Applying the RG approach again, we show that in the regime of strong quasiperiodic modulation the return probability is governed by the underlying quasiperiodic structure. Further, we discuss lower bounds for the scaling exponent of the width of the wave packet and propose a modified lower bound for the absolute continuous regime. Comment: 25 pages, 13 figures