35,283 research outputs found

    Learning directional co-occurrence for human action classification

    Get PDF
    Spatio-temporal interest point (STIP) based methods have shown promising results for human action classification. However, state-of-art works typically utilize bag-of-visual words (BoVW), which focuses on the statistical distribution of features but ignores their inherent structural relationships. To solve this problem, a descriptor, namely directional pairwise feature (DPF), is proposed to encode the mutual direction information between pairwise words, aiming at adding more spatial discriminant to BoVW. Firstly, STIP features are extracted and classified into a set of labeled words. Then in each frame, the DPF is constructed for every pair of words with different labels, according to their assigned directional vector. Finally, DPFs are quantized to be a probability histogram as a representation of human action. The proposed method is evaluated on two challenging datasets, Rochester and UT-interaction, and the results based on chi-squared kernel SVM classifiers confirm that our method can classify human actions with high accuracies.AcousticsEngineering, Electrical & ElectronicEICPCI-S(ISTP)

    Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos

    Full text link
    Every moment counts in action recognition. A comprehensive understanding of human activity in video requires labeling every frame according to the actions occurring, placing multiple labels densely over a video sequence. To study this problem we extend the existing THUMOS dataset and introduce MultiTHUMOS, a new dataset of dense labels over unconstrained internet videos. Modeling multiple, dense labels benefits from temporal relations within and across classes. We define a novel variant of long short-term memory (LSTM) deep networks for modeling these temporal relations via multiple input and output connections. We show that this model improves action labeling accuracy and further enables deeper understanding tasks ranging from structured retrieval to action prediction.Comment: To appear in IJC

    Modeling geometric-temporal context with directional pyramid co-occurrence for action recognition

    Get PDF
    In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in the video sequences. Compared with previously proposed covariance descriptors, our descriptor can be measured and clustered in Euclidian space. Second, to capture the geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves the state-of-the-art performance and improves on the recognition performance of the bag-of-visual-words (BOVWs) models by a large margin on six public data sets. For example, on the KTH data set, it achieves 98.78% accuracy while the BOVW approach only achieves 88.06%. On both Weizmann and UCF CIL data sets, the highest possible accuracy of 100% is achieved

    Learning to Predict Charges for Criminal Cases with Legal Basis

    Full text link
    The charge prediction task is to determine appropriate charges for a given case, which is helpful for legal assistant systems where the user input is fact description. We argue that relevant law articles play an important role in this task, and therefore propose an attention-based neural network method to jointly model the charge prediction task and the relevant article extraction task in a unified framework. The experimental results show that, besides providing legal basis, the relevant articles can also clearly improve the charge prediction results, and our full model can effectively predict appropriate charges for cases with different expression styles.Comment: 10 pages, accepted by EMNLP 201

    An Efficient Method for online Detection of Polychronous Patterns in Spiking Neural Network

    Get PDF
    Polychronous neural groups are effective structures for the recognition of precise spike-timing patterns but the detection method is an inefficient multi-stage brute force process that works off-line on pre-recorded simulation data. This work presents a new model of polychronous patterns that can capture precise sequences of spikes directly in the neural simulation. In this scheme, each neuron is assigned a randomized code that is used to tag the post-synaptic neurons whenever a spike is transmitted. This creates a polychronous code that preserves the order of pre-synaptic activity and can be registered in a hash table when the post-synaptic neuron spikes. A polychronous code is a sub-component of a polychronous group that will occur, along with others, when the group is active. We demonstrate the representational and pattern recognition ability of polychronous codes on a direction selective visual task involving moving bars that is typical of a computation performed by simple cells in the cortex. The computational efficiency of the proposed algorithm far exceeds existing polychronous group detection methods and is well suited for online detection.Comment: 17 pages, 8 figure

    An Efficient Dual Approach to Distance Metric Learning

    Full text link
    Distance metric learning is of fundamental interest in machine learning because the distance metric employed can significantly affect the performance of many learning methods. Quadratic Mahalanobis metric learning is a popular approach to the problem, but typically requires solving a semidefinite programming (SDP) problem, which is computationally expensive. Standard interior-point SDP solvers typically have a complexity of O(D6.5)O(D^{6.5}) (with DD the dimension of input data), and can thus only practically solve problems exhibiting less than a few thousand variables. Since the number of variables is D(D+1)/2D (D+1) / 2 , this implies a limit upon the size of problem that can practically be solved of around a few hundred dimensions. The complexity of the popular quadratic Mahalanobis metric learning approach thus limits the size of problem to which metric learning can be applied. Here we propose a significantly more efficient approach to the metric learning problem based on the Lagrange dual formulation of the problem. The proposed formulation is much simpler to implement, and therefore allows much larger Mahalanobis metric learning problems to be solved. The time complexity of the proposed method is O(D3)O (D ^ 3) , which is significantly lower than that of the SDP approach. Experiments on a variety of datasets demonstrate that the proposed method achieves an accuracy comparable to the state-of-the-art, but is applicable to significantly larger problems. We also show that the proposed method can be applied to solve more general Frobenius-norm regularized SDP problems approximately
    • …
    corecore