35,283 research outputs found
Learning directional co-occurrence for human action classification
Spatio-temporal interest point (STIP) based methods have shown promising results for human action classification. However, state-of-art works typically utilize bag-of-visual words (BoVW), which focuses on the statistical distribution of features but ignores their inherent structural relationships. To solve this problem, a descriptor, namely directional pairwise feature (DPF), is proposed to encode the mutual direction information between pairwise words, aiming at adding more spatial discriminant to BoVW. Firstly, STIP features are extracted and classified into a set of labeled words. Then in each frame, the DPF is constructed for every pair of words with different labels, according to their assigned directional vector. Finally, DPFs are quantized to be a probability histogram as a representation of human action. The proposed method is evaluated on two challenging datasets, Rochester and UT-interaction, and the results based on chi-squared kernel SVM classifiers confirm that our method can classify human actions with high accuracies.AcousticsEngineering, Electrical & ElectronicEICPCI-S(ISTP)
Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
Every moment counts in action recognition. A comprehensive understanding of
human activity in video requires labeling every frame according to the actions
occurring, placing multiple labels densely over a video sequence. To study this
problem we extend the existing THUMOS dataset and introduce MultiTHUMOS, a new
dataset of dense labels over unconstrained internet videos. Modeling multiple,
dense labels benefits from temporal relations within and across classes. We
define a novel variant of long short-term memory (LSTM) deep networks for
modeling these temporal relations via multiple input and output connections. We
show that this model improves action labeling accuracy and further enables
deeper understanding tasks ranging from structured retrieval to action
prediction.Comment: To appear in IJC
Modeling geometric-temporal context with directional pyramid co-occurrence for action recognition
In this paper, we present a new geometric-temporal representation for visual action recognition based on local spatio-temporal features. First, we propose a modified covariance descriptor under the log-Euclidean Riemannian metric to represent the spatio-temporal cuboids detected in the video sequences. Compared with previously proposed covariance descriptors, our descriptor can be measured and clustered in Euclidian space. Second, to capture the geometric-temporal contextual information, we construct a directional pyramid co-occurrence matrix (DPCM) to describe the spatio-temporal distribution of the vector-quantized local feature descriptors extracted from a video. DPCM characterizes the co-occurrence statistics of local features as well as the spatio-temporal positional relationships among the concurrent features. These statistics provide strong descriptive power for action recognition. To use DPCM for action recognition, we propose a directional pyramid co-occurrence matching kernel to measure the similarity of videos. The proposed method achieves the state-of-the-art performance and improves on the recognition performance of the bag-of-visual-words (BOVWs) models by a large margin on six public data sets. For example, on the KTH data set, it achieves 98.78% accuracy while the BOVW approach only achieves 88.06%. On both Weizmann and UCF CIL data sets, the highest possible accuracy of 100% is achieved
Learning to Predict Charges for Criminal Cases with Legal Basis
The charge prediction task is to determine appropriate charges for a given
case, which is helpful for legal assistant systems where the user input is fact
description. We argue that relevant law articles play an important role in this
task, and therefore propose an attention-based neural network method to jointly
model the charge prediction task and the relevant article extraction task in a
unified framework. The experimental results show that, besides providing legal
basis, the relevant articles can also clearly improve the charge prediction
results, and our full model can effectively predict appropriate charges for
cases with different expression styles.Comment: 10 pages, accepted by EMNLP 201
An Efficient Method for online Detection of Polychronous Patterns in Spiking Neural Network
Polychronous neural groups are effective structures for the recognition of
precise spike-timing patterns but the detection method is an inefficient
multi-stage brute force process that works off-line on pre-recorded simulation
data. This work presents a new model of polychronous patterns that can capture
precise sequences of spikes directly in the neural simulation. In this scheme,
each neuron is assigned a randomized code that is used to tag the post-synaptic
neurons whenever a spike is transmitted. This creates a polychronous code that
preserves the order of pre-synaptic activity and can be registered in a hash
table when the post-synaptic neuron spikes. A polychronous code is a
sub-component of a polychronous group that will occur, along with others, when
the group is active. We demonstrate the representational and pattern
recognition ability of polychronous codes on a direction selective visual task
involving moving bars that is typical of a computation performed by simple
cells in the cortex. The computational efficiency of the proposed algorithm far
exceeds existing polychronous group detection methods and is well suited for
online detection.Comment: 17 pages, 8 figure
An Efficient Dual Approach to Distance Metric Learning
Distance metric learning is of fundamental interest in machine learning
because the distance metric employed can significantly affect the performance
of many learning methods. Quadratic Mahalanobis metric learning is a popular
approach to the problem, but typically requires solving a semidefinite
programming (SDP) problem, which is computationally expensive. Standard
interior-point SDP solvers typically have a complexity of (with
the dimension of input data), and can thus only practically solve problems
exhibiting less than a few thousand variables. Since the number of variables is
, this implies a limit upon the size of problem that can
practically be solved of around a few hundred dimensions. The complexity of the
popular quadratic Mahalanobis metric learning approach thus limits the size of
problem to which metric learning can be applied. Here we propose a
significantly more efficient approach to the metric learning problem based on
the Lagrange dual formulation of the problem. The proposed formulation is much
simpler to implement, and therefore allows much larger Mahalanobis metric
learning problems to be solved. The time complexity of the proposed method is
, which is significantly lower than that of the SDP approach.
Experiments on a variety of datasets demonstrate that the proposed method
achieves an accuracy comparable to the state-of-the-art, but is applicable to
significantly larger problems. We also show that the proposed method can be
applied to solve more general Frobenius-norm regularized SDP problems
approximately
- …