Compressive Sequential Learning for Action Similarity Labeling
Human action recognition in videos has been extensively studied in recent years due to its wide range of applications. Instead of classifying video sequences into a number of action categories, in this paper we focus on the particular problem of action similarity labeling (ASLAN), which aims at verifying whether two videos contain the same type of action. To address this challenge, a novel approach called compressive sequential learning (CSL) is proposed, which leverages compressive sensing theory and sequential learning. We first project data points to a low-dimensional space by exploiting an important property in compressive sensing: the restricted isometry property. In particular, a very sparse measurement matrix is adopted to reduce the dimensionality efficiently. We then learn an ensemble classifier for measuring similarities between pairs of videos by iteratively minimizing its empirical risk with the AdaBoost strategy on the training set. Unlike conventional AdaBoost, the weak learner for each iteration is not explicitly defined, and its parameters are learned through greedy optimization. Furthermore, a variant of CSL named compressive sequential encoding is developed as an encoding technique and followed by a linear classifier to address the similarity-labeling problem. Our method has been systematically evaluated on four action data sets (ASLAN, KTH, HMDB51, and Hollywood2), and the results show the effectiveness and superiority of our method for ASLAN.
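The dimensionality-reduction step in the abstract above can be illustrated with a small sketch. The abstract does not say how its very sparse measurement matrix is built, so the construction here is an assumption: entries are sqrt(s) times +1, 0, or -1 with probabilities 1/(2s), 1 - 1/s, and 1/(2s), with s = sqrt(d), as in the very sparse random projection literature. All function names are hypothetical, and the AdaBoost stage is not shown.

```python
import math
import random


def sparse_measurement_matrix(n_rows, n_cols, s, rng):
    """Random matrix with entries sqrt(s) * {+1, 0, -1}.

    Probabilities are 1/(2s), 1 - 1/s, 1/(2s). This construction is an
    assumption; the abstract does not specify its matrix.
    """
    scale = math.sqrt(s)
    matrix = []
    for _ in range(n_rows):
        row = []
        for _ in range(n_cols):
            u = rng.random()
            if u < 1.0 / (2 * s):
                row.append(scale)
            elif u < 1.0 / s:
                row.append(-scale)
            else:
                row.append(0.0)
        matrix.append(row)
    return matrix


def project(matrix, x):
    """Map x to len(matrix) dimensions, scaled so norms are preserved in expectation."""
    k = len(matrix)
    return [sum(r * v for r, v in zip(row, x)) / math.sqrt(k) for row in matrix]


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


rng = random.Random(0)
d, k = 400, 100        # original and reduced dimensionality (illustrative values)
s = math.sqrt(d)       # sparsity parameter: about 1 in 20 entries is nonzero
R = sparse_measurement_matrix(k, d, s, rng)

x = [rng.gauss(0, 1) for _ in range(d)]
y = [rng.gauss(0, 1) for _ in range(d)]

# Ratio of projected distance to original distance; values near 1 mean the
# pairwise distance is approximately preserved.
ratio = euclidean(project(R, x), project(R, y)) / euclidean(x, y)
print(round(ratio, 3))
```

Because most entries are zero, each projected coordinate touches only a small fraction of the input, which is what makes the reduction cheap.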
Multimodal Grounding for Language Processing
This survey discusses how recent developments in multimodal processing
facilitate conceptual grounding of language. We categorize the information flow
in multimodal processing with respect to cognitive models of human information
processing and analyze different methods for combining multimodal
representations. Based on this methodological inventory, we discuss the benefit
of multimodal grounding for a variety of language processing tasks and the
challenges that arise. We particularly focus on multimodal grounding of verbs,
which play a crucial role for the compositional power of language.
Comment: The paper has been published in the Proceedings of the 27th
International Conference on Computational Linguistics. Please refer to this
version for citations: https://www.aclweb.org/anthology/papers/C/C18/C18-1197
Pose Embeddings: A Deep Architecture for Learning to Match Human Poses
We present a method for learning an embedding that places images of humans in
similar poses nearby. This embedding can be used as a direct method of
comparing images based on human pose, avoiding potential challenges of
estimating body joint positions. Pose embedding learning is formulated under a
triplet-based distance criterion. A deep architecture is used to allow learning
of a representation capable of making distinctions between different poses.
Experiments on human pose matching and retrieval from video data demonstrate
the potential of the method.
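As a rough illustration of a triplet-based distance criterion, the sketch below uses a common hinge form: the anchor should be closer to an image in a similar pose (positive) than to one in a different pose (negative) by at least a margin. The abstract does not give the exact loss, so the squared Euclidean distance, the margin value, and the function names are assumptions, and the deep network that would produce the embeddings is omitted.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def triplet_hinge_loss(anchor, positive, negative, margin=1.0):
    """Zero when the negative is farther from the anchor than the positive by at least `margin`."""
    return max(
        0.0,
        squared_distance(anchor, positive)
        - squared_distance(anchor, negative)
        + margin,
    )


# Hypothetical 2-D embeddings, for illustration only.
anchor = [0.0, 0.0]
similar_pose = [0.1, 0.0]
far_pose = [2.0, 0.0]    # well separated: contributes no loss
near_pose = [0.5, 0.0]   # too close to the anchor: contributes a positive loss

print(triplet_hinge_loss(anchor, similar_pose, far_pose))
print(triplet_hinge_loss(anchor, similar_pose, near_pose))
```

Minimizing this quantity over many triplets pulls similar poses together and pushes dissimilar ones apart, without ever estimating joint positions.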
Adaptive Nonparametric Image Parsing
In this paper, we present an adaptive nonparametric solution to the image
parsing task, namely annotating each image pixel with its corresponding
category label. For a given test image, first, a locality-aware retrieval set
is extracted from the training data based on super-pixel matching similarities,
which are augmented with feature extraction for better differentiation of local
super-pixels. Then, the category of each super-pixel is initialized by the
majority vote of the k-nearest-neighbor super-pixels in the retrieval set.
Instead of fixing k as in traditional non-parametric approaches, here we
propose a novel adaptive nonparametric approach which determines the
sample-specific k for each test image. In particular, k is adaptively set to
be the number of the fewest nearest super-pixels which the images in the
retrieval set can use to get the best category prediction. Finally, the initial
super-pixel labels are further refined by contextual smoothing. Extensive
experiments on challenging datasets demonstrate the superiority of the new
solution over other state-of-the-art nonparametric solutions.
Comment: 11 pages
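The adaptive choice of k described in the abstract above can be sketched in a much simplified form. Here each super-pixel is reduced to a hypothetical feature vector with a label, and k is chosen as the smallest candidate that gives the best leave-one-out accuracy on the retrieval set. The leave-one-out protocol, the toy features, and the function names are assumptions; the retrieval-set construction and contextual smoothing are not shown.

```python
from collections import Counter


def knn_vote(query, samples, k):
    """Majority label among the k samples nearest to `query` (squared Euclidean)."""
    nearest = sorted(
        samples,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(query, s[0])),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def select_k(samples, candidate_ks):
    """Smallest k with the best leave-one-out accuracy on the retrieval set."""
    best_k, best_acc = None, -1.0
    for k in sorted(candidate_ks):
        correct = 0
        for i, (feat, label) in enumerate(samples):
            rest = samples[:i] + samples[i + 1:]
            if knn_vote(feat, rest, k) == label:
                correct += 1
        acc = correct / len(samples)
        if acc > best_acc:  # strict comparison keeps the smallest k on ties
            best_k, best_acc = k, acc
    return best_k


# Toy retrieval set: 1-D features, with one mislabeled "grass" sample
# sitting inside the "sky" cluster.
retrieval_set = [
    ([0.0], "sky"), ([0.1], "sky"), ([0.2], "sky"), ([0.3], "sky"),
    ([0.16], "grass"),
    ([1.0], "grass"), ([1.1], "grass"), ([1.2], "grass"), ([1.3], "grass"),
]

k = select_k(retrieval_set, [1, 3, 5])
label = knn_vote([0.12], retrieval_set, k)
print(k, label)
```

With k = 1 the noisy sample misleads its neighbors, so a larger k scores better on the retrieval set; among equally good candidates the smallest is kept, mirroring the "fewest nearest super-pixels" wording.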