Pooling-Invariant Image Feature Learning
Unsupervised dictionary learning has been a key component in state-of-the-art
computer vision recognition architectures. While highly effective methods exist
for patch-based dictionary learning, these methods may learn redundant features
after the pooling stage in a given early vision architecture. In this paper, we
offer a novel dictionary learning scheme to efficiently take into account the
invariance of learned features after the spatial pooling stage. The algorithm
is built on simple clustering, and thus enjoys efficiency and scalability. We
discuss the underlying mechanism that justifies the use of clustering
algorithms, and empirically show that the algorithm finds better dictionaries
than patch-based methods with the same dictionary size.
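
For orientation, the patch-based clustering baseline that the abstract compares against can be sketched as plain k-means over normalized patches. This is a minimal illustration only, not the pooling-aware scheme the paper proposes; the function name `kmeans_dictionary`, its parameters, and the normalization step are assumptions made for this example.

```python
import numpy as np

def kmeans_dictionary(patches, k, iters=20, seed=0):
    """Learn a k-atom dictionary from flattened patches with plain k-means.

    patches: array of shape (n_patches, patch_dim). iters must be >= 1.
    Hypothetical helper for illustration; not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    # Remove each patch's mean and scale to unit norm (a common, assumed preprocessing).
    x = patches - patches.mean(axis=1, keepdims=True)
    x /= np.linalg.norm(x, axis=1, keepdims=True) + 1e-8
    # Initialize centers from k distinct patches.
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared Euclidean distance from every patch to every center.
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Move each non-empty cluster's center to the mean of its members.
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers, labels
```

Each row of the returned `centers` array acts as one dictionary atom. Because the clustering sees patches only, nothing stops it from learning atoms that become redundant once their responses are spatially pooled, which is the weakness the abstract describes.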
Segmental Spatiotemporal CNNs for Fine-grained Action Segmentation
Joint segmentation and classification of fine-grained actions is important
for applications of human-robot interaction, video surveillance, and human
skill evaluation. However, despite substantial recent progress in large-scale
action classification, the performance of state-of-the-art fine-grained action
recognition approaches remains low. We propose a model for action segmentation
which combines low-level spatiotemporal features with a high-level segmental
classifier. Our spatiotemporal CNN comprises a spatial component that
uses convolutional filters to capture information about objects and their
relationships, and a temporal component that uses large 1D convolutional
filters to capture information about how object relationships change across
time. These features are used in tandem with a semi-Markov model that models
transitions from one action to another. We introduce an efficient constrained
segmental inference algorithm for this model that is orders of magnitude faster
than the current approach. We highlight the effectiveness of our Segmental
Spatiotemporal CNN on cooking and surgical action datasets for which we observe
substantially improved performance relative to recent baseline methods.
Comment: Updated from the ECCV 2016 version. We fixed an important
mathematical error and made the section on segmental inference clearer.
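
To make the idea of constrained segmental inference concrete, here is a generic dynamic-programming sketch. It assumes that the constraint is an upper bound on the number of segments and that a segment's score is the sum of its per-frame class scores plus a transition score; both are assumptions for illustration, since the abstract does not specify them. The name `segmental_decode` and its arguments are hypothetical.

```python
import numpy as np

def segmental_decode(scores, trans, max_segments):
    """Best labeling of T frames using at most max_segments segments.

    scores: (T, C) per-frame class scores.
    trans:  (C, C) score for switching from class i to class j.
    Generic sketch under stated assumptions; not the paper's exact algorithm.
    """
    T, C = scores.shape
    NEG = -np.inf
    # V[k, t, c]: best score for frames 0..t with k+1 segments, frame t labeled c.
    V = np.full((max_segments, T, C), NEG)
    back = np.zeros((max_segments, T, C), dtype=int)      # class of frame t-1
    new_seg = np.zeros((max_segments, T, C), dtype=bool)  # segment starts at t?
    V[0, 0] = scores[0]
    for t in range(1, T):
        for k in range(max_segments):
            for c in range(C):
                # Option 1: extend the current segment.
                best, prev, started = V[k, t - 1, c], c, False
                if k > 0:
                    # Option 2: start a new segment coming from a different class.
                    cand = V[k - 1, t - 1] + trans[:, c]
                    cand[c] = NEG
                    p = int(cand.argmax())
                    if cand[p] > best:
                        best, prev, started = cand[p], p, True
                V[k, t, c] = scores[t, c] + best
                back[k, t, c] = prev
                new_seg[k, t, c] = started
    # Pick the best final state over all segment counts, then backtrack.
    k, c = np.unravel_index(V[:, T - 1, :].argmax(), (max_segments, C))
    best_score = float(V[k, T - 1, c])
    labels = np.empty(T, dtype=int)
    for t in range(T - 1, -1, -1):
        labels[t] = c
        if t > 0:
            started = new_seg[k, t, c]
            c = back[k, t, c]
            if started:
                k -= 1
    return labels, best_score
```

With a small `max_segments`, an isolated frame that momentarily favors another class is absorbed into its surrounding segment rather than producing a spurious short segment. The cost of this sketch grows as O(max_segments · T · C²).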