
    Time-Sensitive Topic Models for Action Recognition in Videos

    In this paper, we postulate that temporal information is important for action recognition in videos. To keep this temporal information, videos are represented as word×time documents. We propose to use time-sensitive probabilistic topic models and extend them to the context of supervised learning. Our time-sensitive approach is compared to both PLSA and Bag-of-Words. It is shown both to capture semantics from the data and to yield classification performance comparable to other methods, outperforming them when the amount of training data is low.
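    The word×time representation described above can be sketched as follows. This is a minimal illustration, not the authors' code; it assumes local features have already been quantized into visual-word ids and carry the frame index at which they were extracted, and the choice of 20 temporal bins is arbitrary.

    # Minimal sketch: representing a video as a word-by-time count matrix,
    # the document form consumed by a time-sensitive topic model.
    import numpy as np

    def word_time_document(word_ids, frame_ids, vocab_size, n_frames, n_time_bins=20):
        """Accumulate visual-word counts into a (vocab_size x n_time_bins) matrix."""
        doc = np.zeros((vocab_size, n_time_bins), dtype=np.int32)
        # Map each frame index to a coarse temporal bin so videos of different
        # lengths share the same document shape.
        bins = np.minimum((np.asarray(frame_ids) * n_time_bins) // max(n_frames, 1),
                          n_time_bins - 1)
        for w, t in zip(word_ids, bins):
            doc[w, t] += 1
        return doc

    # Example: 5 quantized features observed in a 100-frame clip.
    doc = word_time_document(word_ids=[3, 7, 3, 1, 7],
                             frame_ids=[2, 15, 40, 41, 99],
                             vocab_size=10, n_frames=100)
    print(doc.shape)  # (10, 20)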

    Latent Semantic Learning with Structured Sparse Representation for Human Action Recognition

    This paper proposes a novel latent semantic learning method for extracting high-level features (i.e. latent semantics) from a large vocabulary of abundant mid-level features (i.e. visual keywords) with structured sparse representation, which can help to bridge the semantic gap in the challenging task of human action recognition. To discover the manifold structure of mid-level features, we develop a spectral embedding approach to latent semantic learning based on an L1-graph, without the need to tune any parameter for graph construction, a key step of manifold learning. More importantly, we construct the L1-graph with structured sparse representation, which can be obtained by structured sparse coding with its structured sparsity ensured by novel L1-norm hypergraph regularization over mid-level features. In the new embedding space, we learn latent semantics automatically from abundant mid-level features through spectral clustering. The learnt latent semantics can be readily used for human action recognition with an SVM by defining a histogram intersection kernel. Different from traditional latent semantic analysis based on topic models, our latent semantic learning method can explore the manifold structure of mid-level features in both L1-graph construction and spectral embedding, which results in compact but discriminative high-level features. The experimental results on the commonly used KTH action dataset and the unconstrained YouTube action dataset show the superior performance of our method. Comment: The short version of this paper appears in ICCV 201
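    To make the last two stages of this pipeline concrete, the sketch below shows a simplified stand-in (not the paper's implementation): spectral clustering over a precomputed affinity matrix taking the place of the L1-graph weights, followed by an SVM with a histogram intersection kernel over latent-semantic histograms. The names affinity, X_train, X_test and n_latent are placeholders.

    # Rough sketch of (1) spectral clustering on a precomputed affinity graph and
    # (2) an SVM with a histogram intersection kernel; toy data only.
    import numpy as np
    from sklearn.cluster import SpectralClustering
    from sklearn.svm import SVC

    def histogram_intersection(A, B):
        """K[i, j] = sum_k min(A[i, k], B[j, k])."""
        return np.minimum(A[:, None, :], B[None, :, :]).sum(axis=2)

    def latent_semantics(affinity, n_latent=50):
        # Spectral clustering groups correlated mid-level words into latent topics.
        sc = SpectralClustering(n_clusters=n_latent, affinity='precomputed',
                                random_state=0)
        return sc.fit_predict(affinity)  # cluster id per mid-level visual word

    def train_svm(X_train, y_train):
        K = histogram_intersection(X_train, X_train)
        return SVC(kernel='precomputed').fit(K, y_train)

    def predict(clf, X_train, X_test):
        # Kernel between test and training histograms, shape (n_test, n_train).
        return clf.predict(histogram_intersection(X_test, X_train))

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        W = rng.random((40, 40)); W = (W + W.T) / 2   # toy symmetric affinity
        print(latent_semantics(W, n_latent=5)[:10])   # cluster id per word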

    Trespassing the Boundaries: Labeling Temporal Bounds for Object Interactions in Egocentric Video

    Manual annotations of temporal bounds for object interactions (i.e. start and end times) are typical training input to recognition, localization and detection algorithms. For three publicly available egocentric datasets, we uncover inconsistencies in ground truth temporal bounds within and across annotators and datasets. We systematically assess the robustness of state-of-the-art approaches to changes in labeled temporal bounds for object interaction recognition. As boundaries are trespassed, a drop of up to 10% is observed for both Improved Dense Trajectories and a Two-Stream Convolutional Neural Network. We demonstrate that such disagreement stems from a limited understanding of the distinct phases of an action, and propose annotating based on the Rubicon Boundaries, inspired by a similarly named cognitive model, for consistent temporal bounds of object interactions. Evaluated on a public dataset, we report a 4% increase in overall accuracy, and an increase in accuracy for 55% of classes, when Rubicon Boundaries are used for temporal annotations. Comment: ICCV 201
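    The kind of annotator disagreement measured here can be illustrated with a temporal IoU check between two annotators' bounds for the same interaction; the sketch below is illustrative only (the 0.5 threshold and the segment times are invented, not taken from the paper).

    # Quantify agreement between two annotators' (start, end) bounds via temporal IoU.
    def temporal_iou(a, b):
        """a, b are (start, end) times in seconds; returns intersection over union."""
        inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
        union = max(a[1], b[1]) - min(a[0], b[0])
        return inter / union if union > 0 else 0.0

    annotator_1 = [(12.0, 15.5), (40.2, 44.0)]
    annotator_2 = [(13.1, 16.0), (41.0, 42.5)]

    for seg_a, seg_b in zip(annotator_1, annotator_2):
        iou = temporal_iou(seg_a, seg_b)
        print(f"IoU = {iou:.2f} -> {'consistent' if iou >= 0.5 else 'inconsistent'}")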

    First impressions: A survey on vision-based apparent personality trait analysis

    Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. Over the past few years, it has also become an attractive research area in visual computing. From a computational point of view, speech and text have by far been the most studied cues for analyzing personality. Recently, however, there has been increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, we review aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field.

    Automatic Understanding of Image and Video Advertisements

    There is more to images than their objective physical content: for example, advertisements are created to persuade a viewer to take a certain action. We propose the novel problem of automatic advertisement understanding. To enable research on this problem, we create two datasets: an image dataset of 64,832 image ads, and a video dataset of 3,477 ads. Our data contains rich annotations encompassing the topic and sentiment of the ads, questions and answers describing what actions the viewer is prompted to take and the reasoning that the ad presents to persuade the viewer ("What should I do according to this ad, and why should I do it?"), and symbolic references ads make (e.g. a dove symbolizes peace). We also analyze the most common persuasive strategies ads use, and the capabilities that computer vision systems should have to understand these strategies. We present baseline classification results for several prediction tasks, including automatically answering questions about the messages of the ads. Comment: To appear in CVPR 2017; data available on http://cs.pitt.edu/~kovashka/ad
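    As a rough illustration of the kind of baseline classification task mentioned above (not the authors' models), the sketch below trains a linear classifier over precomputed per-image features to predict an ad's topic label; the feature matrix, label array, and class count are synthetic placeholders.

    # Hedged baseline sketch: linear classifier over precomputed image features.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Synthetic stand-ins: in practice these would be features extracted from the
    # ad images (e.g. by a pretrained CNN) and the dataset's topic annotations.
    rng = np.random.default_rng(0)
    features = rng.normal(size=(1000, 512))   # one 512-d feature vector per image
    topics = rng.integers(0, 30, size=1000)   # placeholder topic labels (class count arbitrary)

    X_tr, X_te, y_tr, y_te = train_test_split(features, topics,
                                              test_size=0.2, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("topic accuracy:", clf.score(X_te, y_te))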