
533 research outputs found

User-aware Video Coding Based on Semantic Video Understanding and Enhancing

Author: Chia-Hu Chang
Yu-Tzu Lin
Publication venue: 'IntechOpen'
Publication date: 05/07/2011

Evaluating Two-Stream CNN for Video Classification

Author: Jain M.
Ji S.
Krizhevsky A.
LeCun Y.
Mikolov T.
Peng X.
Schmidhuber J.
Simonyan K.
Socher R.
Soomro K.
Sutskever I.
Szegedy C.
Venugopalan S.
Ye G.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/04/2015

Videos contain very rich semantic information. Traditional hand-crafted features are known to be inadequate in analyzing complex video semantics. Inspired by the huge success of the deep learning methods in analyzing image, audio and text data, significant efforts are recently being devoted to the design of deep nets for video analytics. Among the many practical needs, classifying videos (or video clips) based on their major semantic categories (e.g., "skiing") is useful in many applications. In this paper, we conduct an in-depth study to investigate important implementation options that may affect the performance of deep nets on video classification. Our evaluations are conducted on top of a recent two-stream convolutional neural network (CNN) pipeline, which uses both static frames and motion optical flows, and has demonstrated competitive performance against the state-of-the-art methods. In order to gain insights and to arrive at a practical guideline, many important options are studied, including network architectures, model fusion, learning parameters and the final prediction methods. Based on the evaluations, very competitive results are attained on two popular video classification benchmarks. We hope that the discussions and conclusions from this work can help researchers in related fields to quickly set up a good basis for further investigations along this very promising direction.
Comment: ACM ICMR'1

arXiv.org e-Print Archive

Crossref

Effective multimedia event analysis in large-scale videos

Author: Yu Litao
Publication venue: 'University of Queensland Library'
Publication date: 16/08/2016

University of Queensland eSpace

Language Grounding in Massive Online Data

Author: Chen Jianfu
Publication date: 01/01/2015

PhilPapers

Combining the Right Features for Complex Event Recognition

Author: Bangpeng Yao
Daphne Koller
Kevin Tang
Li Fei-fei
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

In this paper, we tackle the problem of combining features extracted from video for complex event recognition. Feature combination is an especially relevant task in video data, as there are many features we can extract, ranging from image features computed from individual frames to video features that take temporal information into account. To combine features effectively, we propose a method that is able to be selective of different subsets of features, as some features or feature combinations may be uninformative for certain classes. We introduce a hierarchical method for combining features based on the AND/OR graph structure, where nodes in the graph represent combinations of different sets of features. Our method automatically learns the structure of the AND/OR graph using score-based structure learning, and we introduce an inference procedure that is able to efficiently compute structure scores. We present promising results and analysis on th

CiteSeerX

Crossref