Movies Tags Extraction Using Deep Learning
Retrieving information from movies is becoming increasingly demanding due to the enormous amount of multimedia data generated each day. Not only does it help in efficient search, archiving and classification of movies, but it is also instrumental in content censorship and recommendation systems. Extracting key information from a movie and summarizing it in a few tags that best describe the movie presents a distinct challenge and requires an intelligent approach to automatically analyze the movie. In this paper, we formulate movie tag extraction as a machine learning classification problem and train a Convolutional Neural Network (CNN) on a carefully constructed tag vocabulary. Our proposed technique first extracts key frames from a movie and applies the trained classifier to the key frames. The predictions from the classifier are assigned scores and are filtered based on their relative strengths to generate a compact set of the most relevant key tags. We performed a rigorous subjective evaluation of our proposed technique across a wide variety of movies through several experiments. The evaluation results presented in this paper demonstrate that our proposed approach can efficiently extract the key tags of a movie with good accuracy.
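The abstract does not spell out how per-frame predictions are scored and filtered, so the following is only a minimal sketch of the idea under assumptions: tag names, the accumulation of per-frame confidences, and the `keep_ratio` relative-strength threshold are all hypothetical, not the paper's actual method.

```python
from collections import defaultdict

def extract_key_tags(frame_predictions, keep_ratio=0.4):
    """Aggregate per-frame classifier outputs into a compact tag set.

    frame_predictions: list of dicts mapping tag -> confidence, one per key frame.
    keep_ratio: tags scoring below keep_ratio * max_score are dropped
    (a stand-in for the paper's relative-strength filtering).
    """
    scores = defaultdict(float)
    for preds in frame_predictions:
        for tag, conf in preds.items():
            scores[tag] += conf  # accumulate evidence across key frames
    if not scores:
        return []
    top = max(scores.values())
    kept = [(tag, s) for tag, s in scores.items() if s >= keep_ratio * top]
    return sorted(kept, key=lambda ts: ts[1], reverse=True)

# Hypothetical per-key-frame classifier confidences
frames = [
    {"action": 0.9, "car-chase": 0.7, "romance": 0.1},
    {"action": 0.8, "explosion": 0.6},
    {"action": 0.7, "car-chase": 0.5, "romance": 0.2},
]
print(extract_key_tags(frames))
```

Here weakly supported tags ("romance", "explosion") fall below the relative threshold, leaving the compact set the abstract describes.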
Automatic annotation of tennis games: An integration of audio, vision, and learning
Fully automatic annotation of tennis games from broadcast video is a task with great potential but enormous challenges. In this paper we describe our approach to this task, which integrates computer vision, machine listening, and machine learning. At the low-level processing stage, we improve upon our previously proposed state-of-the-art tennis ball tracking algorithm and employ audio signal processing techniques to detect key events and construct features for classifying them. At the high-level analysis stage, we model event classification as a sequence labelling problem and investigate four machine learning techniques using simulated event sequences. Finally, we evaluate our proposed approach on three real-world tennis games and discuss the interplay between audio, vision, and learning. To the best of our knowledge, our system is the only one that can annotate tennis games at such a detailed level.
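The abstract does not name the four sequence-labelling techniques it investigates. As one illustrative possibility only, an HMM-style Viterbi decoder over made-up tennis labels ("serve"/"rally") and invented probabilities could label an observed event sequence like this:

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden label sequence for an observed event sequence."""
    # Each cell stores (best probability so far, best previous state).
    V = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        row = {}
        for s in states:
            prev = max(states, key=lambda p: V[-1][p][0] * trans_p[p][s])
            row[s] = (V[-1][prev][0] * trans_p[prev][s] * emit_p[s][obs], prev)
        V.append(row)
    # Backtrack from the best final state.
    path = [max(states, key=lambda s: V[-1][s][0])]
    for row in reversed(V[1:]):
        path.append(row[path[-1]][1])
    return list(reversed(path))

# All labels and probabilities below are hypothetical, for illustration only.
states = ("serve", "rally")
start_p = {"serve": 0.8, "rally": 0.2}
trans_p = {"serve": {"serve": 0.3, "rally": 0.7},
           "rally": {"serve": 0.2, "rally": 0.8}}
emit_p = {"serve": {"hit": 0.6, "bounce": 0.4},
          "rally": {"hit": 0.3, "bounce": 0.7}}
obs = ["hit", "bounce", "bounce"]
print(viterbi(obs, states, start_p, trans_p, emit_p))  # → ['serve', 'rally', 'rally']
```

This mirrors the sequence-labelling framing: detected audio/visual events are the observations, and the game-state labels are decoded jointly rather than frame by frame.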
Strategies for Searching Video Content with Text Queries or Video Examples
The large number of user-generated videos uploaded to the Internet every day has given rise to many commercial video search engines, which mainly rely on text metadata for search. However, metadata is often lacking for user-generated videos, leaving them unsearchable by current search engines. Content-based video retrieval (CBVR) tackles this metadata-scarcity problem by directly analyzing the visual and audio streams of each video. CBVR encompasses multiple research topics, including low-level feature design, feature fusion, semantic detector training, and video search/reranking. We present novel strategies in these topics to enhance CBVR in both accuracy and speed under different query inputs, including pure textual queries and query by video examples. Our proposed strategies were incorporated into our submission for the TRECVID 2014 Multimedia Event Detection evaluation, where our system outperformed other submissions on both text queries and video-example queries, demonstrating the effectiveness of our proposed approaches.
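The abstract mentions feature fusion but not its concrete form. A common baseline is weighted late fusion of per-modality retrieval scores; the sketch below assumes min-max normalization, hypothetical video IDs, and invented scores and weights, and is not the paper's actual fusion strategy.

```python
def late_fusion(score_lists, weights):
    """Fuse per-modality retrieval scores by weighted sum after min-max normalization.

    score_lists: one dict per modality mapping video ID -> raw score.
    weights: one fusion weight per modality.
    Returns video IDs ranked by fused score, best first.
    """
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero for constant scores
        return {vid: (s - lo) / span for vid, s in scores.items()}

    fused = {}
    for scores, w in zip(score_lists, weights):
        for vid, s in normalize(scores).items():
            fused[vid] = fused.get(vid, 0.0) + w * s
    return sorted(fused, key=fused.get, reverse=True)

# Hypothetical per-modality scores for three candidate videos
visual = {"v1": 0.9, "v2": 0.4, "v3": 0.1}
audio = {"v1": 0.2, "v2": 0.8, "v3": 0.5}
print(late_fusion([visual, audio], weights=[0.7, 0.3]))  # → ['v1', 'v2', 'v3']
```

Normalizing before summing keeps one modality's score scale from dominating the fused ranking, which is the usual motivation for this kind of late fusion.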