    Movies Tags Extraction Using Deep Learning

    Retrieving information from movies is becoming increasingly demanding due to the enormous amount of multimedia data generated each day. Not only does it help in the efficient search, archiving and classification of movies, but it is also instrumental in content censorship and recommendation systems. Extracting key information from a movie and summarizing it in a few tags that best describe the movie presents a dedicated challenge and requires an intelligent approach to automatically analyze the movie. In this paper, we formulate the movie tag extraction problem as a machine learning classification problem and train a Convolutional Neural Network (CNN) on a carefully constructed tag vocabulary. Our proposed technique first extracts key frames from a movie and applies the trained classifier to the key frames. The predictions from the classifier are assigned scores and are filtered based on their relative strengths to generate a compact set of the most relevant key tags. We performed a rigorous subjective evaluation of our proposed technique for a wide variety of movies with different experiments. The evaluation results presented in this paper demonstrate that our proposed approach can efficiently extract the key tags of a movie with good accuracy.
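
    The scoring and filtering step described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the scoring rule, so everything here is an assumption: the function name `select_key_tags`, the averaging of per-frame confidences, and the `rel_threshold` and `max_tags` parameters are hypothetical, and the per-frame predictions are assumed to arrive as dictionaries mapping a tag to a classifier confidence.

```python
from collections import defaultdict


def select_key_tags(frame_predictions, rel_threshold=0.5, max_tags=5):
    """Aggregate per-frame tag confidences and keep the relatively strongest tags.

    frame_predictions: list of dicts, one per key frame, mapping tag -> confidence.
    rel_threshold: hypothetical cutoff; a tag is kept if its score is at least
        this fraction of the best tag's score.
    max_tags: hypothetical cap on the size of the returned tag set.
    """
    totals = defaultdict(float)
    for frame in frame_predictions:
        for tag, confidence in frame.items():
            totals[tag] += confidence
    if not totals:
        return []

    # Assumed scoring rule: mean confidence over all key frames, so a tag
    # that appears in few frames is penalised.
    num_frames = len(frame_predictions)
    scores = {tag: total / num_frames for tag, total in totals.items()}

    # Filter by strength relative to the strongest tag.
    best = max(scores.values())
    kept = [(tag, s) for tag, s in scores.items() if s >= rel_threshold * best]

    # Strongest first; ties broken alphabetically for a stable order.
    kept.sort(key=lambda pair: (-pair[1], pair[0]))
    return kept[:max_tags]


if __name__ == "__main__":
    frames = [
        {"car": 0.9, "road": 0.6},
        {"car": 0.8, "city": 0.3},
        {"car": 0.7, "road": 0.5},
    ]
    print([tag for tag, _ in select_key_tags(frames, rel_threshold=0.4)])
```

    Filtering relative to the strongest tag, rather than against a fixed absolute cutoff, is one way to keep the tag set compact for movies where the classifier is uniformly less confident; other scoring rules would fit the abstract's description equally well.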

    Automatic annotation of tennis games: An integration of audio, vision, and learning

    Fully automatic annotation of tennis games using broadcast video is a task with great potential but enormous challenges. In this paper we describe our approach to this task, which integrates computer vision, machine listening, and machine learning. At the low level, we improve upon our previously proposed state-of-the-art tennis ball tracking algorithm and employ audio signal processing techniques to detect key events and construct features for classifying the events. At the high level, we model event classification as a sequence labelling problem and investigate four machine learning techniques using simulated event sequences. Finally, we evaluate our proposed approach on three real-world tennis games and discuss the interplay between audio, vision and learning. To the best of our knowledge, our system is the only one that can annotate tennis games at such a detailed level.

    Strategies for Searching Video Content with Text Queries or Video Examples

    The large number of user-generated videos uploaded to the Internet every day has led to many commercial video search engines, which mainly rely on text metadata for search. However, metadata is often lacking for user-generated videos, so these videos are unsearchable by current search engines. Content-based video retrieval (CBVR) tackles this metadata-scarcity problem by directly analyzing the visual and audio streams of each video. CBVR encompasses multiple research topics, including low-level feature design, feature fusion, semantic detector training and video search/reranking. We present novel strategies in these topics to enhance CBVR in both accuracy and speed under different query inputs, including pure textual queries and query by video examples. Our proposed strategies have been incorporated into our submission for the TRECVID 2014 Multimedia Event Detection evaluation, where our system outperformed other submissions in both text queries and video example queries, demonstrating the effectiveness of our proposed approaches.