21 research outputs found

    Extraction and Classification of Self-consumable Sport Video Highlights

    This paper aims to automatically extract and classify self-consumable sport video highlights. For this purpose, we emphasize the benefits of using play-break sequences as effective inputs for an HMM-based classifier. The HMM is used to model the stochastic pattern of high-level states during specific sport highlights which correspond to the sequence of generic audio-visual measurements extracted from raw video data. This paper uses soccer as the domain of study, focusing on the extraction and classification of goal, shot and foul highlights. Experimental work using 183 play-break sequences from 6 soccer matches is presented to demonstrate the performance of our proposed scheme.

    Event detection based on generic characteristics of field-sports

    In this paper, we propose a generic framework for event detection in broadcast video of multiple different field-sports. Features indicating significant events are selected, and robust detectors built. These features are rooted in generic characteristics common to all genres of field-sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested across multiple genres of field-sports including soccer, rugby, hockey and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable.

    Extraction and classification of self-consumable sport video highlights using generic HMM

    This paper aims to automatically extract and classify self-consumable sport video highlights. For this purpose, we emphasize the benefits of using play-break sequences as effective inputs for an HMM-based classifier. The HMM is used to model the stochastic pattern of high-level states during specific sport highlights which correspond to the sequence of generic audio-visual measurements extracted from raw video data. This paper uses soccer as the domain of study, focusing on the extraction and classification of goal, shot and foul highlights. Experimental work using 183 play-break sequences from 6 soccer matches is presented to demonstrate the performance of our proposed scheme.

    Video Logo Retrieval based on local Features

    Estimation of the frequency and duration of logos in videos is important and challenging in the advertisement industry as a way of estimating the impact of ad purchases. Since logos occupy only a small area in the videos, the popular methods of image retrieval could fail. This paper develops an algorithm called Video Logo Retrieval (VLR), which is an image-to-video retrieval algorithm based on the spatial distribution of local image descriptors that measure the distance between the query image (the logo) and a collection of video images. VLR uses local features to overcome the weakness of global feature-based models such as convolutional neural networks (CNN). Meanwhile, VLR is flexible and does not require training after setting some hyper-parameters. The performance of VLR is evaluated on two challenging open benchmark tasks (SoccerNet and Stanford I2V), and compared with other state-of-the-art logo retrieval or detection algorithms. Overall, VLR shows significantly higher accuracy compared with the existing methods. Comment: Accepted by ICIP 20. Contact author: Bochen Guan.

    Saying What You're Looking For: Linguistics Meets Video Search

    We present an approach to searching large video corpora for video clips which depict a natural-language query in the form of a sentence. This approach uses compositional semantics to encode subtle meaning that is lost in other systems, such as the difference between two sentences which have identical words but entirely different meaning: "The person rode the horse" vs. "The horse rode the person". Given a video-sentence pair and a natural-language parser, along with a grammar that describes the space of sentential queries, we produce a score which indicates how well the video depicts the sentence. We produce such a score for each video clip in a corpus and return a ranked list of clips. Furthermore, this approach addresses two fundamental problems simultaneously: detecting and tracking objects, and recognizing whether those tracks depict the query. Because both tracking and object detection are unreliable, this uses knowledge about the intended sentential query to focus the tracker on the relevant participants and ensures that the resulting tracks are described by the sentential query. While earlier work was limited to single-word queries which correspond to either verbs or nouns, we show how one can search for complex queries which contain multiple phrases, such as prepositional phrases, and modifiers, such as adverbs. We demonstrate this approach by searching for 141 queries involving people and horses interacting with each other in 10 full-length Hollywood movies. Comment: 13 pages, 8 figures.

    Spatio-Temporal Pyramid Matching for Sports Videos

    In this paper, we address the problem of querying video shots based on content-based matching. Our proposed system automatically partitions a video stream into video shots that maintain continuous movements of objects. Finding video shots of the same category is not an easy task because objects in a video shot change their locations over time. Our spatio-temporal pyramid matching (STPM) is a modified version of spatial pyramid matching (SPM) [15] that considers temporal information in conjunction with spatial locations to match objects in video shots. In addition, we model the mathematical condition under which temporal information contributes to matching video shots. To improve the matching performance, dynamic features including movements of objects are considered in addition to static features such as edges of objects. In our experiments, several methods based on different feature sets and matching methods are compared, and our spatio-temporal pyramid matching performed better than existing methods in video matching for sports videos.

    Event detection in field sports video using audio-visual features and a support vector machine

    In this paper, we propose a novel audio-visual feature-based framework for event detection in broadcast video of multiple different field sports. Features indicating significant events are selected and robust detectors built. These features are rooted in characteristics common to all genres of field sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested generically across multiple genres of field sports including soccer, rugby, hockey, and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable.

    An Overview of Video Shot Clustering and Summarization Techniques for Mobile Applications

    The problem of content characterization of video programmes is of great interest because video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in the literature for content characterization of video programmes, including movies and sports, that could be helpful for mobile media consumption. In particular, we focus our analysis on shot clustering methods and effective video summarization techniques since, in the current video analysis scenario, they facilitate access to the content and help in quick understanding of the associated semantics. First we consider shot clustering techniques based on low-level features, using visual, audio and motion information, even combined in a multi-modal fashion. Then we concentrate on summarization techniques, such as static storyboards, dynamic video skimming and the extraction of sport highlights. The summarization methods discussed can be employed in the development of tools that would be greatly useful to most mobile users: these algorithms automatically shorten the original video while preserving most events by highlighting only the important content. The effectiveness of each approach has been analyzed, showing that it mainly depends on the kind of video programme it relates to, and the type of summary or highlights we are focusing on.