10,945 research outputs found

    Content-based Video Retrieval by Integrating Spatio-Temporal and Stochastic Recognition of Events

    Get PDF
    As amounts of publicly available video data grow the need to query this data efficiently becomes significant. Consequently content-based retrieval of video data turns out to be a challenging and important problem. We address the specific aspect of inferring semantics automatically from raw video data. In particular, we introduce a new video data model that supports the integrated use of two different approaches for mapping low-level features to high-level concepts. Firstly, the model is extended with a rule-based approach that supports spatio-temporal formalization of high-level concepts, and then with a stochastic approach. Furthermore, results on real tennis video data are presented, demonstrating the validity of both approaches, as well us advantages of their integrated us

    An audio-based sports video segmentation and event detection algorithm

    Get PDF
    In this paper, we present an audio-based event detection algorithm shown to be effective when applied to Soccer video. The main benefit of this approach is the ability to recognise patterns that display high levels of crowd response correlated to key events. The soundtrack from a Soccer sequence is first parameterised using Mel-frequency Cepstral coefficients. It is then segmented into homogenous components using a windowing algorithm with a decision process based on Bayesian model selection. This decision process eliminated the need for defining a heuristic set of rules for segmentation. Each audio segment is then labelled using a series of Hidden Markov model (HMM) classifiers, each a representation of one of 6 predefined semantic content classes found in Soccer video. Exciting events are identified as those segments belonging to a crowd cheering class. Experimentation indicated that the algorithm was more effective for classifying crowd response when compared to traditional model-based segmentation and classification techniques

    Image segmentation and feature extraction for recognizing strokes in tennis game videos

    Get PDF
    This paper addresses the problem of recognizing human actions from video. Particularly, the case of recognizing events in tennis game videos is analyzed. Driven by our domain knowledge, a robust player segmentation algorithm is developed real video data. Further, we introduce a number of novel features to be extracted for our particular application. Different feature combinations are investigated in order to find the optimal one. Finally, recognition results for different classes of tennis strokes using automatic learning capability of Hidden Markov Models (HMMs) are presented. The experimental results demonstrate that our method is close to realizing statistics of tennis games automatically using ordinary TV broadcast videos

    Audio-based event detection for sports video

    Get PDF
    In this paper, we present an audio-based event detection approach shown to be effective when applied to the Sports broadcast data. The main benefit of this approach is the ability to recognise patterns that indicate high levels of crowd response which can be correlated to key events. By applying Hidden Markov Model-based classifiers, where the predefined content classes are parameterised using Mel-Frequency Cepstral Coefficients, we were able to eliminate the need for defining a heuristic set of rules to determine event detection, thus avoiding a two-class approach shown not to be suitable for this problem. Experimentation indicated that this is an effective method for classifying crowd response in Soccer matches, thus providing a basis for automatic indexing and summarisation

    Event detection in field sports video using audio-visual features and a support vector machine

    Get PDF
    In this paper, we propose a novel audio-visual feature-based framework for event detection in broadcast video of multiple different field sports. Features indicating significant events are selected and robust detectors built. These features are rooted in characteristics common to all genres of field sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested generically across multiple genres of field sports including soccer, rugby, hockey, and Gaelic football and the results suggest that high event retrieval and content rejection statistics are achievable

    Image Retrieval Using Circular Hidden Markov Models with a Garbage State

    Get PDF
    Shape-based image and video retrieval is an active research topic in multimedia information retrieval. It is well known that there are significant variations in shapes of the same category extracted from images and videos. In this paper, we propose to use circular hidden Markov models for shape recognition and image retrieval. In our approach, we use a garbage state to explicitly deal with shape mismatch caused by shape deformation and occlusion. We will propose a modiÂŻed circular hidden Markov model (HMM)for shape-based image retrieval and then use circular HMMs with a garbage state to further improve the performance. To evaluate the proposed algorithms, we have conducted experiments using the database of the MPEG-7 Core Experiments Shape-1, Part B. The experiments show that our approaches are robust to shape deformations such as shape variations and occlusion. The performance of our approaches is comparable to that of the state-of-the-art shape-based image retrieval systems in terms of accuracy and speed

    A Neural Multi-sequence Alignment TeCHnique (NeuMATCH)

    Full text link
    The alignment of heterogeneous sequential data (video to text) is an important and challenging problem. Standard techniques for this task, including Dynamic Time Warping (DTW) and Conditional Random Fields (CRFs), suffer from inherent drawbacks. Mainly, the Markov assumption implies that, given the immediate past, future alignment decisions are independent of further history. The separation between similarity computation and alignment decision also prevents end-to-end training. In this paper, we propose an end-to-end neural architecture where alignment actions are implemented as moving data between stacks of Long Short-term Memory (LSTM) blocks. This flexible architecture supports a large variety of alignment tasks, including one-to-one, one-to-many, skipping unmatched elements, and (with extensions) non-monotonic alignment. Extensive experiments on semi-synthetic and real datasets show that our algorithm outperforms state-of-the-art baselines.Comment: Accepted at CVPR 2018 (Spotlight). arXiv file includes the paper and the supplemental materia

    Phonetic Searching

    Get PDF
    An improved method and apparatus is disclosed which uses probabilistic techniques to map an input search string with a prestored audio file, and recognize certain portions of a search string phonetically. An improved interface is disclosed which permits users to input search strings, linguistics, phonetics, or a combination of both, and also allows logic functions to be specified by indicating how far separated specific phonemes are in time.Georgia Tech Research Corporatio
    • 

    corecore