    Table tennis event detection and classification

    It is well understood that multiple video cameras and computer vision (CV) technology can be used in sport for match officiating, statistics and player performance analysis. A review of the literature reveals a number of existing solutions, both commercial and theoretical, within this domain. However, these solutions are expensive and often complex to install. The hypothesis for this research states that by considering only changes in ball motion, automatic event classification is achievable with low-cost monocular video recording devices, without the need for 3-dimensional (3D) positional ball data and representation. The focus of this research is a rigorous empirical study of low-cost, single consumer-grade video camera solutions applied to table tennis, confirming that ball location data detected by monocular CV contains sufficient information to enable key match-play events to be recognised and measured. In total, a library of 276 event-based video sequences, using a range of recording hardware, was produced for this research. The research has four key considerations: i) an investigation into an effective recording environment with minimum configuration and calibration, ii) the selection and optimisation of a CV algorithm to detect the ball from the resulting single-source video data, iii) validation of the accuracy of the 2-dimensional (2D) CV data for motion change detection, and iv) the data requirements and processing techniques necessary to automatically detect changes in ball motion and match those to match-play events. Throughout the thesis, table tennis has been chosen as the example sport for observational and experimental analysis since it offers a number of specific CV challenges due to the relatively high ball speed (in excess of 100 km/h) and small ball size (40 mm in diameter). Furthermore, the inherent rules of table tennis show potential for a monocular event classification vision system.
As the initial stage, a proposed optimum location and configuration of the single camera is defined. Next, the selection of a CV algorithm is critical in obtaining usable ball motion data. It is shown in this research that segmentation processes vary in their ball detection capabilities and location outputs, which ultimately affects the capability of automated event detection and decision-making solutions. Therefore, a comparison of CV algorithms is necessary to establish confidence in the accuracy of the derived location of the ball. As part of the research, a CV software environment has been developed to allow robust, repeatable and direct comparisons between different CV algorithms. An event-based method of evaluating the success of a CV algorithm is proposed. Comparison of CV algorithms is made against the novel Efficacy Metric Set (EMS), producing a measurable Relative Efficacy Index (REI). Within the context of this low-cost, single-camera ball trajectory and event investigation, the experimental results show that the Horn-Schunck Optical Flow algorithm, with an REI of 163.5, is the most successful method when compared to a discrete selection of CV detection and extraction techniques gathered from the literature review. Furthermore, evidence-based data from the REI also suggests switching to the Canny edge detector (an REI of 186.4) for segmentation of the ball when in close proximity to the net. In addition to and in support of the data generated from the CV software environment, a novel method is presented for producing simultaneous data from 3D marker-based recordings, reduced to 2D and compared directly to the CV output to establish comparative time-resolved data for the ball location. It is proposed here that a continuous scale factor, based on the known dimensions of the ball, is incorporated at every frame. Using this method, comparison results show a mean accuracy of 3.01 mm when applied to a selection of nineteen video sequences and events.
This tolerance is within 10% of the diameter of the ball and is attributable to the limits of image resolution. Further experimental results demonstrate the ability to identify a number of match-play events from a monocular image sequence using a combination of the suggested optimum algorithm and ball motion analysis methods. The results show a promising application of 2D CV processing to match-play event classification, with an overall success rate of 95.9%. The majority of failures occur when the ball, during returns and services, is partially occluded by either the player or the racket, an inherent problem of using a monocular recording device. Finally, the thesis proposes further research and extensions for developing and implementing monocular CV processing of motion-based event analysis and classification in a wider range of applications.
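
To make the idea of classifying events from changes in 2D ball motion more concrete, here is a minimal Python sketch. It is not the method from the thesis, which the abstract does not spell out. It assumes a list of per-frame ball-centre y coordinates in image pixels (with y growing downward) and treats a switch from descending to rising as a candidate bounce. The helper `mm_per_pixel` illustrates a per-frame scale factor derived from the known 40 mm ball diameter. Both function names are hypothetical.

```python
def find_bounces(ys):
    """Return frame indices where the ball stops descending and starts rising.

    ys: ball-centre y coordinates per frame, in image pixels (y grows downward).
    Assumption: a bounce shows up as a sign change in vertical velocity.
    """
    bounces = []
    for i in range(1, len(ys) - 1):
        v_before = ys[i] - ys[i - 1]   # vertical velocity entering frame i
        v_after = ys[i + 1] - ys[i]    # vertical velocity leaving frame i
        if v_before > 0 and v_after < 0:
            bounces.append(i)
    return bounces


def mm_per_pixel(ball_diameter_px, ball_diameter_mm=40.0):
    """Scale factor for one frame, from the ball's apparent diameter in pixels."""
    if ball_diameter_px <= 0:
        raise ValueError("ball diameter in pixels must be positive")
    return ball_diameter_mm / ball_diameter_px


if __name__ == "__main__":
    trajectory_y = [10, 20, 30, 40, 30, 20]
    print(find_bounces(trajectory_y))
    print(mm_per_pixel(20.0))
```

Real trajectories are noisy, so a practical detector would likely smooth the positions or require the velocity change to exceed a threshold before reporting an event.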

    Acoustic event detection based on feature-level fusion of audio and video modalities

    Acoustic event detection (AED) aims at determining the identity of sounds and their temporal position in audio signals. When applied to spontaneously generated acoustic events, AED based only on audio information shows a large number of errors, which are mostly due to temporal overlaps. Indeed, temporal overlaps accounted for more than 70% of errors in the real-world interactive seminar recordings used in the CLEAR 2007 evaluations. In this paper, we improve the recognition rate of acoustic events using information from both audio and video modalities. First, the acoustic data are processed to obtain both a set of spectrotemporal features and the 3D localization coordinates of the sound source. Second, a number of features are extracted from video recordings by means of object detection, motion analysis, and multicamera person tracking to represent the visual counterpart of several acoustic events. A feature-level fusion strategy is used, and a parallel structure of binary HMM-based detectors is employed in our work. The experimental results show that information from both the microphone array and video cameras is useful to improve the detection rate of isolated as well as spontaneously generated acoustic events.

    Moving object detection via TV-L1 optical flow in fall-down videos

    There is a growing demand for surveillance systems that can detect fall-down events because of the increased number of surveillance cameras being installed in many public indoor and outdoor locations. Fall-down event detection has been vigorously and extensively researched for safety purposes, particularly to monitor elderly people, patients, and toddlers. Such computer vision detectors have become more affordable with the development of high-speed computer networks and low-cost video cameras. This paper proposes a moving object detection method based on human motion analysis for human fall-down events. The method comprises three parts: a preprocessing part to reduce image noise, a motion detection part using the TV-L1 optical flow algorithm, and a performance measurement part. The last part analyzes the results of the object detection part in terms of the bounding boxes, which are compared with the given ground truth. The proposed method is tested on the Fall Down Detection (FDD) dataset and compared with Gunnar-Farneback optical flow by measuring the intersection over union (IoU) of the output with respect to the ground truth bounding box. The experimental results show that the proposed method achieves an average IoU of 0.92524.
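
For readers unfamiliar with the evaluation measure, the following is a small Python sketch of intersection over union for two axis-aligned bounding boxes. The `(x_min, y_min, x_max, y_max)` box layout and the function name `iou` are assumptions made for illustration; the paper's own box format is not stated in the abstract.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    detected = (0, 0, 2, 2)
    ground_truth = (1, 1, 3, 3)
    print(iou(detected, ground_truth))
```

An IoU of 1.0 means the detected box coincides with the ground truth, and 0.0 means the two boxes do not overlap at all.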

    Semantic analysis of field sports video using a petri-net of audio-visual concepts

    The most common approach to automatic summarisation and highlight detection in sports video is to train an automatic classifier to detect semantic highlights based on occurrences of low-level features such as action replays, excited commentators or changes in a scoreboard. We propose an alternative approach based on the detection of perception concepts (PCs) and the construction of Petri-Nets which can be used for both semantic description and event detection within sports videos. Low-level algorithms for the detection of perception concepts using visual, aural and motion characteristics are proposed, and a series of Petri-Nets composed of perception concepts is formally defined to describe video content. We call this a Perception Concept Network-Petri Net (PCN-PN) model. Using PCN-PNs, personalized high-level semantic descriptions of video highlights can be facilitated and queries on high-level semantics can be achieved. A particular strength of this framework is that we can easily build semantic detectors based on PCN-PNs to search within sports videos and locate interesting events. Experimental results based on recorded sports video data across three types of sports games (soccer, basketball and rugby), each from multiple broadcasters, are used to illustrate the potential of this framework.
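
As a rough illustration of how a Petri net can turn detected concepts into an event, here is a minimal Python sketch of the standard enable-and-fire rule. It is not the PCN-PN model itself, whose formal definition is not given in the abstract. The place names (`goal_area_view`, `crowd_cheer`, `goal_event`) and the function names are hypothetical.

```python
def enabled(marking, transition):
    """A transition is enabled when every input place holds at least one token."""
    return all(marking.get(place, 0) >= 1 for place in transition["inputs"])


def fire(marking, transition):
    """Return a new marking after firing: consume input tokens, produce outputs."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place in transition["inputs"]:
        new_marking[place] -= 1
    for place in transition["outputs"]:
        new_marking[place] = new_marking.get(place, 0) + 1
    return new_marking


if __name__ == "__main__":
    # Hypothetical concepts: a token means the concept was detected.
    marking = {"goal_area_view": 1, "crowd_cheer": 1}
    goal = {"inputs": ["goal_area_view", "crowd_cheer"], "outputs": ["goal_event"]}
    print(fire(marking, goal))
```

In this toy setup the `goal_event` place only receives a token once both input concepts have been detected, which is the kind of conjunction a net of concepts can express.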

    Indexing of fictional video content for event detection and summarisation

    This paper presents an approach to movie video indexing that utilises audiovisual analysis to detect important and meaningful temporal video segments, which we term events. We consider three event classes, corresponding to dialogues, action sequences, and montages, where the latter also includes musical sequences. These three event classes are intuitive for a viewer to understand and recognise whilst accounting for over 90% of the content of most movies. To detect events we leverage traditional filmmaking principles and map these to a set of computable low-level audiovisual features. Finite state machines (FSMs) are used to detect when temporal sequences of specific features occur. A set of heuristics, again inspired by filmmaking conventions, is then applied to the output of multiple FSMs to detect the required events. A movie search system, named MovieBrowser, built upon this approach is also described. The overall approach is evaluated against a ground truth of over twenty-three hours of movie content drawn from various genres and consistently obtains high precision and recall for all event classes. A user experiment designed to evaluate the usefulness of an event-based structure for both searching and browsing movie archives is also described, and the results indicate the usefulness of the proposed approach.
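
The use of a finite state machine to spot temporal sequences of features can be sketched in Python as follows. This is an illustrative two-state machine, not the FSMs used in the paper. It assumes each shot has already been given a single label such as `"speech"` or `"music"`, and it reports runs of at least `min_len` consecutive shots with the target label. The labels, the threshold, and the function name `detect_runs` are all hypothetical.

```python
def detect_runs(labels, target="speech", min_len=3):
    """Return (start, end) index pairs for runs of `target` of length >= min_len.

    A two-state machine: "idle" waits for the target label, "counting" tracks
    a run and emits it when the run ends, provided it is long enough.
    """
    events = []
    state = "idle"
    start = 0
    for i, label in enumerate(labels):
        if state == "idle":
            if label == target:
                state, start = "counting", i
        elif label != target:
            # The run ended at the previous shot.
            if i - start >= min_len:
                events.append((start, i - 1))
            state = "idle"
    # Close a run that lasts until the final shot.
    if state == "counting" and len(labels) - start >= min_len:
        events.append((start, len(labels) - 1))
    return events


if __name__ == "__main__":
    shots = ["speech", "speech", "speech", "music", "speech", "other"]
    print(detect_runs(shots))
```

Running several such machines, one per feature pattern, and then combining their outputs with heuristics mirrors the overall structure the abstract describes.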

    Event detection in field sports video using audio-visual features and a support vector machine

    In this paper, we propose a novel audio-visual feature-based framework for event detection in broadcast video of multiple different field sports. Features indicating significant events are selected and robust detectors built. These features are rooted in characteristics common to all genres of field sports. The evidence gathered by the feature detectors is combined by means of a support vector machine, which infers the occurrence of an event based on a model generated during a training phase. The system is tested generically across multiple genres of field sports including soccer, rugby, hockey, and Gaelic football, and the results suggest that high event retrieval and content rejection statistics are achievable.
