3 research outputs found

    Anomaly Detection, Rule Adaptation and Rule Induction Methodologies in the Context of Automated Sports Video Annotation.

    Automated video annotation is a topic of considerable interest in computer vision due to its applications in video search, object-based video encoding and enhanced broadcast content. The domain of sports broadcasting is, in particular, the subject of current research attention due to its fixed, rule-governed content. This research work aims to develop, analyze and demonstrate novel methodologies that can be useful in the context of adaptive and automated video annotation systems. In this thesis, we present methodologies for addressing the problems of anomaly detection, rule adaptation and rule induction for court-based sports such as tennis and badminton. We first introduce an HMM induction strategy for a court-model-based method that uses the court structure in the form of a lattice for the two related modalities of singles and doubles tennis to tackle the problems of anomaly detection and rectification. We also introduce another anomaly detection methodology that is based on the disparity between the low-level vision-based classifiers and the high-level contextual classifier. An approach to the problem of rule adaptation is also proposed that employs convex hulling of the anomalous states. We also investigate a number of novel hierarchical HMM generating methods for stochastic induction of game rules. These methodologies include Cartesian product Label-based Hierarchical Bottom-up Clustering (CLHBC), which employs prior information within the label structures. A new constrained variant of the classical Chinese Restaurant Process (CRP) that is relevant to sports games is also introduced. We also propose two hybrid methodologies in this context, and a comparative analysis is made against the flat Markov model. Finally, we show that these methods are generalizable to other rule-based environments.
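
    The abstract does not specify the constrained CRP variant it introduces, so the following is only a minimal sketch of the classical Chinese Restaurant Process that the variant builds on. The function name `sample_crp`, the concentration parameter `alpha` and the seed handling are illustrative assumptions, not taken from the thesis. Each new customer joins an existing table with probability proportional to the table's current size, or opens a new table with probability proportional to `alpha`.

```python
import random


def sample_crp(num_customers, alpha, seed=0):
    """Draw one seating arrangement from the classical (unconstrained) CRP.

    Hypothetical helper for illustration; `alpha` must be positive.
    Returns (assignments, table_sizes), where assignments[i] is the table
    index of customer i and table_sizes[k] is the occupancy of table k.
    """
    rng = random.Random(seed)
    table_sizes = []
    assignments = []
    for _ in range(num_customers):
        # Existing tables are weighted by occupancy; the final entry is the
        # weight for opening a new table.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if table == len(table_sizes):
            table_sizes.append(1)      # open a new table
        else:
            table_sizes[table] += 1    # join an existing table
        assignments.append(table)
    return assignments, table_sizes
```

    In a rule-induction setting, tables would play the role of candidate states or clusters; how the thesis constrains the seating is not stated in the abstract and is left open here.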

    Cognitive vision systems for video understanding and retrieval.

    This thesis addresses the problem of creating computer vision systems that will facilitate high-level, user-friendly interpretation of an observed scene, and which will be readily adaptable to a wide range of computer vision tasks. Hence, the notion of injecting cognitive capabilities into traditional computer vision systems is central to this work. Initially, the requirements of creating a cognitive vision system will be examined. This will lead us to the conclusion that the two main enabling components for such systems are the following: a unified framework for reasoning in the context of the observed scene; and a multi-layered memory architecture that will aid the reasoning framework in recalling and storing all relevant information about the observed scene. Regarding the apparatus used for reasoning in video sequences, it will be argued that it must be characterised by its ability to be applied at all levels of information processing (from raw input data to high-level abstractions concerning the evolution of the observed scene), support and exploit any combination of spatial and temporal dependencies (i.e. context) present among the input data, and deliver good reasoning performance when applied in any categorical domain. In turn, the requirements the reasoning engine sets will be used as a guideline for the design of a memory architecture that supports it. The memory architecture must therefore be able to handle arbitrary input data types, depending on the scope of the current cognition task. It must also allow for both forward and feedback interaction with the reasoning framework, as contextual information extracted from the observed scene at a later stage may assist the reasoning engine in altering a decision made in previous stages - just as humans do when presented with contradicting evidence. To further emulate the mechanisms that enable human cognition, forgetting processes were also embedded in the memory infrastructure.
    For this particular feature, different layers of memory storage facilitate forgetting at different speeds; the system forgets raw input and low-level feature data very quickly, whereas high-level concepts about the evolution of the observed scene are retained over the relatively long term. Finally, the overall proposed system has been implemented and tested on a real-world application - the annotation of broadcast tennis video sequences. In this sample application, the goal was to create a cognitive vision system that would keep track of the score for the duration of the broadcast match, based on the main components described above. The results obtained from processing a set of sequences captured off-the-air indicate that the overall approach achieves far superior results to simply segmenting the video sequence into shots and analysing each one separately, taken out of the context of the match. This demonstrates that the ability to adapt by discovering and exploiting context is paramount to the efficiency of any future computer vision system, and is, in no small part, a feature that sets biological cognitive vision systems apart from their machine-based counterparts.
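
    The layered forgetting described in this abstract can be illustrated with a small sketch. It is not the thesis's implementation: the class `MemoryLayer`, its fields, and the retention values are hypothetical, chosen only to show layers that discard items at different speeds.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryLayer:
    """One storage layer; all names and values here are illustrative assumptions."""
    name: str
    retention: float                            # seconds an item survives
    items: list = field(default_factory=list)   # (timestamp, payload) pairs

    def store(self, timestamp, payload):
        self.items.append((timestamp, payload))

    def forget(self, now):
        # Drop every item older than this layer's retention window.
        self.items = [(t, p) for t, p in self.items if now - t <= self.retention]

    def recall(self):
        return [p for _, p in self.items]


# Fast-forgetting layers for raw and low-level data, a slow one for concepts.
raw = MemoryLayer("raw frames", retention=1.0)
features = MemoryLayer("low-level features", retention=10.0)
concepts = MemoryLayer("high-level concepts", retention=3600.0)

for layer, payload in [(raw, "frame 0"), (features, "ball track"), (concepts, "point won")]:
    layer.store(0.0, payload)

for layer in (raw, features, concepts):
    layer.forget(now=60.0)   # one minute later only the concept layer retains its item
```

    A time-based cutoff is only one possible forgetting rule; the abstract does not say whether the original system uses time, capacity or relevance to decide what to discard.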

    Cognitive vision systems for video understanding and retrieval

    EThOS - Electronic Theses Online Service (GB, United Kingdom)