    Automatic TV advertisement detection from MPEG bitstream

    The Centre for Digital Video Processing at Dublin City University conducts concentrated research and development in the area of digital video management. The current stage of development is demonstrated on our Web-based digital video system called Físchlár (Proceedings of the Content based Multimedia Information Access, RIAO 2000, Vol. 2, Paris, France, 12–14 April 2000, p. 1390), which provides for efficient recording, analysing, browsing and viewing of digitally captured television programmes. Advertisement breaks during or between television programmes are typically recognisable by series of ‘black’ video frames accompanied by a simultaneous drop in audio volume, which recur before and after each individual advertisement and so separate the advertisements from one another. It is the regular prevalence of these flags that enables automatic differentiation between what is programme and what is a commercial break. This paper reports on the progress made in the development of this idea into an advertisement detector system that automatically detects the commercial breaks from the bitstream of digitally captured television broadcasts

    MPEG-1 bitstreams processing for audio content analysis

    In this paper, we present our research group's work on processing MPEG-1 Audio bitstreams. This work is primarily based on processing the encoded bitstream and extracting useful audio features for the purposes of analysis and browsing. To prepare for the discussion of these features, the MPEG-1 audio bitstream format is first described. The Application Interface Protocol (API) which we have been developing in C++ is then introduced, and the paper concludes with a discussion of audio feature extraction
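
As a rough illustration of what reading the encoded bitstream involves, the sketch below decodes the 32-bit header that starts every MPEG-1 audio frame. It is not the C++ API mentioned in the abstract; it assumes a Layer II stream (the abstract does not say which layer is used), and the function name `parse_layer2_header` is hypothetical.

```python
# Bitrates in kbit/s for MPEG-1 Layer II, indexed by the 4-bit bitrate field.
# Index 0 ("free format") and index 15 (forbidden) are not handled in this sketch.
LAYER2_BITRATES_KBPS = [None, 32, 48, 56, 64, 80, 96, 112,
                        128, 160, 192, 224, 256, 320, 384, None]
# Sampling rates in Hz for MPEG-1, indexed by the 2-bit sampling-frequency field.
MPEG1_SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint_stereo", "dual_channel", "mono"]


def parse_layer2_header(header: bytes) -> dict:
    """Decode the 4-byte header of an MPEG-1 Layer II audio frame."""
    if len(header) < 4:
        raise ValueError("need at least 4 bytes")
    word = int.from_bytes(header[:4], "big")
    if (word >> 21) & 0x7FF != 0x7FF:
        raise ValueError("sync word not found")
    version = (word >> 19) & 0x3  # 3 means MPEG-1
    layer = (word >> 17) & 0x3    # 2 means Layer II
    if version != 3 or layer != 2:
        raise ValueError("not an MPEG-1 Layer II frame")
    bitrate_kbps = LAYER2_BITRATES_KBPS[(word >> 12) & 0xF]
    sample_rate = MPEG1_SAMPLE_RATES_HZ[(word >> 10) & 0x3]
    if bitrate_kbps is None or sample_rate is None:
        raise ValueError("unsupported bitrate or sampling rate")
    padding = (word >> 9) & 0x1
    # A Layer II frame carries 1152 samples, giving 144 * bitrate / rate bytes.
    frame_bytes = 144 * bitrate_kbps * 1000 // sample_rate + padding
    return {
        "bitrate_kbps": bitrate_kbps,
        "sample_rate_hz": sample_rate,
        "has_crc": ((word >> 16) & 0x1) == 0,  # protection bit 0 means CRC present
        "mode": CHANNEL_MODES[(word >> 6) & 0x3],
        "frame_bytes": frame_bytes,
    }
```

Calling `parse_layer2_header(bytes([0xFF, 0xFD, 0xA0, 0x00]))` describes a 192 kbit/s, 44.1 kHz stereo frame. Knowing `frame_bytes` lets a reader step from one frame to the next without decoding any audio, which is the precondition for extracting features directly from the bitstream.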

    Speech-music discrimination from MPEG-1 bitstream

    This paper describes a proposed algorithm for speech/music discrimination, which works on data taken directly from the MPEG-encoded bitstream, thus avoiding the computationally expensive decoding-encoding process. The method is based on thresholding of features derived from the modulation envelope of the frequency-limited audio signal. The discriminator is tested on more than 2 hours of audio data, which contain clean and noisy speech from several speakers and a variety of music content. The discriminator is able to work in real time and, despite its simplicity, results are very promising
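
The idea of thresholding a modulation-envelope feature can be sketched as follows. This is not the paper's algorithm: the feature used here (standard deviation of the envelope divided by its mean), the threshold value and the function names are all assumptions chosen for illustration. The envelope is taken to be a list of per-frame energy values for the band-limited signal, however those were obtained from the bitstream.

```python
from statistics import mean, pstdev


def modulation_depth(envelope):
    """Relative variation of an energy envelope (population std / mean).

    Speech alternates between syllables and pauses, so its envelope tends to
    fluctuate strongly; sustained music tends to give a flatter envelope.
    """
    if not envelope:
        raise ValueError("envelope must not be empty")
    average = mean(envelope)
    if average == 0:
        return 0.0
    return pstdev(envelope) / average


def classify_window(envelope, threshold=0.5):
    """Label one analysis window by thresholding its modulation depth.

    The threshold of 0.5 is a placeholder, not a value from the paper.
    """
    return "speech" if modulation_depth(envelope) > threshold else "music"
```

A window such as `[1.0, 0.1, 0.9, 0.05, 1.1, 0.1, 0.8, 0.05]` has a strongly modulated envelope and falls on the speech side of the threshold, while a nearly constant envelope falls on the music side. In practice the threshold would have to be tuned on labelled data.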

    Research in information management at Dublin City University

    The Information Management Group at Dublin City University has research themes such as digital multimedia, interoperable systems and database engineering. In the area of digital multimedia, a collaboration with our School of Electronic Engineering has formed the Centre for Digital Video Processing, a university designated research centre whose aim is to research, develop and evaluate content-based operations on digital video information. To achieve this goal, the range of expertise in this centre covers the complete gamut from image analysis and feature extraction through to video search engine technology and interfaces to video browsing. The Interoperable Systems Group has research interests in federated databases and interoperability, object modelling and database engineering. This report describes the research activities of the major groupings within the Information Management community in Dublin City University

    Audio/visual analysis for high-speed TV advertisement detection from MPEG bitstream

    Advertisement breaks during or between television programmes are typically flagged by series of black-and-silent video frames, which recurrently occur in order to audio-visually separate individual advertisement spots from one another. It is the regular prevalence of these flags that enables automatic differentiation between what is programme content and what is advertisement break. Detection of these audio-visual depressions within broadcast television content provides a basis on which advertisement detection may be achieved. This document reports on the progress made in the development of this idea into an advertisement detector system that automatically detects the advertisement breaks directly from the MPEG-1 encoded bitstream of digitally captured television broadcasts
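
The flag-based reasoning above can be sketched in two steps: find runs of frames that are both dark and quiet, then treat a cluster of closely spaced runs as an advertisement break. The sketch assumes per-frame luminance and audio-level values are already available; how they are read from the MPEG-1 bitstream is not shown, and all thresholds and function names are hypothetical.

```python
def find_flags(luma, audio, luma_max=20.0, audio_max=0.05, min_len=2):
    """Return (start, end) frame-index pairs of runs that are dark and quiet."""
    flags = []
    start = None
    count = min(len(luma), len(audio))
    for i in range(count):
        if luma[i] <= luma_max and audio[i] <= audio_max:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                flags.append((start, i - 1))
            start = None
    # Close a run that extends to the last frame.
    if start is not None and count - start >= min_len:
        flags.append((start, count - 1))
    return flags


def group_into_breaks(flags, max_gap_frames, min_flags=3):
    """Group flags spaced at most max_gap_frames apart into candidate breaks.

    A group counts as a break only if it contains at least min_flags flags,
    reflecting the observation that flags recur regularly within a break.
    """
    breaks = []
    group = []
    for flag in flags:
        if group and flag[0] - group[-1][1] > max_gap_frames:
            if len(group) >= min_flags:
                breaks.append((group[0][0], group[-1][1]))
            group = []
        group.append(flag)
    if len(group) >= min_flags:
        breaks.append((group[0][0], group[-1][1]))
    return breaks
```

Requiring several flags per group is one simple way to avoid mistaking a single fade to black inside a programme for a break; the abstract does not say how the detector handles that case.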

    Television Advertisement Detection Using RGB Colour Matching and Television Station Logo Contour Matching

    Television advertisements are of interest in many respects [6]. Advertising and psychology experts are interested in their influence on people's buying habits, whereas parents are more interested in shielding their children from the influence of television advertisements [6]. Existing television advertisement detection methods generally do not operate in real time [2]. In this final-year project, advertisement detection is performed in real time using two methods: RGB colour matching and contour matching of the television station logo. Tests were run on 27,255 frames of MetroTV broadcasts and 20,891 frames of TV7 broadcasts over 7 hours. On MetroTV, the more accurate threshold values were 200 for RGB colour matching and 50 for contour matching, giving a rate of correctly classifying programmes and advertisements (Correct Classification) of 98.28%, an error rate in detecting advertisements (False Reject) of 0.82% and an error rate in detecting programmes (False Acceptance) of 0.90%. On TV7, the more accurate threshold values were 90 for RGB colour matching and 30 for contour matching, giving a Correct Classification rate of 98.37%, a False Reject rate of 1.03% and a False Acceptance rate of 0.60%
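
A thresholded colour match of the kind described can be sketched as below. The abstract does not define its distance measure or the units of its thresholds, so the mean absolute per-channel difference used here is an assumption, as are the function names and the premise that the station logo is shown during programmes and absent during advertisements.

```python
def rgb_distance(region, reference):
    """Mean absolute per-channel difference between two pixel lists.

    Both arguments are equal-length lists of (r, g, b) tuples covering the
    screen area where the station logo is expected.
    """
    if not region or len(region) != len(reference):
        raise ValueError("pixel lists must be non-empty and of equal length")
    total = sum(
        abs(a - b)
        for pixel, ref_pixel in zip(region, reference)
        for a, b in zip(pixel, ref_pixel)
    )
    return total / (3 * len(region))


def classify_frame(region, reference, threshold):
    """Label a frame by whether its logo area matches the reference logo.

    Assumption: a matching logo means programme content, and a missing logo
    means an advertisement.
    """
    if rgb_distance(region, reference) <= threshold:
        return "programme"
    return "advertisement"
```

The threshold passed to `classify_frame` plays the role of the per-station values reported in the abstract, but its scale here depends entirely on the assumed distance measure and is not comparable with those figures.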

    An investigation into weighted data fusion for content-based multimedia information retrieval

    Content-Based Multimedia Information Retrieval (CBMIR) is characterised by the combination of noisy sources of information which, in unison, are able to achieve strong performance. In this thesis we focus on the combination of ranked results from the independent retrieval experts which comprise a CBMIR system through linearly weighted data fusion. The independent retrieval experts are low-level multimedia features, each of which contains an indexing function and ranking algorithm. This thesis comprises two halves. In the first half, we perform a rigorous empirical investigation into the factors which impact upon performance in linearly weighted data fusion. In the second half, we leverage these findings to create a new class of weight generation algorithms for data fusion which are capable of determining weights at query-time, such that the weights are topic dependent
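
Linearly weighted data fusion can be illustrated with a weighted sum of normalised expert scores. The min-max normalisation step and the function names are assumptions made for this sketch; the thesis's weight generation algorithms are not reproduced, and the weights are simply supplied by the caller.

```python
def min_max_normalise(scores):
    """Rescale one expert's scores to the range [0, 1]."""
    low, high = min(scores.values()), max(scores.values())
    if high == low:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - low) / (high - low) for doc, s in scores.items()}


def weighted_fusion(expert_scores, weights):
    """Fuse per-expert document scores into one ranking.

    expert_scores maps an expert name to a {document: score} dict, and
    weights maps the same expert names to their linear weights. The result
    is a list of documents ordered by descending fused score, with ties
    broken by document identifier.
    """
    fused = {}
    for name, scores in expert_scores.items():
        if not scores:
            continue
        for doc, score in min_max_normalise(scores).items():
            fused[doc] = fused.get(doc, 0.0) + weights[name] * score
    return sorted(fused, key=lambda doc: (-fused[doc], doc))
```

With two experts that disagree, changing the weights changes the final ranking, which is why choosing weights per topic at query time, as the thesis proposes, can matter.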

    Audiovisual processing for sports-video summarisation technology

    In this thesis a novel audiovisual feature-based scheme is proposed for the automatic summarisation of sports-video content. The scope of operability of the scheme is designed to encompass the wide variety of sports genres that come under the description ‘field-sports’. Given the assumption that, in terms of conveying the narrative of a field-sports-video, score-update events constitute the most significant moments, it is proposed that their detection should yield a favourable summarisation solution. To this end, a generic methodology is proposed for the automatic identification of score-update events in field-sports-video content. The scheme is based on the development of robust extractors for a set of critical features, which are shown to reliably indicate the locations of these events. The evidence gathered by the feature extractors is combined and analysed using a Support Vector Machine (SVM), which performs the event detection process. An SVM is chosen on the basis that its underlying technology represents an implementation of the latest generation of machine learning algorithms, based on the recent advances in statistical learning. Effectively, an SVM offers a solution to optimising the classification performance of a decision hypothesis, inferred from a given set of training data. Via a learning phase that utilises a 90-hour field-sports-video training corpus, the SVM infers a score-update event model by observing patterns in the extracted feature evidence. Using a similar but distinct 90-hour evaluation corpus, the effectiveness of this model is then tested generically across multiple genres of field-sports-video including soccer, rugby, field hockey, hurling, and Gaelic football. The results suggest that, in terms of the summarisation task, both high event retrieval and content rejection statistics are achievable

    Feedback-Based Gameplay Metrics and Gameplay Performance Segmentation: An audio-visual approach for assessing player experience.

    Gameplay metrics is a method and approach that is growing in popularity amongst the game studies research community for its capacity to assess players’ engagement with game systems. Yet, little has been done, to date, to quantify players’ responses to feedback employed by games that conveys information to players, i.e., their audio-visual streams. The present thesis introduces a novel approach to player experience assessment - termed feedback-based gameplay metrics - which seeks to gather gameplay metrics from the audio-visual feedback streams presented to the player during play. So far, gameplay metrics - quantitative data about a game state and the player's interaction with the game system - have been logged directly via the game's source code. The need to utilise source code restricts the range of games that researchers can analyse. By using computer science algorithms for audio-visual processing, not previously employed for processing gameplay footage, the present thesis seeks to extract similar metrics from the audio-visual streams, thus circumventing the need for access to source code, whilst also proposing a method that focuses on describing the way gameplay information is broadcast to the player during play. In order to operationalise feedback-based gameplay metrics, the present thesis introduces the concept of gameplay performance segmentation, which describes how coherent segments of play can be identified and extracted from lengthy game play sessions. Moreover, in order to both contextualise the method for processing metrics and provide a conceptual framework for analysing the results of a feedback-based gameplay metric segmentation, a multi-layered architecture based on five gameplay concepts (system, game world instance, spatial-temporal, degree of freedom and interaction) is also introduced. 
Finally, based on data gathered from game play sessions with participants, the present thesis discusses the validity of feedback-based gameplay metrics, gameplay performance segmentation and the multi-layered architecture. A software system has also been specifically developed to produce gameplay summaries based on feedback-based gameplay metrics, and examples of summaries (based on several games) are presented and analysed. The present thesis also demonstrates that feedback-based gameplay metrics can be conjointly analysed with other forms of data (such as biometry) in order to build a more complete picture of game play experience. Feedback-based gameplay metrics constitute a post-processing approach that allows the researcher or analyst to explore the data however they wish and as many times as they wish. The method is also able to process any audio-visual file, and can therefore process material from a range of audio-visual sources. This novel methodology brings together game studies and computer science, not only extending the range of games that can now be researched but also providing a viable solution that accounts for the exact way players experience games

    Segmentation and classification in digitised document images

    The work in this thesis was carried out in the context of analysing and processing images of printed documents in order to automate the creation of press reviews. Images output by the scanner are processed without any prior information or human intervention. To characterise them, we present an analysis system for composite colour documents that segments each image into colorimetrically homogeneous zones and adapts the text extraction algorithms to the local characteristics of each zone. The colorimetric and textual information supplied by this system feeds a physical segmentation method for digitised press pages. The blocks resulting from this decomposition are then classified, which makes it possible, among other things, to detect advertising zones. Continuing and extending the classification work of the first part, we present a new generic classification and clustering engine that is fast and easy to use. This approach differs from the great majority of existing methods, which rely on prior knowledge about the data and depend on abstract parameters that are difficult for the user to determine. From colorimetric characterisation through advertisement detection to article tracking, all of the approaches presented were combined to build an application that classifies digitised press documents by content.
In this thesis, we deal with the processing and analysis of printed document images in order to automate the creation of press reviews. The scanner output images are processed without any prior knowledge or human intervention. Thus, to characterize them, we present a scalable analysis system for complex documents. This characterization is based on a hybrid color segmentation suited to noisy document images. The color analysis customizes text extraction algorithms to fit the local image properties. The provided color and text information is used to perform layout segmentation in press images and to compute features on the resulting blocks. These elements are classified to detect advertisements. In the second part of this thesis, we deal with a more general purpose: clustering and classification. We present a new clustering approach, named ACPP, which is completely automated, fast and easy to use. Its main features are that it requires neither prior knowledge about the data nor theoretical parameters to be determined by the user. Color analysis, layout segmentation and the ACPP classification method are combined to create a complete processing chain for press images.