Search CORE

Real-time event classification in field sport videos

Author: Kapela Rafal
Kolanowski Krzysztof
O'Connor Noel E.
Rybarczyk Andrzej
Świetlicka Aleksandra
Publication venue: Elsevier BV
Publication date: 01/07/2015

The paper presents a novel approach to real-time event detection in sports broadcasts. We show that the same underlying audio-visual feature extraction algorithm, based on new global image descriptors, is robust across a range of different sports, alleviating the need to tailor it to a particular sport. In addition, we propose and evaluate three different classifiers for detecting events using these features: a feed-forward neural network, an Elman neural network and a decision tree. Each is investigated and evaluated in terms of its usefulness for real-time event classification. We also propose a ground truth dataset, together with an annotation technique, for evaluating the performance of each classifier, which may be useful to others interested in this problem.

Crossref

Irish Universities

DCU Online Research Access Service

Dialogue scene detection in movies using low and mid-level visual features

Author: Lehane Bart
Murphy Noel
O'Connor Noel E.
Publication date: 01/10/2004

This paper describes an approach for detecting dialogue scenes in movies. The approach uses automatically extracted low- and mid-level visual features that characterise the visual content of individual shots, which are then combined using a state transition machine that models the shot-level temporal characteristics of the scene under investigation. The choice of visual features is motivated by a consideration of formal film syntax. The system is designed so that the analysis may be applied to detect different types of scenes, although in this paper we focus on dialogue sequences, as these are the most prevalent scenes in the movies considered to date.
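
As an illustrative sketch only (not the authors' implementation), the idea of combining per-shot features with a state transition machine might look like the following in Python. The `Shot` fields, the `is_dialogue_like` rule, and the `min_shots` and `max_gap` thresholds are all assumptions made for this example; the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Shot:
    has_face: bool       # hypothetical mid-level feature: a face is visible
    static_camera: bool  # hypothetical low-level feature: little camera motion

def is_dialogue_like(shot: Shot) -> bool:
    # Assumed rule: a static shot showing a face looks like dialogue.
    return shot.has_face and shot.static_camera

def detect_dialogue_scenes(shots: List[Shot],
                           min_shots: int = 3,
                           max_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (first_shot, last_shot) index pairs for detected scenes."""
    OUTSIDE, INSIDE = "outside", "inside"
    state = OUTSIDE
    start = last_hit = hits = gap = 0
    scenes: List[Tuple[int, int]] = []
    for i, shot in enumerate(shots):
        hit = is_dialogue_like(shot)
        if state == OUTSIDE:
            if hit:
                # Transition OUTSIDE -> INSIDE on the first matching shot.
                state, start, last_hit, hits, gap = INSIDE, i, i, 1, 0
        elif hit:
            # Stay INSIDE and extend the candidate scene.
            last_hit, hits, gap = i, hits + 1, 0
        else:
            # Tolerate up to max_gap consecutive non-matching shots.
            gap += 1
            if gap > max_gap:
                if hits >= min_shots:
                    scenes.append((start, last_hit))
                state = OUTSIDE  # Transition INSIDE -> OUTSIDE.
    # Close a scene that is still open when the shot list ends.
    if state == INSIDE and hits >= min_shots:
        scenes.append((start, last_hit))
    return scenes

# Example input: True marks a dialogue-like shot, False anything else.
pattern = [False, True, True, False, True, True, False, False, False, True]
example_shots = [Shot(has_face=p, static_camera=p) for p in pattern]
example_scenes = detect_dialogue_scenes(example_shots)
```

Swapping `is_dialogue_like` for a different per-shot rule is one way such a machine could be pointed at other scene types, which is the kind of reuse the abstract describes.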

Irish Universities

DCU Online Research Access Service

CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

Author: Boujemaa Nozha
Compañó Ramón
Dosch Christoph
Geurts Joost
Karlgren Jussi
King Paul
Kompatsiaris Yiannis
Köhler Joachim
Le Moine Jean-Yves
Ortgies Robert
Point Jean-Charles
Rotenberg Boris
Rudström Åsa
Sebe Nicu
Publication venue: Chorus Project Consortium
Publication date: 01/01/2007

Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Multimedia Retrieval

Publication venue: Springer
Publication date: 01/01/2007

University of Twente Research Information

Personalized retrieval of sports video

Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/01/2007

Crossref

BCS SGAI SMA 2013: the BCS SGAI workshop on social media analysis

Publication venue: M. Jeusfeld
Publication date: 01/01/2013

Portsmouth University Research Portal (Pure)

Enhancing fan experience during live sports broadcasts through second screen applications

Author: Centieiro Pedro Miguel da Fonseca
Publication date: 01/03/2015

When sports fans attend live sports events, they usually engage in social experiences with friends, family members and other fans at the venue who share the same affiliation. However, fans watching the same event through a live television broadcast do not feel as emotionally connected with the athletes and other fans as they would if they were watching it live, together with thousands of other fans. With this in mind, we seek to create mobile applications that deliver engaging social experiences for remote fans watching live broadcast sports events. Taking into account the growing use of mobile devices while watching TV broadcasts, these mobile applications explore the second screen concept, which allows users to interact with content that complements the TV broadcast. Within this context, we present a set of second screen application prototypes developed to test our concepts, the corresponding user studies and results, and suggestions on how to apply the prototypes' concepts not only in different sports, but also during TV shows and electronic sports. Finally, we present the challenges we faced and the guidelines we followed during the development and evaluation phases, which may contribute to the development of future second screen applications for live broadcast events.

Repositório da Universidade Nova de Lisboa

Integrated analysis of audiovisual signals and external information sources for event detection in team sports video

Author: XU HUAXIN
Publication date: 28/04/2008

Ph.D. (Doctor of Philosophy)

ScholarBank@NUS

Grounding language in events

Author: Fleischman Michael Ben
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2008

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 137-142). Broadcast video and virtual environments are just two of the growing number of domains in which language is embedded in multiple modalities of rich non-linguistic information. Applications for such multimodal domains are often based on traditional natural language processing techniques that ignore the connection between words and the non-linguistic context in which they are used. This thesis describes a methodology for representing these connections in models which ground the meaning of words in representations of events. Incorporating these grounded language models with text-based techniques significantly improves the performance of three multimodal applications: natural language understanding in videogames, sports video search and automatic speech recognition. Two approaches to representing the structure of events are presented and used to model the meaning of words. In the domain of virtual game worlds, a hand-designed hierarchical behavior grammar is used to explicitly represent all the various actions that an agent can take in a virtual world. This grammar is used to interpret events by parsing sequences of observed actions in order to generate hierarchical event structures. In the noisier and more open-ended domain of broadcast sports video, hierarchical temporal patterns are automatically mined from large corpora of unlabeled video data. The structure of events in video is represented by vectors of these hierarchical patterns. Grounded language models are encoded using Hierarchical Bayesian models to represent the probability of words given elements of these event structures. These grounded language models are used to incorporate non-linguistic information into text-based approaches to multimodal applications. In the virtual game domain, this non-linguistic information improves natural language understanding for a virtual agent by nearly 10% and cuts in half the negative effects of noise caused by automatic speech recognition. For broadcast video of baseball and American football, video search systems that incorporate grounded language models are shown to perform up to 33% better than text-based systems. Further, systems for recognizing speech in baseball video that use grounded language models show 25% greater word accuracy than traditional systems. by Michael Ben Fleischman. Ph.D.

CiteSeerX

DSpace@MIT