
3,141 research outputs found

Audio-visual football video analysis, from structure detection to attention analysis

Author: Ren Reede
Publication date: 01/01/2008

Sport video is an important video genre. Content-based sports video analysis attracts great interest from both industry and academia. A sports video is characterised by repetitive temporal structures, relatively plain contents, and strong spatio-temporal variations, such as quick camera switches and swift local motions. It is necessary to develop specific techniques for content-based sports video analysis to utilise these characteristics. For an efficient and effective sports video analysis system, there are three fundamental questions: (1) what are the key stories in sports videos; (2) what attracts viewers' interest; and (3) how to identify game highlights. This thesis is developed around these questions. We approached these questions from two different perspectives and in turn three research contributions are presented, namely, replay detection, attack temporal structure decomposition, and attention-based highlight identification. Replay segments convey the most important contents in sports videos. Detecting replay segments is therefore an efficient way to collect game highlights. However, replay is an artefact of editing, which improves with advances in video editing tools. The composition of a replay is complex, and includes logo transitions, slow motions, viewpoint switches and normal speed video clips. Since logo transition clips are pervasive in game collections of FIFA World Cup 2002, FIFA World Cup 2006 and UEFA Championship 2006, we take logo transition detection as an effective replacement of replay detection. A two-pass system was developed, including a five-layer AdaBoost classifier and logo template matching throughout an entire video. The five-layer AdaBoost classifier utilises shot duration, average game pitch ratio, average motion, sequential colour histogram and shot frequency between two neighbouring logo transitions to filter out logo transition candidates. Subsequently, a logo template is constructed and employed to find all transition logo sequences.
The precision and recall of this system in replay detection are 100% in a five-game evaluation collection. An attack structure is a team competition for a score. Hence, this structure is a conceptually fundamental unit of a football video as well as other sports videos. We review the literature of content-based temporal structures, such as play-break structure, and develop a three-step system for automatic attack structure decomposition. Four content-based shot classes, namely, play, focus, replay and break, were identified by low-level visual features. A four-state hidden Markov model was trained to simulate transition processes among these shot classes. Since attack structures are the longest repetitive temporal unit in a sports video, a suffix tree is proposed to find the longest repetitive substring in the label sequence of shot class transitions. The occurrences of this substring are regarded as a kernel of an attack hidden Markov process. Therefore, the decomposition of attack structure becomes a boundary likelihood comparison between two Markov chains. Highlights are what attract notice. Attention is a psychological measurement of “notice”. A brief survey of the psychological background of attention, attention estimation from visual and auditory signals, and multiple modality attention fusion is presented. We propose two attention models for sports video analysis, namely, the role-based attention model and the multiresolution autoregressive framework. The role-based attention model is based on the perception structure while watching video. This model removes reflection bias among modality salient signals and combines these signals by reflectors. The multiresolution autoregressive framework (MAR) treats salient signals as a group of smooth random processes, which follow a similar trend but are filled with noise. This framework tries to estimate a noiseless signal from these coarse noisy observations by a multiple resolution analysis.
Related algorithms are developed, such as event segmentation on a MAR tree and real-time event detection. The experiment shows that this attention-based approach can find goal events with high precision. Moreover, results of MAR-based highlight detection on the final games of FIFA 2002 and 2006 are highly similar to highlights professionally labelled by BBC and FIFA.
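
The abstract above describes finding the longest repetitive substring in a sequence of shot-class labels. The following is a minimal sketch of that idea only, not the thesis implementation: it swaps the suffix tree for a naive sorted-suffix comparison (simpler, but slower on long sequences), and the single-letter labels (P = play, F = focus, R = replay, B = break) and the sample sequence are hypothetical.

```python
def longest_repeated_substring(labels):
    """Return the longest substring that occurs at least twice.

    Naive stand-in for a suffix tree: sort all suffixes, then compare
    each pair of neighbours, since the longest repeat must be a common
    prefix of two adjacent suffixes in sorted order.
    """
    n = len(labels)
    # Suffix start positions in lexicographic order of their suffixes.
    order = sorted(range(n), key=lambda i: labels[i:])
    best = labels[:0]  # empty sequence of the same type as the input
    for a, b in zip(order, order[1:]):
        # Length of the common prefix of the two adjacent suffixes.
        k = 0
        while a + k < n and b + k < n and labels[a + k] == labels[b + k]:
            k += 1
        if k > len(best):
            best = labels[a:a + k]
    return best


# Hypothetical shot-label sequence: P=play, F=focus, R=replay, B=break.
shots = "PFRBPBPFRB"
kernel = longest_repeated_substring(shots)
print(kernel)
```

In this toy sequence the repeated unit is the play, focus, replay, break pattern, which is the kind of candidate the abstract treats as the kernel of an attack structure.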

Glasgow Theses Service

CiteSeerX

OpenGrey Repository

CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

Author: Boujemaa Nozha
Compañó Ramón
Dosch Christoph
Geurts Joost
Karlgren Jussi
King Paul
Kompatsiaris Yiannis
Köhler Joachim
Le Moine Jean-Yves
Ortgies Robert
Point Jean-Charles
Rotenberg Boris
Rudström Åsa
Sebe Nicu
Publication venue: Chorus Project Consortium
Publication date: 01/01/2007

Based on the information provided by European projects and national initiatives related to multimedia search, as well as domain experts who participated in the CHORUS Think-tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Classification of Team Behaviors in Sports Video Games

Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006

Crossref

Highly efficient low-level feature extraction for video representation and retrieval.

Author: Calie Janko
Publication venue: 'Queen Mary University of London'
Publication date: 01/01/2004

Witnessing the omnipresence of digital video media, the research community has raised the question of its meaningful use and management. Stored in immense multimedia databases, digital videos need to be retrieved and structured in an intelligent way, relying on the content and the rich semantics involved. Current Content Based Video Indexing and Retrieval systems face the problem of the semantic gap between the simplicity of the available visual features and the richness of user semantics. This work focuses on the issues of efficiency and scalability in video indexing and retrieval to facilitate a video representation model capable of semantic annotation. A highly efficient algorithm for temporal analysis and key-frame extraction is developed. It is based on the prediction information extracted directly from the compressed domain features and the robust scalable analysis in the temporal domain. Furthermore, a hierarchical quantisation of the colour features in the descriptor space is presented. Derived from the extracted set of low-level features, a video representation model that enables semantic annotation and contextual genre classification is designed. Results demonstrate the efficiency and robustness of the temporal analysis algorithm, which runs in real time while maintaining the high precision and recall of the detection task. Adaptive key-frame extraction and summarisation achieve a good overview of the visual content, while the colour quantisation algorithm efficiently creates a hierarchical set of descriptors. Finally, the video representation model, supported by the genre classification algorithm, achieves excellent results in an automatic annotation system by linking the video clips with a limited lexicon of related keywords.

Queen Mary Research Online

OpenGrey Repository

Personalised video retrieval: application of implicit feedback and semantic user profiles

Author: Hopfgartner Frank
Publication date: 01/01/2010

A challenging problem in the user profiling domain is to create profiles of users of retrieval systems. This problem is even exacerbated in the multimedia domain. Due to the Semantic Gap, the difference between low-level data representation of videos and the higher concepts users associate with videos, it is not trivial to understand the content of multimedia documents and to find other documents that the users might be interested in. A promising approach to ease this problem is to set multimedia documents into their semantic contexts. The semantic context can lead to a better understanding of the personal interests. Knowing the context of a video is useful for recommending users videos that match their information need. By exploiting these contexts, videos can also be linked to other, contextually related videos. From a user profiling point of view, these links can be of high value to recommend semantically related videos, hence creating a semantic-based user profile. This thesis introduces a semantic user profiling approach for news video retrieval, which exploits a generic ontology to put news stories into their context. Major challenges which inhibit the creation of such semantic user profiles are the identification of users' long-term interests and the adaptation of retrieval results based on these personal interests. Most personalisation services rely on users explicitly specifying preferences, a common approach in the text retrieval domain. By giving explicit feedback, users are forced to update their need, which can be problematic when their information need is vague. Furthermore, users tend not to provide enough feedback on which to base an adaptive retrieval algorithm. Deviating from the method of explicitly asking the user to rate the relevance of retrieval results, the use of implicit feedback techniques helps by learning user interests unobtrusively. The main advantage is that users are relieved from providing feedback.
A disadvantage is that information gathered using implicit techniques is less accurate than information based on explicit feedback. In this thesis, we focus on three main research questions. First of all, we study whether implicit relevance feedback, which is provided while interacting with a video retrieval system, can be employed to bridge the Semantic Gap. We therefore first identify implicit indicators of relevance by analysing representative video retrieval interfaces. Studying whether these indicators can be exploited as implicit feedback within short retrieval sessions, we recommend video documents based on implicit actions performed by a community of users. Secondly, implicit relevance feedback is studied as a potential source to build user profiles and hence to identify users' long-term interests in specific topics. This includes studying the identification of different aspects of interests and storing these interests in dynamic user profiles. Finally, we study how this feedback can be exploited to adapt retrieval results or to recommend related videos that match the users' interests. We analyse our research questions by performing both simulation-based and user-centred evaluation studies. The results suggest that implicit relevance feedback can be employed in the video domain and that semantic-based user profiles have the potential to improve video exploration.

Glasgow Theses Service

Enlighten

OpenGrey Repository

Behaviour Profiling using Wearable Sensors for Pervasive Healthcare

Author: Ali Syed Muhammad Raza
Publication venue: Computing, Imperial College London
Publication date: 01/02/2013

In recent years, sensor technology has advanced in terms of hardware sophistication and miniaturisation. This has led to the incorporation of unobtrusive, low-power sensors into networks centred on human participants, called Body Sensor Networks. Amongst the most important applications of these networks is their use in healthcare and healthy living. The technology has the potential to decrease the burden on healthcare systems by providing care at home, enabling early detection of symptoms, monitoring recovery remotely, and avoiding serious chronic illnesses by promoting healthy living through objective feedback. In this thesis, machine learning and data mining techniques are developed to estimate medically relevant parameters from a participant's activity and behaviour parameters, derived from simple, body-worn sensors. The first abstraction from raw sensor data is the recognition and analysis of activity. Machine learning analysis is applied to a study of activity profiling to detect impaired limb and torso mobility. One of the advances in this thesis to activity recognition research is in the application of machine learning to the analysis of 'transitional activities': transient activity that occurs as people change their activity. A framework is proposed for the detection and analysis of transitional activities. To demonstrate the utility of transition analysis, we apply the algorithms to a study of participants undergoing and recovering from surgery. We demonstrate that it is possible to see meaningful changes in the transitional activity as the participants recover. Assuming long-term monitoring, we expect a large historical database of activity to quickly accumulate. We develop algorithms to mine temporal associations to activity patterns. This gives an outline of the user's routine. Methods for visual and quantitative analysis of routine using this summary data structure are proposed and validated.
The activity and routine mining methodologies developed for specialised sensors are adapted to a smartphone application, enabling large-scale use. Validation of the algorithms is performed using datasets collected in laboratory settings and free-living scenarios. Finally, future research directions and potential improvements to the techniques developed in this thesis are outlined.

Spiral - Imperial College Digital Repository


User-centred car design and the role of feedback in driving

Author: Walker Guy Harrison
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2002

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. A survey of car manufacturers reveals an impressive list of upcoming technologies, the combined effect of which is likely to have a profound impact upon feedback to the driver. Feedback is information that the situation provides back to the driver and is specified with reference to content, source, and timing. Feedback quality is achieved when the information requirements of the task, derived from a new task analysis of driving, are matched to the sources, content, and timing of feedback provided by the environment and the vehicle. An exploratory on-road study begins by observing that better quality feedback is implicated in increasing drivers' situational awareness (even though drivers have little self-awareness of this fact) and in optimising mental workload. The exploratory level of analysis builds into the experimental, whereby a highly controlled simulator study replicates and builds upon these findings. Feedback is again seen to positively influence situational awareness, where changes in drivers' confidence ratings as to the presence or absence of feedback information in the simulation were observed, according to the modality of feedback presented. This was achieved with a probe recall paradigm, and using psychophysical techniques as a useful extension to the Situation Awareness Global Assessment Technique (SAGAT). Similarly, an analysis of mental workload via the NASA TLX self-report questionnaire demonstrates that a combination of visual, steering force feedback and auditory feedback gives rise to lower mental workload, lower driver frustration, and lower, though possibly more realistic, self-ratings of performance. This knowledge can be discussed with reference to a feedback framework of driving that provides the theoretical backdrop to the key psychological variables implicated in driving task performance.
Overall, the findings contribute to knowledge in terms of new and imaginative ways of designing future vehicle technologies in order to maximise safety, efficiency, and enjoyment. This research is funded by the Hamilton Research Studentship.

Brunel University Research Archive

Audio-coupled video content understanding of unconstrained video sequences

Author: Jose E.F.C. Lopes (7170161)
Publication date: 01/01/2011

Unconstrained video understanding is a difficult task. The main aim of this thesis is to recognise the nature of objects, activities and environment in a given video clip using both audio and video information. Traditionally, audio and video information have not been applied together for solving such a complex task, and for the first time we propose, develop, implement and test a new framework of multi-modal (audio and video) data analysis for context understanding and labelling of unconstrained videos. The framework relies on feature selection techniques and introduces a novel algorithm (PCFS) that is faster than the well-established SFFS algorithm. We use the framework for studying the benefits of combining audio and video information in a number of different problems. We begin by developing two independent content recognition modules. The first one is based on image sequence analysis alone, and uses a range of colour, shape, texture and statistical features from image regions with a trained classifier to recognise the identity of objects, activities and environment present. The second module uses audio information only, and recognises activities and environment. Both of these approaches are preceded by detailed pre-processing to ensure that correct video segments containing both audio and video content are present, and that the developed system can be made robust to changes in camera movement, illumination, random object behaviour, etc. For both audio and video analysis, we use a hierarchical approach of multi-stage classification such that difficult classification tasks can be decomposed into simpler and smaller tasks. When combining both modalities, we compare fusion techniques at different levels of integration and propose a novel algorithm that combines advantages of both feature and decision-level fusion. The analysis is evaluated on a large amount of test data comprising unconstrained videos collected for this work.
Finally, we propose a decision correction algorithm which shows that further steps towards combining multi-modal classification information effectively with semantic knowledge generate the best possible results.
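
The abstract above contrasts feature-level and decision-level fusion of audio and video. As a rough illustration of the two generic levels only (it does not reproduce the thesis's novel combined algorithm, which the abstract does not specify), the sketch below concatenates feature vectors for feature-level fusion and takes a weighted average of per-class scores for decision-level fusion. All function names, the weight parameter, and the sample values are hypothetical.

```python
def feature_level_fusion(audio_features, video_features):
    """Feature-level (early) fusion: join both modalities into one
    vector that a single classifier would then consume."""
    return list(audio_features) + list(video_features)


def decision_level_fusion(audio_scores, video_scores, audio_weight=0.5):
    """Decision-level (late) fusion: combine the per-class scores of two
    independent classifiers by a weighted average, then pick the best class.

    Both score dictionaries are assumed to share the same class labels.
    """
    fused = {
        label: audio_weight * audio_scores[label]
        + (1.0 - audio_weight) * video_scores[label]
        for label in audio_scores
    }
    best_label = max(fused, key=fused.get)
    return best_label, fused


# Hypothetical classifier outputs for one video segment.
audio_scores = {"indoor": 0.7, "outdoor": 0.3}
video_scores = {"indoor": 0.2, "outdoor": 0.8}

joint_vector = feature_level_fusion([0.1, 0.4], [0.9, 0.3, 0.5])
label, fused_scores = decision_level_fusion(audio_scores, video_scores)
print(joint_vector, label, fused_scores)
```

With equal weights the video classifier's stronger preference decides the outcome here; raising `audio_weight` shifts the decision towards the audio classifier.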

Loughborough University Institutional Repository

Physiology and neuroanatomy of emotional reactivity in frontotemporal dementia

Author: Marshall Charles R
Publication venue: UCL (University College London)
Publication date: 28/07/2018

ABSTRACT AND SUMMARY OF EXPERIMENTAL FINDINGS. The frontotemporal dementias (FTD) are a heterogeneous group of neurodegenerative diseases that cause variable profiles of fronto-insulo-temporal network disintegration. Loss of empathy and dysfunctional social interaction are leading features of FTD and major determinants of care burden, but remain poorly understood and difficult to measure with conventional neuropsychological instruments. Building on a large body of work in the healthy brain showing that embodied responses are important components of emotional responses and empathy, I performed a series of experiments to examine the extent to which the induction and decoding of somatic physiological responses to the emotions of others are degraded in FTD, and to define the underlying neuroanatomical changes responsible for these deficits. I systematically studied a range of modalities across the entire syndromic spectrum of FTD, including daily life emotional sensitivity, the cognitive categorisation of emotions, interoceptive accuracy, automatic facial mimicry, autonomic responses, and structural and functional neuroanatomy to deconstruct aberrant emotional reactivity in these diseases. My results provide proof of principle for the utility of physiological measures in deconstructing complex socioemotional symptoms and suggest that these warrant further investigation as clinical biomarkers in FTD. Chapter 3: Using a heartbeat counting task, I found that interoceptive accuracy is impaired in semantic variant primary progressive aphasia, but correlates with sensitivity to the emotions of others across FTD syndromes. Voxel based morphometry demonstrated that impaired interoceptive accuracy correlates with grey matter volume in anterior cingulate, insula and amygdala. Chapter 4: Using facial electromyography to index automatic imitation, I showed that mimicry of emotional facial expressions is impaired in the behavioural and right temporal variants of FTD.
Automatic imitation predicted correct identification of facial emotions in healthy controls and syndromes focussed on the frontal lobes and insula, but not in syndromes focussed on the temporal lobes, suggesting that automatic imitation aids emotion recognition only when social concepts and semantic stores are intact. Voxel based morphometry replicated previously identified neuroanatomical correlates of emotion identification ability, while automatic imitation was associated with grey matter volume in a visuomotor network including primary visual and motor cortices, visual motion area (MT/V5) and supplementary motor cortex. Chapter 5: By recording heart rate during viewing of facial emotions, I showed that the normal cardiac reactivity to emotion is impaired in FTD syndromes with fronto-insular atrophy (behavioural variant FTD and nonfluent variant primary progressive aphasia) but not in syndromes focussed on the temporal lobes (right temporal variant FTD and semantic variant primary progressive aphasia). Unlike automatic imitation, cardiac reactivity dissociated from emotion identification ability. Voxel based morphometry revealed grey matter correlates of cardiac reactivity in anterior cingulate, insula and orbitofrontal cortex. Chapter 6: Subjects viewed videos of facial emotions during fMRI scanning, with concomitant recording of heart rate and pupil size. I identified syndromic profiles of reduced activity in posterior face responsive regions including posterior superior temporal sulcus and fusiform face area. Emotion identification ability was predicted by activity in more anterior areas including anterior cingulate, insula, inferior frontal gyrus and temporal pole. Autonomic reactivity related to activity in both components of the central autonomic control network and regions responsible for processing the sensory properties of the stimuli

UCL Discovery


MC2: MPEG-7 content modelling communities

Author: Daylamani Zad Damon
Publication venue: Brunel University School of Engineering and Design PhD Theses
Publication date: 01/01/2013

This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The use of multimedia content on the web has grown significantly in recent years. Websites such as Facebook, YouTube and Flickr cater for enormous amounts of multimedia content uploaded by users. This vast amount of multimedia content requires comprehensive content modelling; otherwise, retrieving relevant content will be challenging. Modelling multimedia content can be an extremely time-consuming task that may seem impossible, particularly when undertaken by individual users. However, the advent of Web 2.0 and associated communities, such as YouTube and Flickr, has shown that users appear to be more willing to collaborate in order to take on enormous tasks such as multimedia content modelling. Harnessing the power of communities to achieve comprehensive content modelling is the primary focus of this research. The aim of this thesis is to explore collaborative multimedia content modelling and in particular the effectiveness of existing multimedia content modelling tools, taking into account the key development challenges of existing collaborative content modelling research and the associated modelling tools. Four research objectives are pursued in order to achieve this: first, design a user experiment to study users’ tagging behaviour with existing multimedia tagging tools and identify any relationships between such user behaviour; second, design and develop a framework for MPEG-7 content modelling communities based on the results of the experiment; third, implement an online service as a proof of concept of the framework; fourth, validate the framework through the online service during a repeat of the initial user experiment. This research contributes, first, a conceptual model of user behaviour visualised as a fuzzy cognitive map and, second, an MPEG-7 framework for multimedia content modelling communities (MC2) and its proof of concept as an online service.
The fuzzy cognitive model embodies relationships between user tagging behaviour and context and provides an understanding of user priorities in the description of content features and the relationships that exist between them. The MC2 framework, developed based on the fuzzy cognitive model, is deep-rooted in user content modelling behaviour and content preferences. A proof of concept of the MC2 framework is implemented as an online service in which all metadata is modelled using MPEG-7. The online service is validated, first, empirically with the same group of users and through the same experiment that led to the development of the fuzzy cognitive model and, second, functionally against the folksonomy and MPEG-7 content modelling tools used in the initial experiment. The validation demonstrates that MC2 has the advantages without the shortcomings of existing multimedia tagging tools by harnessing the ease of use of folksonomy tools while producing comprehensive structured metadata. Supported by the UK Engineering and Physical Sciences Research Council (EPSRC).

Brunel University Research Archive