597 research outputs found

COSMOS-7: Video-oriented MPEG-7 scheme for modelling and filtering of semantic content

Authors: Agius HW; Angelides MC
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2005

MPEG-7 prescribes a format for semantic content models for multimedia to ensure interoperability across a multitude of platforms and application domains. However, the standard leaves open how the models should be used and how their content should be filtered. Filtering is a technique used to retrieve only content relevant to user requirements, thereby reducing the content-sifting effort required of the user. This paper proposes an MPEG-7 scheme that can be deployed for semantic content modelling and filtering of digital video. The proposed scheme, COSMOS-7, produces rich and multi-faceted semantic content models and supports a content-based filtering approach that only analyses content relating directly to the preferred content requirements of the user.

CiteSeerX

Brunel University Research Archive

Video summarisation: A conceptual framework and survey of the state of the art

Authors: Arthur G. Money; Babaguchi; Boyatzis; Cernekova; Chang; Chang; Crockford; Dey; Dimitrova; Ekin; Ferman; Gianluigi; Hanjalic; Hanjalic; Harry Agius; Joffe; Kim; Lee; Lew; Li; Li; Lienhart; Ma; Moriyama; Ngo; Otsuka; Shih; Silverman; Taylor; Tjondronegoro; Tseng; Wang; Zhu
Publication venue: 'Elsevier BV'
Publication date: 01/02/2008

This is the post-print (final draft post-refereeing) version of the article. Copyright © 2007 Elsevier Inc. Video summaries provide condensed and succinct representations of the content of a video stream through a combination of still images, video segments, graphical representations and textual descriptors. This paper presents a conceptual framework for video summarisation derived from the research literature and used as a means for surveying the research literature. The framework distinguishes between video summarisation techniques (the methods used to process content from a source video stream to achieve a summarisation of that stream) and video summaries (outputs of video summarisation techniques). Video summarisation techniques are considered within three broad categories: internal (analyse information sourced directly from the video stream), external (analyse information not sourced directly from the video stream) and hybrid (analyse a combination of internal and external information). Video summaries are considered as a function of the type of content they are derived from (object, event, perception or feature based) and the functionality offered to the user for their consumption (interactive or static, personalised or generic). It is argued that video summarisation would benefit from greater incorporation of external information, particularly user-based information that is unobtrusively sourced, in order to overcome longstanding challenges such as bridging the semantic gap and providing video summaries that have greater relevance to individual users.

Crossref

Brunel University Research Archive

Temporal Cross-Media Retrieval with Soft-Smoothing

Authors: Andrew Galen; Benevenuto Fabricio; Blei David M.; He Kaiming; Herbrich Ralf; Hu D.; Ngiam Jiquan; Srivastava Nitish; Srivastava Nitish; Uricchio Tiberio; Wang L.; Yan F.; Zhan M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 10/10/2018

Multimedia information has strong temporal correlations that shape the way modalities co-occur over time. In this paper we study the dynamic nature of multimedia and social-media information, where the temporal dimension emerges as a strong source of evidence for learning the temporal correlations across visual and textual modalities. So far, cross-media retrieval models have explored the correlations between different modalities (e.g. text and image) to learn a common subspace, in which semantically similar instances lie in the same neighbourhood. Building on such knowledge, we propose a novel temporal cross-media neural architecture that departs from standard cross-media methods by explicitly accounting for the temporal dimension through temporal subspace learning. The model is softly constrained with temporal and inter-modality constraints that guide the new subspace learning task by favouring temporal correlations between semantically similar and temporally close instances. Experiments on three distinct datasets show that accounting for time is important for cross-media retrieval. Namely, the proposed method outperforms a set of baselines on the task of temporal cross-media retrieval, demonstrating its effectiveness for performing temporal subspace learning. Comment: To appear in ACM MM 201

arXiv.org e-Print Archive

Crossref

Toward a model of computational attention based on expressive behavior: applications to cultural heritage scenarios

Authors: Glowinski Donald; Maes Pieter-Jan; Mancas Matei; Volpe Gualtiero
Publication date: 01/01/2009

Our project goals were to develop attention-based analysis of human expressive behavior and to implement a real-time algorithm in EyesWeb XMI, in order to improve the naturalness of human-computer interaction and context-based monitoring of human behavior. To this aim, a perceptual model that mimics human attentional processes was developed for expressivity analysis and modeled by entropy. Museum scenarios were selected as an ecological test-bed to elaborate three experiments that focus on visitor profiling and visitor flow regulation.

Ghent University Academic Bibliography

ELVIS: Entertainment-led video summaries

Authors: Arthur G. Money; Babaguchi N.; Cacioppo J. T.; Damnjanovic U.; Furini M.; Greenwald M. K.; Harry Agius; Jaimes A.; Kim J.; Leonhardt S.; Millet C.; Money A. G.; Nasoz F.; Rikkard N. S.; Sebe N.; Shipman S.; Takahashi Y.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/08/2010

© ACM, 2010. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in ACM Transactions on Multimedia Computing, Communications, and Applications, 6(3): Article no. 17 (2010) http://doi.acm.org/10.1145/1823746.1823751 Video summaries present the user with a condensed and succinct representation of the content of a video stream. Usually this is achieved by attaching degrees of importance to low-level image, audio and text features. However, video content elicits strong and measurable physiological responses in the user, which are potentially rich indicators of what video content is memorable to or emotionally engaging for an individual user. This article proposes a technique that exploits such physiological responses to a given video stream by a given user to produce Entertainment-Led VIdeo Summaries (ELVIS). ELVIS is made up of five analysis phases which correspond to the analyses of five physiological response measures: electro-dermal response (EDR), heart rate (HR), blood volume pulse (BVP), respiration rate (RR), and respiration amplitude (RA). Through these analyses, the temporal locations of the most entertaining video subsegments, as they occur within the video stream as a whole, are automatically identified. The effectiveness of the ELVIS technique is verified through a statistical analysis of data collected during a set of user trials. Our results show that ELVIS is more consistent than RANDOM, EDR, HR, BVP, RR and RA selections in identifying the most entertaining video subsegments for content in the comedy, horror/comedy, and horror genres. Subjective user reports also reveal that ELVIS video summaries are comparatively easy to understand, enjoyable, and informative.

Crossref

Brunel University Research Archive

Predictive biometrics: A review and analysis of predicting personal characteristics from biometric data

Authors: Abreu M.C.C.; Abreu M.C.C.; Abreu M.C.D.C.; Chang T.‐Y.; Chen Y.‐L.; Dobry G.; Gao Y.; Geng X.; Giot R.; Idrus S.; Jain A.K.; Leon S.; Li C.; Li S.Z.; Likert R.; Livingstone S.R.; Lu H.; Matta F.; Mutalib S.; Pan L.; Pervouchine V.; Pisani P.H.; Proenca H.; Ricanek K.; Rodrigues R.N.; Roli F.; Santos O.C.; Schuller B.; Tapia J.E.; Teh P.S.; Wang Z.‐H.; Yan H.
Publication venue: 'Institution of Engineering and Technology (IET)'
Publication date: 16/11/2017

Interest in the exploitation of soft biometrics information has continued to develop over the last decade or so. In comparison with traditional biometrics, which focuses principally on person identification, the idea of soft biometrics processing is to study the utilisation of more general information regarding a system user, which is not necessarily unique. There are increasing indications that this type of data will have great value in providing complementary information for user authentication. However, the authors have also seen a growing interest in broadening the predictive capabilities of biometric data, encompassing both easily definable characteristics such as subject age and, most recently, 'higher level' characteristics such as emotional or mental states. This study will present a selective review of the predictive capabilities, in the widest sense, of biometric data processing, providing an analysis of the key issues still to be adequately addressed if this concept of predictive biometrics is to be fully exploited in the future.

Crossref

Sheffield Hallam University Research Archive