CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tank meetings and workshops, this document reports on the state of the art in multimedia content search from a technical and socio-economic perspective.
The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines.
From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state of the art during the first year of CHORUS and establishing the existing landscape in multimedia search engines, we identified and analyzed gaps within the European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases, and socio-economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of the functional breakdown of a generic multimedia search engine, and secondly, representative use-case descriptions with a related discussion of requirements and technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to coordinators of EU projects as well as of national initiatives. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and "enablers", which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.
LifeLogging: personal big data
We have recently observed a convergence of technologies fostering the emergence of lifelogging as a mainstream activity. Computer storage has become significantly cheaper, and advances in sensing technology allow for the efficient sensing of personal activities, locations and the environment. This is best seen in the growing popularity of the quantified-self movement, in which life activities are tracked using wearable sensors in the hope of better understanding human performance in a variety of tasks. This review aims to provide a comprehensive summary of lifelogging, covering its research history, current technologies, and applications. Thus far, lifelogging research has focused predominantly on visual lifelogging in order to capture details of life activities, and we maintain this focus in this review. However, we also reflect on the challenges lifelogging poses to an information retrieval scientist. This review is a suitable reference for those seeking an information retrieval scientist's perspective on lifelogging and the quantified self.
An Advanced A-V Player to Support Scalable Personalised Interaction with Multi-Stream Video Content
Current Audio-Video (A-V) players are limited to pausing, resuming, selecting and viewing a single video stream of a live broadcast event that is orchestrated by a professional director. The main objective of this research is to investigate how to create a new custom-built interactive A-V player that enables viewers to personalise their own orchestrated views of live events from multiple simultaneous camera streams: by interacting with tracked moving objects, zooming in and out of targeted objects, and switching views based upon incidents detected in specific camera views. This involves the research and development of a personalisation framework to create and maintain user profiles that are acquired implicitly and explicitly, and modelling how this framework supports an evaluation of the effectiveness and usability of personalisation.
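To make the idea of incident-driven, profile-based view switching concrete, here is a minimal Python sketch. It is illustrative only: the names (UserProfile, select_stream), the incident fields, and the 0.7/0.3 blend of explicit and implicit preferences are assumptions, not details taken from the thesis.

    from dataclasses import dataclass, field

    @dataclass
    class UserProfile:
        """Per-viewer preferences, acquired explicitly (settings)
        and implicitly (observed viewing behaviour)."""
        explicit_weights: dict = field(default_factory=dict)
        implicit_weights: dict = field(default_factory=dict)

        def interest(self, obj_id: str) -> float:
            # Blend explicit and implicit evidence; the 0.7/0.3
            # split is an arbitrary choice for this sketch.
            return 0.7 * self.explicit_weights.get(obj_id, 0.0) \
                 + 0.3 * self.implicit_weights.get(obj_id, 0.0)

    def select_stream(profile: UserProfile, incidents: list) -> str:
        """Pick the camera stream whose detected incident involves
        the tracked object the viewer cares most about; fall back
        to the director's cut when no incident is reported."""
        best = max(incidents,
                   key=lambda i: profile.interest(i["object_id"]),
                   default=None)
        return best["stream_id"] if best else "director_cut"

    # Example: a tracked player triggers an incident on camera 2.
    profile = UserProfile(explicit_weights={"player_23": 1.0})
    incidents = [{"stream_id": "cam2", "object_id": "player_23"},
                 {"stream_id": "cam5", "object_id": "player_07"}]
    print(select_stream(profile, incidents))  # -> "cam2"

In a real player the implicit weights would be updated continuously from interactions such as dwell time on zoomed objects, which is what makes the profile acquisition "implicit".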
Personalisation is considered from both an application-oriented and a quality-supervision-oriented perspective within the proposed framework. Personalisation models can be individually or collaboratively linked with specific personalisation usage scenarios. The quality of different personalised interactions, in terms of explicit evaluative metrics such as scalability and consistency, can be monitored and measured using specific evaluation mechanisms.
This work was supported by the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement No. ICT-215248 and by Queen Mary University of London.
Bimodal Audiovisual Perception in Interactive Application Systems of Moderate Complexity
This dissertation deals with aspects of quality perception in interactive audiovisual application systems of moderate complexity, as defined, for example, in the MPEG-4 standard. Because the available computing power in these systems is limited, it is crucial to know which factors influence the perceived quality; only then can the available computing power be distributed in the most effective and efficient way for the simulation and display of audiovisual 3D scenes. Whereas the quality factors for unimodal auditory and visual stimuli are well known, and respective models of perception have been successfully devised based on this knowledge, this is not true for bimodal audiovisual perception. For the latter, it is only known that some kind of interdependency between auditory and visual perception exists; the exact mechanisms of human audiovisual perception have not been described. It is assumed that interaction with an application or scene has a major influence upon the perceived overall quality.
The goal of this work was to devise a system capable of performing subjective audiovisual assessments in the given context in a largely automated way. By applying the system, first evidence regarding audiovisual interdependency and the influence of interaction upon perception was to be collected. This work therefore comprised three fields of activity: the creation of a test bench based on the available, but (regarding its audio functionality) somewhat restricted, MPEG-4 player; the study of methods and framework requirements that ensure the comparability and reproducibility of audiovisual assessments and their results; and the execution of a series of coordinated experiments, including the analysis and interpretation of the collected data. An object-based, modular audio rendering engine was co-designed and co-implemented that performs simple room-acoustic simulations in real time, based on the MPEG-4 scene description paradigm. Apart from the MPEG-4 player, the test bench consists of a haptic input device used by test subjects to enter their quality ratings, and a logging tool that records all relevant events during an assessment session. The collected data can be conveniently exported for further analysis using appropriate statistics tools.
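The dissertation abstract does not detail the logging tool, so the following is a minimal Python sketch of the kind of event journal and export it describes; the class and method names (AssessmentLogger, export_csv) are hypothetical.

    import csv
    import time

    class AssessmentLogger:
        """Journal of relevant events during an assessment session:
        stimulus changes, subject interactions, quality ratings."""
        def __init__(self):
            self.events = []  # (timestamp, event_type, payload)

        def log(self, event_type: str, payload: str) -> None:
            self.events.append((time.time(), event_type, payload))

        def export_csv(self, path: str) -> None:
            # Flat export that standard statistics tools can read.
            with open(path, "w", newline="") as f:
                writer = csv.writer(f)
                writer.writerow(["timestamp", "event", "payload"])
                writer.writerows(self.events)

    # Example session: a scene starts, the subject interacts,
    # then enters a quality rating via the input device.
    log = AssessmentLogger()
    log.log("scene_start", "scene_07")
    log.log("interaction", "audio_source_moved")
    log.log("rating", "mos=4")
    log.export_csv("session_01.csv")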
A thorough analysis of the well-established test methods and recommendations for unimodal subjective assessments was performed to find out whether a transfer to the bimodal audiovisual case is easily possible. It became evident that, due to the limited knowledge about the underlying perceptual processes, a novel categorization of experiments according to their goals could help organize research in the field. Furthermore, a number of influencing factors were identified that exercise control over bimodal perception in the given context.
By performing the perceptual experiments with the devised system, its functionality and ease of use were verified. Beyond that, some first indications of the role of interaction in perceived overall quality were collected: interaction in the auditory modality reduces a human's ability to rate audio quality correctly, whereas visually based (cross-modal) interaction does not necessarily produce this effect.
Lifelog access modelling using MemoryMesh
Recently, we have observed a convergence of technologies that has led to the emergence of lifelogging as a technology for personal data applications. Lifelogging will become ubiquitous in the near future, not just for memory enhancement and health management, but also in various other domains. While many devices are available for gathering massive amounts of lifelog data, modelling large volumes of multi-modal lifelog data remains a challenge. In this thesis, we explore and address the problem of how to model lifelogs in order to make personal lifelogs more accessible to users, from the perspectives of collection, organization and visualization. To subdivide our research targets, we designed and followed these steps:
1. Lifelog activity recognition. We use data from multiple sensors to analyse various daily-life activities, ranging from accelerometer data collected by mobile phones to images captured by wearable cameras. We propose a semantic, density-based algorithm to cope with concept selection issues for lifelogging sensory data.
2. Visual discovery of lifelog images. Most of the lifelog information we capture every day is in the form of images, so images contain significant information about our lives. Here we conduct experiments on visual content analysis of lifelog images, covering both image content and image metadata.
3. Linkage analysis of lifelogs. By exploring linkage analysis of lifelog data, we can connect all lifelog images using linkage models into a concept called the MemoryMesh (a sketch of this idea follows the list). The thesis includes experimental evaluations using real-life data collected from multiple users and shows the performance of our algorithms in detecting the semantics of daily-life concepts, and their effectiveness in activity recognition and lifelog retrieval.
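The abstract does not spell out the linkage model, so the following Python sketch only illustrates the general idea: images become nodes of the MemoryMesh and are linked when they share detected concepts. The function name (build_memory_mesh) and the toy data are assumptions for illustration.

    from itertools import combinations
    from collections import defaultdict

    def build_memory_mesh(images: dict) -> dict:
        """Link lifelog images that share detected concepts.

        `images` maps image_id -> set of concept labels; the result
        is an adjacency map weighted by the number of shared concepts."""
        mesh = defaultdict(dict)
        for a, b in combinations(images, 2):
            shared = images[a] & images[b]
            if shared:
                mesh[a][b] = mesh[b][a] = len(shared)
        return mesh

    # Toy example: three images from one day, each annotated with
    # concepts produced by an activity-recognition step.
    images = {
        "img_001": {"office", "screen", "coffee"},
        "img_002": {"office", "meeting"},
        "img_003": {"park", "coffee"},
    }
    mesh = build_memory_mesh(images)
    print(mesh["img_001"])  # -> {'img_002': 1, 'img_003': 1}

Traversing such a weighted graph from any image would then support the kind of associative lifelog retrieval the thesis targets.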
- …