658 research outputs found

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmarking initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we take stock of the impact and legal consequences of these technical advances and point out future directions of research.

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    After addressing the state of the art during the first year of CHORUS and establishing the existing landscape of multimedia search engines, we identified and analyzed gaps in the European research effort during our second year. In this period we focused on three directions: technological issues, user-centred issues and use cases, and socio-economic and legal aspects. These were assessed through two central studies: first, a concerted vision of the functional breakdown of a generic multimedia search engine, and second, representative use-case descriptions with a related discussion of the requirements they impose as technological challenges. Both studies were carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations at international conferences, and surveys addressed to EU project coordinators as well as national initiative coordinators. Based on the feedback obtained, we identified two types of gaps: core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges but have an impact on innovation progress. New socio-economic trends are presented, as well as emerging legal challenges.

    Semantic Labeling of Multimedia Content Clusters


    A multimodal framework for interactive sonification and sound-based communication


    LifeLogging: personal big data

    We have recently observed a convergence of technologies fostering the emergence of lifelogging as a mainstream activity. Computer storage has become significantly cheaper, and advances in sensing technology allow for the efficient sensing of personal activities, locations and the environment. This is best seen in the growing popularity of the quantified-self movement, in which life activities are tracked using wearable sensors in the hope of better understanding human performance in a variety of tasks. This review aims to provide a comprehensive summary of lifelogging, covering its research history, current technologies, and applications. Thus far, most lifelogging research has focused predominantly on visual lifelogging to capture details of life activities, so we maintain this focus in this review. However, we also reflect on the challenges lifelogging poses to an information retrieval scientist. This review is a suitable reference for those seeking an information retrieval scientist's perspective on lifelogging and the quantified self.

    An Advanced A-V Player to Support Scalable Personalised Interaction with Multi-Stream Video Content

    PhD thesis. Current Audio-Video (A-V) players are limited to pausing, resuming, selecting and viewing a single video stream of a live broadcast event that is orchestrated by a professional director. The main objective of this research is to investigate how to create a new custom-built interactive A-V player that enables viewers to personalise their own orchestrated views of live events from multiple simultaneous camera streams: by interacting with tracked moving objects, zooming in and out of targeted objects, and switching views based upon detected incidents in specific camera views. This involves research and development of a personalisation framework that creates and maintains user profiles, acquired both implicitly and explicitly, and modelling how this framework supports an evaluation of the effectiveness and usability of personalisation. Personalisation is considered from both an application-oriented and a quality-supervision-oriented perspective within the proposed framework. Personalisation models can be individually or collaboratively linked with specific personalisation usage scenarios, and the quality of different personalised interactions, in terms of explicit evaluative metrics such as scalability and consistency, can be monitored and measured using specific evaluation mechanisms. This work was funded by the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement No. ICT-215248 and by Queen Mary University of London.
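    To make the profile-driven view switching described above concrete, the following is a minimal Python sketch of how detected incidents and a user profile could jointly select a camera stream. All names (Incident, UserProfile, choose_stream) and the scoring rule are illustrative assumptions, not the thesis's actual framework.

        # Hypothetical sketch of incident-driven, profile-aware stream selection.
        from dataclasses import dataclass, field

        @dataclass
        class Incident:
            camera_id: int   # camera stream in which the incident was detected
            object_id: str   # tracked moving object involved (e.g. a player)
            salience: float  # detector confidence / importance score

        @dataclass
        class UserProfile:
            followed_objects: set = field(default_factory=set)  # explicit preferences
            view_counts: dict = field(default_factory=dict)     # implicit: views per camera

            def record_view(self, camera_id: int) -> None:
                """Implicitly update the profile each time the viewer switches streams."""
                self.view_counts[camera_id] = self.view_counts.get(camera_id, 0) + 1

        def choose_stream(profile: UserProfile, incidents: list, current: int) -> int:
            """Switch to the stream whose incident best matches the profile, else stay put."""
            def score(inc: Incident) -> float:
                bonus = 1.0 if inc.object_id in profile.followed_objects else 0.0
                habit = 0.1 * profile.view_counts.get(inc.camera_id, 0)
                return inc.salience + bonus + habit
            best = max(incidents, key=score, default=None)
            return best.camera_id if best else current

        profile = UserProfile(followed_objects={"player_7"})
        profile.record_view(2)
        incidents = [Incident(1, "ball", 0.6), Incident(2, "player_7", 0.5)]
        print(choose_stream(profile, incidents, current=1))  # -> 2: followed object wins

    In this sketch the profile is updated implicitly from viewing behaviour and explicitly from followed objects, mirroring the implicit/explicit profile acquisition the abstract describes.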

    Bimodal Audiovisual Perception in Interactive Application Systems of Moderate Complexity

    The dissertation at hand deals with aspects of quality perception in interactive audiovisual application systems of moderate complexity, as defined, for example, in the MPEG-4 standard. Because the available computing power in these systems is limited, it is decisive to know which factors influence perceived quality; only then can the computing power be distributed in the most effective and efficient way for the simulation and display of audiovisual 3D scenes. Whereas quality factors for unimodal auditory and visual stimuli are well known and respective models of perception have been successfully devised from this knowledge, this is not true for bimodal audiovisual perception. For the latter, it is only known that some kind of interdependency between auditory and visual perception exists; the exact mechanisms of human audiovisual perception have not been described. It is assumed that interaction with an application or scene has a major influence on perceived overall quality. The goal of this work was to devise a system capable of performing subjective audiovisual assessments in the given context in a largely automated way and, by applying the system, to collect first evidence regarding audiovisual interdependency and the influence of interaction on perception. The work therefore comprised three fields of activity: the creation of a test bench based on the available, but (regarding audio functionality) somewhat restricted, MPEG-4 player; the study of methods and framework requirements that ensure comparability and reproducibility of audiovisual assessments and results; and the performance of a series of coordinated experiments, including the analysis and interpretation of the collected data. An object-based, modular audio rendering engine was co-designed and co-implemented that performs simple room-acoustic simulations in real time based on the MPEG-4 scene-description paradigm. Apart from the MPEG-4 player, the test bench consists of a haptic input device used by test subjects to enter their quality ratings and a logging tool that journalizes all relevant events during an assessment session; the collected data can be conveniently exported for further analysis with appropriate statistics tools. A thorough analysis of the well-established test methods and recommendations for unimodal subjective assessments was performed to find out whether a transfer to the bimodal audiovisual case is straightforward. It became evident that, due to the limited knowledge about the underlying perceptual processes, a novel categorization of experiments according to their goals could help organize research in the field. Furthermore, a number of influencing factors were identified that govern bimodal perception in the given context. By performing the perceptual experiments with the devised system, its functionality and ease of use were verified. Beyond that, first indications of the role of interaction in perceived overall quality were collected: interaction in the auditory modality reduces a human's ability to rate audio quality correctly, whereas visually based (cross-modal) interaction does not necessarily produce this effect.
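    As an illustration of the test-bench component that journalizes assessment events and exports them for statistical analysis, here is a minimal Python sketch. The class name, event vocabulary and CSV layout are assumptions made for illustration; the dissertation's actual logging tool is not specified here.

        # Hypothetical sketch of an assessment-session event logger with CSV export.
        import csv
        import time

        class SessionLogger:
            def __init__(self, subject_id: str):
                self.subject_id = subject_id
                self.events = []  # list of (timestamp, event kind, detail) tuples

            def log(self, kind: str, detail: str = "") -> None:
                """Journalize one event, e.g. a scene change or a quality rating."""
                self.events.append((time.time(), kind, detail))

            def export_csv(self, path: str) -> None:
                """Write the journal in a flat format that statistics tools can import."""
                with open(path, "w", newline="") as f:
                    writer = csv.writer(f)
                    writer.writerow(["subject", "timestamp", "event", "detail"])
                    for ts, kind, detail in self.events:
                        writer.writerow([self.subject_id, f"{ts:.3f}", kind, detail])

        # Example session: the subject rates audio quality while a scene plays.
        logger = SessionLogger("subject-01")
        logger.log("scene_start", "room_acoustics_demo")
        logger.log("rating", "audio_quality=7")
        logger.log("scene_end")
        logger.export_csv("session01.csv")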

    Lifelog access modelling using MemoryMesh

    Very recently we have observed a convergence of technologies that has led to the emergence of lifelogging as a technology for personal data applications. Lifelogging will become ubiquitous in the near future, not just for memory enhancement and health management, but also in various other domains. While many devices are available for gathering massive amounts of lifelog data, modelling large volumes of multi-modal lifelog data remains a challenge. In this thesis, we explore and address the problem of how to model lifelogs so as to make them more accessible to users, from the perspectives of collection, organization and visualization. To subdivide our research targets, we designed and followed three steps:
    1. Lifelog activity recognition. We use multiple sensor data, ranging from accelerometer data collected by mobile phones to images captured by wearable cameras, to analyse various daily-life activities. We propose a semantic, density-based algorithm to cope with concept-selection issues for lifelogging sensory data.
    2. Visual discovery of lifelog images. Most of the lifelog information we capture every day is in the form of images, so images contain significant information about our lives. We conduct experiments on visual content analysis of lifelog images, covering both image content and image metadata.
    3. Linkage analysis of lifelogs. By exploring linkage analysis of lifelog data, we can connect all lifelog images through linkage models into a concept called the MemoryMesh; a sketch of this linkage idea follows the abstract.
    The thesis includes experimental evaluations using real-life data collected from multiple users and shows the performance of our algorithms in detecting the semantics of daily-life concepts, and their effectiveness in activity recognition and lifelog retrieval.
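    The linkage idea behind the MemoryMesh can be illustrated with a small Python sketch: images annotated with semantic concepts are linked whenever they share enough concepts, yielding a navigable graph. The overlap criterion and all names below are illustrative assumptions rather than the thesis's actual linkage model.

        # Hypothetical sketch: link lifelog images that share semantic concepts.
        from itertools import combinations

        def build_memory_mesh(annotations: dict, min_shared: int = 2) -> dict:
            """Link every pair of images that share at least `min_shared` concepts."""
            mesh = {img: set() for img in annotations}
            for a, b in combinations(annotations, 2):
                if len(annotations[a] & annotations[b]) >= min_shared:
                    mesh[a].add(b)
                    mesh[b].add(a)
            return mesh

        # Toy lifelog: three images with detected daily-life concepts.
        images = {
            "img_001.jpg": {"kitchen", "coffee", "morning"},
            "img_002.jpg": {"kitchen", "coffee", "laptop"},
            "img_003.jpg": {"office", "laptop", "meeting"},
        }
        print(build_memory_mesh(images))  # img_001 and img_002 share two concepts

    Retrieval over such a mesh then reduces to graph traversal: starting from one remembered image, a user can browse to semantically linked moments of the same day or activity.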