Video Analysis Tools for Annotating User-Generated Content from Social Events
In this presentation we describe how low-level metadata extraction tools have been applied in the context of the pan-European project Together Anywhere, Together Anytime (TA2). The TA2 project studies new forms of computer-mediated social communication between spatially and temporally distant people. In particular, we concentrate on automatic video analysis tools in an asynchronous community-based video-sharing environment called MyVideos, in which users can experience and share personalized music concert videos within their social group.
Automatic Time Skew Detection and Correction
In this paper, we propose a new approach to automatic time skew detection and correction for multisource audiovisual data recorded by different cameras and recorders during the same event. All recorded data are tested for potential time skew and corrected based on ASR-related features. The core of the algorithm is perceptual time-quefrency analysis with a precision of 10 ms. The results show correct time skew detection and elimination in 100% of cases on a real-life dataset of 32 broken sessions, and surpass the performance of fast cross-correlation while keeping lower system requirements.
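The skew-estimation idea can be sketched as follows. This is a minimal illustration only: it cross-correlates per-frame energy envelopes at 10 ms resolution, whereas the paper's method relies on perceptual, ASR-related features; all function names are invented for the sketch.

```python
# Sketch: estimate the time skew between two recordings of the same event
# by cross-correlating their per-frame energy envelopes (10 ms frames).
# Frame energy is a stand-in for the ASR-related features used in the paper.

def frame_energies(samples, sr, frame_ms=10):
    """Sum of squared samples per fixed-length frame (10 ms by default)."""
    hop = int(sr * frame_ms / 1000)
    return [sum(x * x for x in samples[i:i + hop])
            for i in range(0, len(samples) - hop + 1, hop)]

def estimate_skew_frames(ref, other, max_lag=100):
    """Lag (in frames) maximising the cross-correlation of two feature
    sequences; a positive lag means `other` started later than `ref`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            pairs = zip(ref[lag:], other)
        else:
            pairs = zip(ref, other[-lag:])
        score = sum(r * o for r, o in pairs)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

With 10 ms frames, the detected skew in milliseconds is simply `lag * 10`; a real implementation would use perceptual features rather than raw energy to stay robust to gain differences between devices.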
Social Focus of Attention as a Time Function Derived from Multimodal Signals
In this paper, we present the results of a study on the social focus of attention as a time function derived from multisource multimodal signals recorded by different personal capturing devices during social events. The core of the approach is the fission and fusion of multichannel audio, video and social modalities to derive the social focus of attention. The results achieved to date on more than 16 hours of real-life data demonstrate the feasibility of the approach.
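One simple way to fuse per-modality evidence into a focus-of-attention track is a weighted score combination per time frame. The sketch below is purely illustrative; the modality names, weights and score format are assumptions, and the paper's actual fission/fusion scheme is not specified here.

```python
# Hypothetical sketch: combine per-modality attention scores into a single
# social focus-of-attention time function by weighted late fusion.

def fuse_focus(scores_by_modality, weights):
    """scores_by_modality maps a modality name to a list of per-frame
    dicts {target: score}; returns, per frame, the target whose weighted
    combined score is highest."""
    n_frames = len(next(iter(scores_by_modality.values())))
    focus = []
    for t in range(n_frames):
        combined = {}
        for modality, frames in scores_by_modality.items():
            w = weights[modality]
            for target, s in frames[t].items():
                combined[target] = combined.get(target, 0.0) + w * s
        focus.append(max(combined, key=combined.get))
    return focus
```

For example, with audio weighted 0.6 and video 0.4, a frame where audio strongly favours one participant can still be overridden by sufficiently confident visual evidence.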
Socially-Aware Multimedia Authoring
Bulterman, D.C.A. [Promotor]; Cesar, P.S. [Copromotor]
Automatic Temporal Alignment of AV Data with Confidence Estimation
In this paper, we propose a new approach to automatic audio-based temporal alignment, with confidence estimation, of audiovisual data recorded by different cameras, camcorders or mobile phones during social events. All recorded data are temporally aligned with a common master track, recorded by a reference camera, based on ASR-related features, and the corresponding alignment confidence is estimated. The core of the algorithm is perceptual time-frequency analysis with a precision of 10 ms. The results show correct alignment in 99% of cases on a real-life dataset and surpass the performance of cross-correlation while keeping lower system requirements. Index Terms: time-frequency analysis, time synchronisation, pattern matching, reliability estimation
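A common way to attach a confidence to such an alignment is to compare the best correlation peak against the best competing peak; the sketch below does this over per-frame feature sequences. It is an assumption-laden stand-in: the paper's actual features and confidence estimator are not reproduced here, and the guard-window heuristic is invented for illustration.

```python
# Sketch: align a track against a master feature sequence and report a
# confidence equal to the ratio of the best correlation score to the best
# score found outside a small guard window around the winning lag.

def align_with_confidence(ref, track, max_lag=100, guard=2):
    """Return (best_lag, confidence) for aligning `track` to `ref`.
    A sharp, isolated correlation peak yields high confidence."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            pairs = zip(ref[lag:], track)
        else:
            pairs = zip(ref, track[-lag:])
        scores[lag] = sum(r * t for r, t in pairs)
    best_lag = max(scores, key=scores.get)
    runner_up = max(s for lag, s in scores.items()
                    if abs(lag - best_lag) > guard)
    if runner_up <= 0:
        return best_lag, float("inf")
    return best_lag, scores[best_lag] / runner_up
```

Alignments whose confidence falls below a threshold could then be flagged for manual review rather than applied blindly, which is the practical point of estimating confidence at all.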