171 research outputs found
The MediaEval 2016 Emotional Impact of Movies Task
Volume: 1739. Host publication title: MediaEval 2016 Multimedia Benchmark Workshop. Host publication sub-title: Working Notes Proceedings of the MediaEval 2016 Workshop. Non peer reviewed.
MediaEval 2016 Predicting Media Interestingness Task
Volume: 1739. Host publication title: MediaEval 2016 Multimedia Benchmark Workshop. Host publication sub-title: Working Notes Proceedings of the MediaEval 2016 Workshop. Non peer reviewed.
Exploiting multimedia in creating and analysing multimedia Web archives
The data contained on the web and the social web are inherently multimedia, consisting of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by humankind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU-funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general.
Cultural Event Recognition with Visual ConvNets and Temporal Models
This paper presents our contribution to the ChaLearn Challenge 2015 on Cultural Event Classification. The challenge in this task is to automatically classify images from 50 different cultural events. Our solution is based on the combination of visual features extracted from convolutional neural networks with temporal information using a hierarchical classifier scheme. We extract visual features from the last three fully connected layers of both CaffeNet (pretrained with ImageNet) and our fine-tuned version for the ChaLearn challenge. We propose a late fusion strategy that trains a separate low-level SVM on each of the extracted neural codes. The class predictions of the low-level SVMs form the input to a higher-level SVM, which gives the final event scores. We achieve our best result by adding a temporal refinement step into our classification scheme, applied directly to the output of each low-level SVM. Our approach penalizes high classification scores based on visual features when their time stamp does not match well with an event-specific temporal distribution learned from the training and validation data. Our system achieved the second best result in the ChaLearn Challenge 2015 on Cultural Event Classification with a mean average precision of 0.767 on the test set.
Comment: Initial version of the paper accepted at the CVPR Workshop ChaLearn Looking at People 2015
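The two-level late-fusion scheme the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the "neural codes" are random placeholder arrays, and the toy problem uses 5 classes instead of the challenge's 50.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_train, n_classes = 200, 5  # toy sizes; the challenge used 50 events

# Placeholder "neural codes" standing in for activations of three
# fully connected layers (e.g. fc6, fc7, fc8 of a ConvNet).
codes = [rng.normal(size=(n_train, d)) for d in (64, 32, 16)]
labels = rng.integers(0, n_classes, size=n_train)

# Level 1: train one low-level SVM per neural code.
low_level = [LinearSVC(max_iter=5000).fit(X, labels) for X in codes]

# Level 2: per-class decision scores of the low-level SVMs, concatenated,
# become the input features of a higher-level fusion SVM.
stacked = np.hstack([clf.decision_function(X)
                     for clf, X in zip(low_level, codes)])
fusion = LinearSVC(max_iter=5000).fit(stacked, labels)

final_scores = fusion.decision_function(stacked)  # one score per event class
```

The temporal refinement step mentioned in the abstract would then down-weight entries of the low-level scores whose image time stamps are unlikely under each event's learned temporal distribution, before stacking.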
Adapting Binary Information Retrieval Evaluation Metrics for Segment-based Retrieval Tasks
This report describes metrics for the evaluation of the effectiveness of segment-based retrieval, based on existing binary information retrieval metrics. These metrics are described in the context of a task for the hyperlinking of video segments. This evaluation approach re-uses existing evaluation measures from the standard Cranfield evaluation paradigm. Our adaptation approach can in principle be used with any kind of effectiveness measure that uses binary relevance, and for other segment-based retrieval tasks. In our video hyperlinking setting, we use precision at a cut-off rank n and mean average precision.
Comment: Explanation of evaluation measures for the linking task of the MediaEval Workshop 201
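For reference, the two binary-relevance measures named above (precision at a cut-off rank n, and average precision, whose mean over queries is MAP) can be computed as follows. This is a generic sketch of the standard definitions, not the report's segment-level adaptation; the relevance set and ranking are made up.

```python
def precision_at_n(relevant, ranking, n):
    """Fraction of the top-n ranked items that are relevant."""
    return sum(1 for item in ranking[:n] if item in relevant) / n

def average_precision(relevant, ranking):
    """Mean of precision values at each rank where a relevant item appears,
    normalised by the total number of relevant items."""
    hits, precisions = 0, []
    for rank, item in enumerate(ranking, start=1):
        if item in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

rel = {"a", "c"}                 # hypothetical relevant segments
run = ["a", "b", "c", "d"]       # hypothetical system ranking
print(precision_at_n(rel, run, 2))   # 0.5
print(average_precision(rel, run))   # (1/1 + 2/3) / 2 = 0.8333...
```

Mean average precision is then the mean of `average_precision` over all queries in the test set.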
Method for run time hardware code profiling for algorithm acceleration
In this paper we propose a method for run-time profiling of applications at the instruction level by analysis of loops. Instead of looking for coarse-grain blocks, we concentrate on fine-grain but still costly blocks in terms of execution time. Most code profiling is done in software by introducing code into the application under profile, which incurs a time overhead; in this work, data on the position of a loop, its body, its size and its number of executions is stored and analysed using a small, non-intrusive hardware block. The paper describes the mapping of the system to run-time reconfigurable systems. The synthesis results for the fine-grain code detector block and its functionality verification are also presented in the paper. To demonstrate the concept, the MediaBench multimedia benchmark running on the chosen development platform is used.
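As a software analogy to the hardware loop detector described above, loops in an instruction stream can be recognised by watching for backward branches (a jump to an address below the current one) and counting executions of each loop body. This is a conceptual sketch only, with a made-up toy trace; it is not the paper's hardware design.

```python
from collections import Counter

def detect_loops(trace):
    """Given (pc, next_pc) pairs from an instruction trace, treat each
    backward jump as a loop back-edge and count how many times each
    (body_start, body_end) span is re-entered."""
    loops = Counter()
    for pc, next_pc in trace:
        if next_pc < pc:                  # backward branch => loop back-edge
            loops[(next_pc, pc)] += 1     # loop body spans next_pc..pc
    return loops

# Toy trace: instructions 10..13 iterate three times (two back-edges),
# then control falls through to instruction 14.
trace = [(10, 11), (11, 12), (12, 13), (13, 10),
         (10, 11), (11, 12), (12, 13), (13, 10),
         (10, 11), (11, 12), (12, 13), (13, 14)]
print(detect_loops(trace))  # Counter({(10, 13): 2})
```

A hardware block can maintain the same counters without instrumenting the profiled application, which is the overhead the paper's approach avoids.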
Overview of The MediaEval 2022 Predicting Video Memorability Task
This paper describes the 5th edition of the Predicting Video Memorability Task as part of MediaEval 2022. This year we have reorganised and simplified the task in order to encourage a greater depth of inquiry. Similar to last year, two datasets are provided in order to facilitate generalisation; however, this year we have replaced the TRECVid2019 Video-to-Text dataset with the VideoMem dataset in order to remedy underlying data quality issues, and to prioritise short-term memorability prediction by elevating the Memento10k dataset as the primary dataset. Additionally, a fully fledged electroencephalography (EEG)-based prediction sub-task is introduced. In this paper, we outline the core facets of the task and its constituent sub-tasks, describing the datasets, evaluation metrics, and requirements for participant submissions.
Comment: 6 pages. In: MediaEval Multimedia Benchmark Workshop Working Notes, 2022
Stacked Convolutional and Recurrent Neural Networks for Music Emotion Recognition
This paper studies emotion recognition from musical tracks in the 2-dimensional valence-arousal (V-A) emotional space. We propose a method based on convolutional (CNN) and recurrent neural networks (RNN), with significantly fewer parameters than the state-of-the-art method for the same task. We utilize one CNN layer followed by two branches of RNNs trained separately for arousal and valence. The method was evaluated using the 'MediaEval2015 emotion in music' dataset. We achieved an RMSE of 0.202 for arousal and 0.268 for valence, which is the best result reported on this dataset.
Comment: Accepted for Sound and Music Computing (SMC 2017)
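The RMSE figures quoted above are computed separately per emotional dimension over the continuous valence-arousal annotations. A generic computation looks like the following; the prediction and target values here are made up for illustration and are not the paper's data.

```python
import math

def rmse(predictions, targets):
    """Root-mean-square error between two equal-length value sequences."""
    return math.sqrt(
        sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)
    )

# Toy per-frame arousal values in [-1, 1] (illustrative only).
pred = [0.10, -0.20, 0.35, 0.00]
true = [0.15, -0.25, 0.30, 0.10]
print(round(rmse(pred, true), 4))  # 0.0661
```

The same function applied to the valence branch's outputs gives the second reported number.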