Cultural Event Recognition with Visual ConvNets and Temporal Models
This paper presents our contribution to the ChaLearn Challenge 2015 on
Cultural Event Classification. The challenge in this task is to automatically
classify images from 50 different cultural events. Our solution is based on the
combination of visual features extracted from convolutional neural networks
with temporal information using a hierarchical classifier scheme. We extract
visual features from the last three fully connected layers of both CaffeNet
(pretrained with ImageNet) and our fine-tuned version for the ChaLearn
challenge. We propose a late fusion strategy that trains a separate low-level
SVM on each of the extracted neural codes. The class predictions of the
low-level SVMs form the input to a higher level SVM, which gives the final
event scores. We achieve our best result by adding a temporal refinement step
into our classification scheme, which is applied directly to the output of each
low-level SVM. Our approach penalizes high visual-feature classification
scores when their time stamp fits poorly the event-specific temporal
distribution learned from the training and validation data. Our system
achieved the second best result in the ChaLearn Challenge 2015 on Cultural
Event Classification with a mean average precision of 0.767 on the test set.
Comment: Initial version of the paper accepted at the CVPR Workshop ChaLearn
Looking at People 201
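The temporal refinement step described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: it assumes, for simplicity, that each event's temporal distribution is modelled as a Gaussian over day-of-year (the paper only states that a distribution is learned from the training and validation data), and all variable names are hypothetical.

```python
import numpy as np

def temporal_refinement(scores, timestamp, event_means, event_stds):
    """Penalize visual scores whose timestamp fits the event poorly.

    scores:      (n_events,) raw low-level SVM scores for one image
    timestamp:   scalar day-of-year of the image
    event_means: (n_events,) mean day-of-year learned per event (assumed)
    event_stds:  (n_events,) std of day-of-year learned per event (assumed)
    """
    # Gaussian likelihood of the timestamp under each event's distribution;
    # a high visual score for a temporally implausible event is damped.
    z = (timestamp - event_means) / event_stds
    temporal_weight = np.exp(-0.5 * z ** 2)
    return scores * temporal_weight

scores = np.array([0.9, 0.8, 0.1])            # visual-only event scores
event_means = np.array([180.0, 30.0, 180.0])  # e.g. mid-year vs. winter events
event_stds = np.array([10.0, 10.0, 10.0])
refined = temporal_refinement(scores, 182.0, event_means, event_stds)
```

Here the second event's high visual score is suppressed because a photo taken mid-year is very unlikely under a winter event's temporal distribution.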
Social Event Detection at MediaEval: a three-year retrospect of tasks and results
Petkos G, Papadopoulos S, Mezaris V, et al. Social Event Detection at MediaEval: a three-year retrospect of tasks and results. In: Proc. ACM ICMR 2014 Workshop on Social Events in Web Multimedia (SEWM). 2014.
This paper presents an overview of the Social Event Detection (SED) task that has been running as part of the MediaEval benchmarking activity for three consecutive years (2011-2013). The task has focused on various aspects of social event detection and retrieval and has attracted a significant number of participants. We discuss the evolution of the task and the datasets, summarize the set of approaches pursued by participants, and evaluate the overall collective progress that has been achieved.
Automatic Synchronization of Multi-User Photo Galleries
In this paper we address the issue of photo galleries synchronization, where
pictures related to the same event are collected by different users. Existing
solutions to address the problem are usually based on unrealistic assumptions,
like time consistency across photo galleries, and often heavily rely on
heuristics, therefore limiting their applicability to real-world scenarios. We
propose a solution that achieves better generalization performance for the
synchronization task compared to the available literature. The method is
characterized by three stages: at first, deep convolutional neural network
features are used to assess the visual similarity among the photos; then, pairs
of similar photos are detected across different galleries and used to construct
a graph; eventually, a probabilistic graphical model is used to estimate the
temporal offset of each pair of galleries, by traversing the minimum spanning
tree extracted from this graph. The experimental evaluation is conducted on
four publicly available datasets covering different types of events,
demonstrating the strength of our proposed method. A thorough discussion of
the obtained results is provided for a critical assessment of the
synchronization quality.
Comment: Accepted to IEEE Transactions on Multimedia
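The final stage of the pipeline above can be sketched as follows: given pairwise temporal offsets estimated between galleries, the minimum spanning tree is traversed to place every gallery on a common timeline relative to a reference gallery. This is a minimal illustration under assumed inputs (the edge list and offsets are invented), not the authors' implementation.

```python
from collections import deque

# Edges of an assumed minimum spanning tree over galleries:
# (gallery_a, gallery_b, estimated offset of b relative to a, in seconds).
mst_edges = [("A", "B", 3600.0), ("B", "C", -120.0), ("A", "D", 7200.0)]

def propagate_offsets(mst_edges, root="A"):
    """BFS over the spanning tree, accumulating offsets from the root."""
    adj = {}
    for a, b, off in mst_edges:
        adj.setdefault(a, []).append((b, off))
        adj.setdefault(b, []).append((a, -off))  # reverse edge flips the sign
    offsets = {root: 0.0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr, off in adj[node]:
            if nbr not in offsets:
                offsets[nbr] = offsets[node] + off
                queue.append(nbr)
    return offsets

offsets = propagate_offsets(mst_edges)
# offsets: A=0.0, B=3600.0, C=3480.0, D=7200.0 (seconds relative to gallery A)
```

Because the tree is spanning and acyclic, each gallery receives exactly one offset, and chained edges (A to B to C) compose by simple addition.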
Cross-Lingual Cross-Platform Rumor Verification Pivoting on Multimedia Content
With the increasing popularity of smart devices, rumors with multimedia
content become more and more common on social networks. The multimedia
information usually makes rumors look more convincing. Therefore, finding an
automatic approach to verify rumors with multimedia content is a pressing task.
Previous rumor verification research only utilizes multimedia as input
features. We propose not to use the multimedia content but to find external
information on other news platforms by pivoting on it. We introduce a new
feature set, cross-lingual cross-platform features, which leverages the semantic
similarity between the rumors and the external information. When implemented,
machine learning methods utilizing such features achieved state-of-the-art
rumor verification results.
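The kind of cross-lingual similarity feature described above can be sketched as follows. It is a hedged illustration: it assumes the rumor text and the external articles have already been mapped into a shared multilingual embedding space (the embedding model, names, and dimensions here are invented), and it summarizes their semantic similarity for a downstream classifier.

```python
import numpy as np

def similarity_features(rumor_vec, article_vecs):
    """Max and mean cosine similarity between a rumor and external articles."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)
    # Cosine similarity of the rumor against each candidate article.
    sims = normalize(article_vecs) @ normalize(rumor_vec)
    return np.array([sims.max(), sims.mean()])

# Hypothetical embeddings standing in for a multilingual sentence encoder.
rng = np.random.default_rng(0)
rumor = rng.normal(size=64)       # embedding of the rumor post
articles = rng.normal(size=(5, 64))  # embeddings of external news articles
feats = similarity_features(rumor, articles)  # input to a rumor classifier
```

The resulting low-dimensional feature vector can then be fed to any standard classifier, matching the paper's setup in which machine learning methods consume the cross-platform features rather than the multimedia content itself.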
ReSEED: Social Event dEtection Dataset
Reuter T, Papadopoulos S, Mezaris V, Cimiano P. ReSEED: Social Event dEtection Dataset. In: MMSys '14. Proceedings of the 5th ACM Multimedia Systems Conference. New York: ACM; 2014: 35-40.
Nowadays, digital cameras are very popular and nearly every mobile phone has a built-in camera. Social events play a prominent role in people's lives. Thus, people take pictures of events they take part in, and more and more of them upload these to well-known online photo community sites like Flickr. The number of pictures uploaded to these sites is still proliferating, and there is great interest in automating the process of event clustering so that every incoming (picture) document can be assigned to the corresponding event without the need for human interaction. These social events are defined as events that are planned by people, attended by people, and for which the social multimedia are also captured by people. There is an urgent need to develop algorithms capable of grouping media by the social events they depict or are related to. In order to train, test, and evaluate such algorithms and frameworks, we present a dataset that consists of about 430,000 photos from Flickr together with the underlying ground truth, consisting of about 21,000 social events. All the photos are accompanied by their textual metadata. The ground truth for the event groupings has been derived from event calendars on the Web that have been created collaboratively by people. The dataset has been used in the Social Event Detection (SED) task that was part of the MediaEval Benchmark for Multimedia Evaluation 2013. This task required participants to discover social events and organize the related media items in event-specific clusters within a collection of Web multimedia documents.
In this paper we describe how the dataset was collected and how the ground truth was created, together with a proposed evaluation methodology and a brief description of the corresponding task challenge as applied in the context of the Social Event Detection task.
Deliverable D9.3 Final Project Report
This document comprises the final report of LinkedTV. It includes a publishable summary, a plan for the use and dissemination of foreground, and a report covering the wider societal implications of the project in the form of a questionnaire.
Detection of Social Events in Streams of Social Multimedia
Combining items from social media streams, such as Flickr photos and Twitter tweets, into meaningful groups can help users contextualise and more effectively consume the torrents of information continuously being made available on the social web. This task is made challenging by the scale of the streams and the inherently multimodal nature of the information being contextualised.
The problem of grouping social media items into meaningful groups can be seen as an ill-posed and application-specific unsupervised clustering problem. A fundamental question in multimodal contexts is determining which features best signify that two items should belong to the same grouping.
This paper presents a methodology that approaches social event detection as a streaming multimodal clustering task. The methodology takes advantage of the temporal nature of social events and, as a side benefit, allows for scaling to real-world datasets. Specific challenges of the social event detection task are addressed: the engineering and selection of the features used to compare items to one another; a feature fusion strategy that incorporates the relative importance of features; the construction of a single sparse affinity matrix; and clustering techniques that produce meaningful item groups whilst scaling to very large numbers of items.
The state-of-the-art approach presented here is evaluated on the ReSEED dataset with standardised evaluation measures. With automatically learned feature weights, we achieve an F1 score of 0.94, showing that a good compromise between precision and recall of clusters can be achieved. In a comparison with other state-of-the-art algorithms, our approach is shown to give the best results.
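The fusion step described above, weighted combination of per-modality affinities into a single sparse affinity matrix, can be sketched as follows. The weights, threshold, and toy matrices here are illustrative assumptions, not the learned values from the paper.

```python
import numpy as np

def fuse_affinities(affinities, weights, threshold=0.5):
    """Weighted sum of per-modality affinity matrices, sparsified by a threshold."""
    fused = sum(w * a for w, a in zip(weights, affinities))
    fused[fused < threshold] = 0.0  # drop weak links so the matrix stays sparse
    return fused

# Toy per-modality affinities between two items (e.g. time vs. text similarity).
time_aff = np.array([[1.0, 0.9], [0.9, 1.0]])
text_aff = np.array([[1.0, 0.2], [0.2, 1.0]])

# Assumed learned weights expressing each modality's relative importance.
fused = fuse_affinities([time_aff, text_aff], weights=[0.6, 0.4])
```

The fused matrix is then handed to the clustering stage; thresholding keeps it sparse, which is what lets the approach scale to very large numbers of items.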
Deliverable D7.7 Dissemination and Standardisation Report v3
This deliverable presents the LinkedTV dissemination and standardisation report for the project period of months 31 to 42 (April 2014 to March 2015)
Deliverable D7.5 LinkedTV Dissemination and Standardisation Report v2
This deliverable presents the LinkedTV dissemination and standardisation report for the project period of months 19 to 30 (April 2013 to March 2014)