Multimedia big data computing for in-depth event analysis
While most "big data" systems target text-based analytics, multimedia data, which makes up about two thirds of internet traffic, provides unprecedented opportunities for understanding and responding to real-world situations and challenges. Multimedia Big Data Computing is an emerging topic that focuses on all aspects of distributed computing systems that enable massive-scale image and video analytics. In this paper we describe BPEM (Big Picture Event Monitor), a Multimedia Big Data Computing framework that operates over streams of digital photos generated by online communities and enables monitoring the relationship between real-world events and social media user reaction in real time. As a case example, the paper examines publicly available social media data relating to the Mobile World Congress 2014, harvested and analyzed using the described system.
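The abstract does not detail BPEM's internals; as a rough illustration of the kind of sliding-window stream monitor such a framework implies, here is a minimal sketch. All names (photo_source, fetch_photo_batch, the tag-based metadata) are hypothetical stand-ins, not the paper's API.

```python
# Minimal sketch of a photo-stream event monitor. Assumes photos carry
# a "timestamp" and a "tags" list, and arrive roughly in time order;
# photo_source.fetch_photo_batch is a hypothetical stand-in.
import time
from collections import Counter, deque

WINDOW_SECONDS = 300  # sliding window over the incoming photo stream

def monitor(photo_source, event_keywords):
    """Yield per-keyword reaction volume over a sliding time window."""
    window = deque()  # (timestamp, tag) pairs, oldest first
    while True:
        for photo in photo_source.fetch_photo_batch():
            for tag in photo.get("tags", []):
                window.append((photo["timestamp"], tag))
        # Drop observations that fell out of the window.
        cutoff = time.time() - WINDOW_SECONDS
        while window and window[0][0] < cutoff:
            window.popleft()
        counts = Counter(tag for _, tag in window)
        # Report current reaction volume for each monitored event.
        yield {kw: counts.get(kw, 0) for kw in event_keywords}
```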
Learning Multimodal Latent Attributes
The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity by transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular, we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multi-modal content and their complex and unstructured nature relative to the density of annotations. To solve this problem, we (1) introduce the concept of a semi-latent attribute space, expressing user-defined and latent attributes in a unified framework, and (2) propose a novel scalable probabilistic topic model for learning multi-modal semi-latent attributes, which dramatically reduces the requirement for an exhaustive, accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches on a variety of realistic multimedia sparse-data learning tasks, including multi-task learning, learning with label noise, N-shot transfer learning and, importantly, zero-shot learning.
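To make the semi-latent idea concrete, one can think of each video's attribute representation as a concatenation of a user-defined block (supervised where labels exist) and a latent block (discovered from data). The sketch below is an illustrative simplification under that assumption, not the paper's topic model; all names are hypothetical.

```python
# Minimal sketch of a semi-latent attribute representation: the first
# block holds user-defined attributes, the second holds latent ones.
# Illustrative only; not the paper's probabilistic topic model.
import numpy as np

N_USER_DEFINED = 10   # e.g. "outdoor", "crowd", "music", ...
N_LATENT = 20         # unnamed attributes learned from data

def make_attribute_vector(user_labels, latent_activations):
    """Concatenate partially observed user-defined labels with latent
    activations; missing labels are NaN so a downstream learner can
    treat them as unobserved (sparse, incomplete annotation)."""
    user = np.full(N_USER_DEFINED, np.nan)
    for idx, value in user_labels.items():
        user[idx] = value
    return np.concatenate([user, latent_activations])

# Example: a video with only two annotated user-defined attributes.
vec = make_attribute_vector({0: 1.0, 3: 0.0}, np.random.rand(N_LATENT))
```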
Cross-Modal Health State Estimation
Individuals create and consume more diverse data about themselves today than at any time in history. Sources of this data include wearable devices, images, social media, geospatial information and more. A tremendous opportunity rests within cross-modal data analysis that leverages existing domain knowledge methods to understand and guide human health. Especially in chronic diseases, current medical practice uses a combination of sparse, hospital-based biological metrics (blood tests, expensive imaging, etc.) to understand the evolving health status of an individual. Future health systems must integrate data created at the individual level to better understand health status perpetually, especially in a cybernetic framework. In this work we fuse multiple user-created and open-source data streams along with established biomedical domain knowledge to give two types of quantitative state estimates of cardiovascular health. First, we use wearable devices to calculate cardiorespiratory fitness (CRF), a known quantitative leading predictor of heart disease that is not routinely collected in clinical settings. Second, we estimate inherent genetic traits, living environmental risks, circadian rhythm, and biological metrics from a diverse dataset. Our experimental results on 24 subjects demonstrate how multi-modal data can provide personalized health insight. Understanding the dynamic nature of health status will pave the way for better health-based recommendation engines, better clinical decision making and positive lifestyle changes.
Comment: Accepted to ACM Multimedia 2018 Conference - Brave New Ideas, Seoul, Korea. ACM ISBN 978-1-4503-5665-7/18/1
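The abstract does not state which CRF estimator is used; one published heart-rate-based option (Uth et al., 2004) estimates VO2max from the ratio of maximal to resting heart rate. The sketch below shows that formula only to make wearable-derived CRF concrete; it is not necessarily the paper's method.

```python
# One published heart-rate-based CRF estimator (Uth et al., 2004):
# VO2max ~= 15.3 * HRmax / HRrest, in ml/kg/min. Shown for
# illustration; the paper may use a different estimator.
def estimate_vo2max(hr_max_bpm: float, hr_rest_bpm: float) -> float:
    """Estimate VO2max (ml/kg/min) from maximal and resting heart rate."""
    if hr_rest_bpm <= 0:
        raise ValueError("resting heart rate must be positive")
    return 15.3 * hr_max_bpm / hr_rest_bpm

# Example: HRmax 190 bpm, HRrest 55 bpm -> roughly 52.9 ml/kg/min.
print(estimate_vo2max(190, 55))
```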
Automatic Synchronization of Multi-User Photo Galleries
In this paper we address the issue of photo gallery synchronization, where pictures related to the same event are collected by different users. Existing solutions to this problem are usually based on unrealistic assumptions, such as time consistency across photo galleries, and often rely heavily on heuristics, therefore limiting their applicability to real-world scenarios. We propose a solution that achieves better generalization performance on the synchronization task than the available literature. The method comprises three stages: first, deep convolutional neural network features are used to assess the visual similarity among the photos; then, pairs of similar photos are detected across different galleries and used to construct a graph; finally, a probabilistic graphical model is used to estimate the temporal offset of each pair of galleries by traversing the minimum spanning tree extracted from this graph. The experimental evaluation is conducted on four publicly available datasets covering different types of events, demonstrating the strength of the proposed method. A thorough discussion of the obtained results is provided for a critical assessment of the synchronization quality.
Comment: Accepted to IEEE Transactions on Multimedia
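To make the third stage concrete, here is a minimal sketch of offset propagation over a minimum spanning tree of galleries. It replaces the paper's probabilistic graphical model with a simple median estimate of per-pair offsets; all names and inputs are illustrative assumptions.

```python
# Minimal sketch of stage three: propagate temporal offsets between
# galleries along a minimum spanning tree. Median per-edge offsets
# stand in for the paper's probabilistic graphical model.
from statistics import median
import networkx as nx

def synchronize(matched_pairs, reference):
    """matched_pairs: {(gal_a, gal_b): [(t_a, t_b), ...]} of timestamps
    for photos judged visually similar across two galleries (each
    unordered pair appears under exactly one key direction). Returns,
    for each gallery, its estimated clock offset relative to the
    reference; subtract it from that gallery's timestamps to align."""
    G = nx.Graph()
    for (a, b), pairs in matched_pairs.items():
        # How far gallery b's clock runs ahead of gallery a's.
        offset = median(t_b - t_a for t_a, t_b in pairs)
        # Prefer edges supported by many matches: lower weight wins.
        G.add_edge(a, b, offset=offset, weight=1.0 / len(pairs))
    tree = nx.minimum_spanning_tree(G)
    offsets = {reference: 0.0}
    for a, b in nx.bfs_edges(tree, reference):  # traverse the MST
        edge = tree[a][b]
        # Flip the sign if the edge was stored in the other direction.
        sign = 1.0 if (a, b) in matched_pairs else -1.0
        offsets[b] = offsets[a] + sign * edge["offset"]
    return offsets
```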