Multimodal Classification of Urban Micro-Events
In this paper we seek methods to effectively detect urban micro-events. Urban
micro-events are events which occur in cities, have limited geographical
coverage and typically affect only a small group of citizens. Because of their
scale these are difficult to identify in most data sources. However, by using
citizen sensing to gather data, detecting them becomes feasible. The data
gathered by citizen sensing is often multimodal and, as a consequence, the
information required to detect urban micro-events is distributed over multiple
modalities. This makes it essential to have a classifier capable of combining
them. In this paper we explore several methods of creating such a classifier,
including early, late, and hybrid fusion, and representation learning using
multimodal graphs. We evaluate performance on a real-world dataset obtained
from a live citizen reporting system. We show that a multimodal approach yields
higher performance than unimodal alternatives. Furthermore, we demonstrate that
our hybrid combination of early and late fusion with multimodal embeddings
performs best in classification of urban micro-events.
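To make the distinction between the fusion strategies named in this abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the modality keys, and the weighted-average rule for late fusion are illustrative assumptions, and the abstract does not specify how the paper's hybrid variant combines the two.

```python
from typing import Dict, List

Vector = List[float]


def early_fusion(features: Dict[str, Vector]) -> Vector:
    """Early fusion: concatenate per-modality feature vectors into one
    joint vector, which a single classifier would then consume."""
    fused: Vector = []
    for modality in sorted(features):  # fixed order keeps the layout stable
        fused.extend(features[modality])
    return fused


def late_fusion(scores: Dict[str, Vector], weights: Dict[str, float]) -> Vector:
    """Late fusion: combine the class-probability vectors produced by
    separate per-modality classifiers using a weighted average
    (an assumed combination rule, chosen for simplicity)."""
    total = sum(weights[m] for m in scores)
    n_classes = len(next(iter(scores.values())))
    fused = [0.0] * n_classes
    for modality, probs in scores.items():
        for k, p in enumerate(probs):
            fused[k] += weights[modality] * p / total
    return fused


def argmax(values: Vector) -> int:
    """Index of the largest entry, i.e. the predicted class."""
    return max(range(len(values)), key=values.__getitem__)
```

With hypothetical inputs, `early_fusion({"text": [0.2, 0.7], "image": [0.9]})` yields one three-element vector, while `late_fusion` takes the already-computed class probabilities of a text classifier and an image classifier and merges them. A hybrid scheme would feed both kinds of signal into a final decision.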
$1.00 per RT #BostonMarathon #PrayForBoston: analyzing fake content on Twitter
This study found that 29% of the most viral content on Twitter during the Boston bombing crisis was rumors and fake content.
Online social media has emerged as one of the prominent channels for disseminating information during real-world events. Malicious content posted online during such events can result in damage, chaos, and monetary losses in the real world. We analyzed one such medium, Twitter, for content generated during the Boston Marathon blasts of April 15, 2013. A large amount of fake content and many malicious profiles originated on the Twitter network during this event. The aim of this work is to characterize in depth the factors that influenced malicious content and profiles becoming viral. Our results showed that 29% of the most viral content on Twitter during the Boston crisis was rumors and fake content, 51% was generic opinions and comments, and the rest was true information. We found that a large number of users with high social reputation and verified accounts were responsible for spreading the fake content. Next, we used a regression prediction model to verify that the overall impact of all users who propagate the fake content at a given time can be used to estimate the future growth of that content. Many malicious accounts created on Twitter during the Boston event were later suspended by Twitter. We identified over six thousand such user profiles and observed that the creation of such profiles surged considerably right after the blasts occurred. We identified closed community structure and star formation in the interaction network among these suspended profiles.
Thinking spatial
The systems community in both academia and industry has had tremendous success in building widely used general-purpose systems for various types of data and applications. Examples include database systems, big data systems, data streaming systems, and machine learning systems. The vast majority of these systems are ill-equipped to support spatial data. The main reason is that system builders mostly think of spatial data as just one more type of data. Spatial support is treated as an afterthought that can be provided via on-top functions or spatial cartridges added to already built systems. This article advocates that spatial data and applications need to be natively supported in special-purpose systems, where spatial data is treated as a first-class citizen and spatial operations are built inside the engine rather than on top of it. System builders should consider spatial data while building their systems. The article gives examples of five categories of systems, namely database systems, big data systems, machine learning systems, recommender systems, and social network systems, that would benefit tremendously, in terms of both accuracy and performance, from treating spatial data as an integral part of the system engine.
Harnessing the power of the general public for crowdsourced business intelligence: a survey
Crowdsourced business intelligence (CrowdBI), which leverages crowdsourced user-generated data to extract useful knowledge about business and create marketing intelligence to excel in the business environment, has become a surging research topic in recent years. Compared with traditional business intelligence, which is based on firm-owned data and survey data, CrowdBI faces numerous unique issues, such as customer behavior analysis, brand tracking and product improvement, demand forecasting and trend analysis, competitive intelligence, business popularity analysis and site recommendation, and urban commercial analysis. This paper first characterizes the concept model and unique features of CrowdBI and presents a generic framework for it. It also investigates novel application areas as well as the key challenges and techniques of CrowdBI. Furthermore, we discuss future research directions of CrowdBI.
Introduction to SocialRadio Case Studies and Perspectives
In this paper, we introduce a Social Radio platform designed to support contextual communication between communities. Radio communication was no longer used for information broadcast using different communication channels. The same concept is used in this work but for different purposes. We set up a microblogging community channel to report, comment on, or simply vote on events. To illustrate the concept, we detail a scenario in the epidemic field.
Personalized Expert Recommendation: Models and Algorithms
Many large-scale information sharing systems, including social media systems, question-answering
sites, and rating and reviewing applications, have been growing rapidly, allowing
millions of human participants to generate and consume information on an unprecedented
scale. To manage the sheer growth of information generation, there is a need to enable
personalization of information resources for users: to surface high-quality content
and feeds, to provide personally relevant suggestions, and so on. A fundamental task in
creating and supporting user-centered personalization systems is to build rich user profiles
to aid recommendation for a better user experience.
Therefore, in this dissertation research, we propose models and algorithms to facilitate
the creation of new crowd-powered personalized information sharing systems. Specifically,
we first give a principled framework to enable personalization of resources so that
information seekers can be matched with customized knowledgeable users based on their
previous historical actions and contextual information. We then focus on creating rich
user models that allow accurate and comprehensive modeling of user profiles for long-tail
users, including discovering a user's known-for profile, opinion bias, and
geo-topic profile. In particular, this dissertation research makes two unique contributions:
First, we introduce the problem of personalized expert recommendation and propose
the first principled framework for addressing this problem. To overcome the sparsity issue,
we investigate the use of users' contextual information that can be exploited to build robust
models of personal expertise, study how spatial preference for personally-valuable expertise
varies across regions, across topics and based on different underlying social communities,
and integrate these different forms of preferences into a matrix factorization-based
personalized expert recommender.
Second, to support personalized expert recommendation, we focus on modeling
and inferring user profiles in online information sharing systems. In order to tap
the knowledge of the vast majority of users, we provide frameworks and algorithms to accurately
and comprehensively create user models by discovering a user's known-for profile,
opinion bias, and geo-topic profile, each described briefly as follows:
- We develop a probabilistic model called Bayesian Contextual Poisson Factorization
to discover what users are known for by others. Our model takes as input a small fraction
of users whose known-for profiles are already known and the vast majority of users for
whom we have little (or no) information, learns the implicit relationships between users'
known-for profiles and their contextual signals, and finally predicts known-for profiles for
that majority of users.
- We explore users' topic-sensitive opinion bias, propose a lightweight semi-supervised
system called "BiasWatch" to semi-automatically infer the opinion bias of long-tail users,
and demonstrate how a user's opinion bias can be exploited to recommend other users with
similar opinions in social networks.
- We study how a user's topical profile varies geo-spatially and how we can model
a user's geo-spatial known-for profile, as the last step in our dissertation toward creating a
rich user profile. We propose a multi-layered Bayesian hierarchical user factorization to
overcome user heterogeneity and an enhanced model to alleviate the sparsity issue by integrating
user contexts into the two-layered hierarchical user model for better representation
of a user's geo-topic preference by others.
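As background for the matrix factorization-based recommender mentioned in this abstract, the following is a minimal sketch of plain matrix factorization trained by stochastic gradient descent. It is a generic textbook formulation, not the dissertation's model: it omits the spatial, topical, and social preferences the abstract describes, and all names, hyperparameters, and the `(user, item, rating)` triple format are assumptions made for illustration.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]
Rating = Tuple[int, int, float]  # (user index, item index, observed score)


def predict(P: Matrix, Q: Matrix, u: int, i: int) -> float:
    """Predicted score: dot product of user and item latent vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))


def mse(ratings: List[Rating], P: Matrix, Q: Matrix) -> float:
    """Mean squared error over the observed entries only."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)


def factorize(ratings: List[Rating], n_users: int, n_items: int,
              rank: int = 2, lr: float = 0.05, reg: float = 0.01,
              epochs: int = 200, seed: int = 0) -> Tuple[Matrix, Matrix]:
    """Learn latent factors P (users) and Q (items) with L2-regularized SGD."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(rank):
                pu, qi = P[u][f], Q[i][f]  # use pre-update values for both steps
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q
```

In an expert-recommendation setting, the "items" would be candidate experts and the observed scores would come from whatever feedback signal is available. Contextual signals such as those described in the abstract would enter as additional terms in the prediction or the regularizer.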
Integrating Data Science and Earth Science
This open access book presents the results of three years of collaboration between earth scientists and data scientists in developing and applying data science methods for scientific discovery. The book will be highly beneficial for other researchers at senior and graduate level who are interested in applying visual data exploration, computational approaches, and scientific workflows.