8,582 research outputs found
On Graph Stream Clustering with Side Information
Graph clustering has become an important problem due to emerging applications
involving the web, social networks and bio-informatics. Recently, many such
applications generate data in the form of streams. Clustering massive, dynamic
graph streams is significantly challenging because of the complex structures of
graphs and computational difficulties of continuous data. Meanwhile, a large
volume of side information is associated with graphs, which can be of various
types. Examples include the properties of users in social network
activities, the meta attributes associated with web click graph streams and the
location information in mobile communication networks. Such attributes contain
extremely useful information and have the potential to improve the clustering
process, but are neglected by most recent graph stream mining techniques. In
this paper, we define a unified distance measure on both link structures and
side attributes for clustering. In addition, we propose a novel optimization
framework DMO, which can dynamically optimize the distance metric and make it
adapt to the newly received stream data. We further introduce carefully
designed statistics, SGS(C), which consume constant storage space as the
stream progresses. We demonstrate that the statistics maintained are
sufficient for the clustering process as well as the distance optimization and
scale to massive graphs with side attributes. We present
experimental results to show the advantages of the approach in graph stream
clustering with both links and side information over the baselines.
Comment: Full version of SIAM SDM 2013 paper
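The idea of a single distance over both link structure and side attributes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the Jaccard distance on edge sets, the Euclidean distance on attribute vectors, the fixed `link_weight`, and all function names. The abstract describes a metric that DMO optimizes dynamically, which this sketch does not attempt.

```python
from math import sqrt


def jaccard_distance(edges_a, edges_b):
    """Return 1 - |A ∩ B| / |A ∪ B| for two edge sets (0.0 if both are empty)."""
    union = edges_a | edges_b
    if not union:
        return 0.0
    return 1.0 - len(edges_a & edges_b) / len(union)


def attribute_distance(attrs_a, attrs_b):
    """Return the Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(attrs_a, attrs_b)))


def unified_distance(graph_a, graph_b, link_weight=0.5):
    """Combine link and attribute distances with a fixed, hypothetical weight.

    Each graph is a (edge_set, attribute_vector) pair. A learned metric
    would replace the constant link_weight used here.
    """
    edges_a, attrs_a = graph_a
    edges_b, attrs_b = graph_b
    link_part = jaccard_distance(edges_a, edges_b)
    attr_part = attribute_distance(attrs_a, attrs_b)
    return link_weight * link_part + (1.0 - link_weight) * attr_part


g1 = ({(1, 2), (2, 3)}, [0.0, 0.0])
g2 = ({(2, 3), (3, 4)}, [3.0, 4.0])
print(unified_distance(g1, g2))
```

In practice the two parts would need to be put on comparable scales before weighting, since the Jaccard term is bounded by 1 while the Euclidean term is not.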
Predicting Social Links for New Users across Aligned Heterogeneous Social Networks
Online social networks have gained great success in recent years and many of
them involve multiple kinds of nodes and complex relationships. Among these
relationships, social links among users are of great importance. Many existing
link prediction methods focus on predicting social links that will appear in
the future among all users based upon a snapshot of the social network. In
real-world social networks, many new users join the service every
day. Predicting links for new users is more important. Different from
conventional link prediction problems, link prediction for new users is more
challenging due to the following reasons: (1) differences in information
distributions between new users and the existing active users (i.e., old
users); (2) lack of information from the new users in the network. We propose a
link prediction method called SCAN-PS (Supervised Cross Aligned Networks link
prediction with Personalized Sampling), to solve the link prediction problem
for new users with information transferred from both the existing active users
in the target network and other source networks through aligned accounts. We
propose a within-target-network personalized sampling method to process the
existing active users' information in order to accommodate the differences in
information distributions before the intra-network knowledge transfer. SCAN-PS
can also exploit information in other source networks, where the user accounts
are aligned with the target network. In this way, SCAN-PS can solve the
cold-start problem when information about these new users is totally absent in
the target network.
Comment: 11 pages, 10 figures, 4 tables
Multi-Task Pharmacovigilance Mining from Social Media Posts
Social media has grown to be a crucial information source for
pharmacovigilance studies, as an increasing number of people post previously
unreported adverse reactions to medical drugs. Aiming to
effectively monitor various aspects of Adverse Drug Reactions (ADRs) from
diversely expressed social media posts, we propose a multi-task neural
network framework that collectively learns several tasks associated with ADR
monitoring under different levels of supervision. Besides being able to correctly
classify ADR posts and accurately extract ADR mentions from online posts, the
proposed framework is also able to further understand reasons for which the
drug is being taken, known as 'indication', from the given social media post. A
coverage-based attention mechanism is adopted in our framework to help the
model properly identify 'phrasal' ADRs and Indications that are attentive to
multiple words in a post. Our framework is applicable in situations where
limited parallel data for different pharmacovigilance tasks are available. We
evaluate the proposed framework on real-world Twitter datasets, where the
proposed model outperforms the state-of-the-art alternatives of each individual
task consistently.
Comment: Accepted in the research track of The Web Conference (WWW) 201
When and Where: Predicting Human Movements Based on Social Spatial-Temporal Events
Predicting both the time and the location of human movements is valuable but
challenging for a variety of applications. To address this problem, we propose
an approach considering both the periodicity and the sociality of human
movements. We first define a new concept, Social Spatial-Temporal Event (SSTE),
to represent social interactions among people. For the time prediction, we
characterise the temporal dynamics of SSTEs with an ARMA (AutoRegressive Moving
Average) model. To dynamically capture the SSTE kinetics, we propose a Kalman
Filter based learning algorithm to learn and incrementally update the ARMA
model as a new observation becomes available. For the location prediction, we
propose a ranking model where the periodicity and the sociality of human
movements are simultaneously taken into consideration for improving the
prediction accuracy. Extensive experiments conducted on real data sets validate
our proposed approach.
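As a rough illustration of the time-prediction step, the sketch below computes a one-step-ahead forecast from an ARMA(1,1) model, where the prediction is x̂_t = c + φ·x_{t-1} + θ·ε_{t-1} and ε_t = x_t − x̂_t. The model order, the fixed coefficients `phi`, `theta` and `c`, the zero initial residual, and the function name are all assumptions for illustration. The abstract describes learning and updating the model with a Kalman filter based algorithm, which is not reproduced here.

```python
def arma11_next(series, phi, theta, c=0.0):
    """Return the one-step-ahead ARMA(1,1) forecast after observing `series`.

    Coefficients are fixed and hypothetical; the residual of the first
    observation is assumed to be zero.
    """
    if not series:
        raise ValueError("series must contain at least one observation")
    prev_x = series[0]
    prev_eps = 0.0
    for x in series[1:]:
        predicted = c + phi * prev_x + theta * prev_eps
        prev_eps = x - predicted  # residual of the current observation
        prev_x = x
    return c + phi * prev_x + theta * prev_eps


# Hypothetical inter-event times between consecutive events.
print(arma11_next([1.0, 2.0, 3.0], phi=0.5, theta=0.5))
```

An incremental learner would re-estimate `phi` and `theta` as each new observation arrives instead of holding them constant.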
