User profile preserving social network embedding
This paper addresses social network embedding, which aims to embed social network nodes, including user profile information, into a latent low-dimensional space. Most existing work on network embedding considers only network structure and ignores user-generated content that could help in learning a better joint network representation. Unlike the rich node content in citation networks, user profile information in social networks is useful but noisy, sparse, and incomplete. To properly utilize this information, we propose a new algorithm called User Profile Preserving Social Network Embedding (UPP-SNE), which combines user profiles with network structure to jointly learn a vector representation of a social network. The central idea of UPP-SNE is to embed user profile information via a nonlinear mapping into a consistent subspace, where network structure is seamlessly encoded to jointly learn informative node representations. Extensive experiments on four real-world social networks show that, compared to state-of-the-art baselines, our method learns better social network representations and achieves substantial performance gains in node classification and clustering tasks.
Keystroke Biometrics in Response to Fake News Propagation in a Global Pandemic
This work proposes and analyzes the use of keystroke biometrics for content
de-anonymization. Fake news has become a powerful tool to manipulate public
opinion, especially during major events. In particular, the massive spread of
fake news during the COVID-19 pandemic has forced governments and companies to
fight against misinformation. In this context, the ability to link multiple
accounts or profiles that spread such malicious content on the Internet while
hiding in anonymity would enable proactive identification and blacklisting.
Behavioral biometrics can be powerful tools in this fight. In this work, we
have analyzed how the latest advances in keystroke biometric recognition can
help to link behavioral typing patterns in experiments involving 100,000 users
and more than 1 million typed sequences. Our proposed system is based on
Recurrent Neural Networks adapted to the context of content de-anonymization.
Given the challenge of linking the typed content of a target user to a pool of
candidate profiles, our results show that keystroke recognition can be used to
reduce the list of candidate profiles by more than 90%. In addition, when
keystroke is combined with auxiliary data (such as location), our system
achieves a Rank-1 identification performance equal to 52.6% and 10.9% for a
background candidate list composed of 1K and 100K profiles, respectively.
Comment: arXiv admin note: text overlap with arXiv:2004.0362
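To make the Rank-1 figures above concrete, the following is a minimal sketch of how rank-k identification accuracy can be computed from a matrix of similarity scores. The function name `rank_k_accuracy`, the score layout, and the toy values are hypothetical illustrations; they are not taken from the paper and do not reproduce its Recurrent Neural Network system.

```python
def rank_k_accuracy(scores, true_ids, k=1):
    """Fraction of queries whose true profile is among the k best-scoring candidates.

    scores[q][c] is a similarity between query q and candidate profile c
    (higher means more similar); true_ids[q] is the index of the correct profile.
    This layout is an assumption made for the sketch.
    """
    hits = 0
    for row, true_id in zip(scores, true_ids):
        # Candidate indices sorted from most to least similar.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if true_id in ranked[:k]:
            hits += 1
    return hits / len(true_ids)


# Toy example with made-up scores: 3 queries, 4 candidate profiles each.
toy_scores = [
    [0.9, 0.1, 0.3, 0.2],  # true profile 0 is ranked first
    [0.2, 0.4, 0.8, 0.1],  # true profile 1 is ranked second
    [0.1, 0.2, 0.3, 0.7],  # true profile 3 is ranked first
]
toy_true = [0, 1, 3]

rank1 = rank_k_accuracy(toy_scores, toy_true, k=1)
rank2 = rank_k_accuracy(toy_scores, toy_true, k=2)
```

In this toy setting, raising k from 1 to 2 recovers the second query as well, which mirrors the idea of shrinking a candidate list rather than naming a single profile.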
LATTE: Application Oriented Social Network Embedding
In recent years, many research works propose to embed the network structured
data into a low-dimensional feature space, where each node is represented as a
feature vector. However, because the embedding process is detached from
external tasks, the embeddings learned by most existing embedding models
can be ineffective for application tasks with specific objectives, e.g.,
community detection or information diffusion. In this paper, we study
the application oriented heterogeneous social network embedding problem.
Significantly different from the existing works, besides the network structure
preservation, the problem should also incorporate the objectives of external
applications in the objective function. To resolve the problem, in this paper,
we propose a novel network embedding framework, namely the "appLicAtion
orienTed neTwork Embedding" (Latte) model. In Latte, the heterogeneous network
structure can be applied to compute the node "diffusive proximity" scores,
which capture both local and global network structures. Based on these computed
scores, Latte learns the network representation feature vectors by extending
the autoencoder model to the heterogeneous network scenario, which can
also effectively unite the objectives of network embedding and external
application tasks. Extensive experiments have been done on real-world
heterogeneous social network datasets, and the experimental results have
demonstrated the outstanding performance of Latte in learning the
representation vectors for specific application tasks.
Comment: 11 Pages, 12 Figures, 1 Tabl
Privacy-Aware Recommender Systems Challenge on Twitter's Home Timeline
Recommender systems constitute the core engine of most social network
platforms nowadays, aiming to maximize user satisfaction along with other key
business objectives. Twitter is no exception. Despite the fact that Twitter
data has been extensively used to understand socioeconomic and political
phenomena and user behaviour, the implicit feedback provided by users on Tweets
through their engagements on the Home Timeline has only been explored to a
limited extent. At the same time, there is a lack of large-scale public social
network datasets that would enable the scientific community to both benchmark
and build more powerful and comprehensive models that tailor content to user
interests. By releasing an original dataset of 160 million Tweets along with
engagement information, Twitter aims to address exactly that. During this
release, special attention is paid to maintaining compliance with existing
privacy laws. Apart from user privacy, this paper touches on the key challenges
faced by researchers and professionals striving to predict user engagements. It
further describes the key aspects of the RecSys 2020 Challenge that was
organized by ACM RecSys in partnership with Twitter using this dataset.
Comment: 16 pages, 2 table
Neural-Brane: Neural Bayesian Personalized Ranking for Attributed Network Embedding
Network embedding methodologies, which learn a distributed vector representation for each vertex in a network, have attracted considerable interest in recent years. Existing works have demonstrated that vertex representations learned through an embedding method provide superior performance in many real-world applications, such as node classification, link prediction, and community detection. However, most existing methods for network embedding only utilize the topological information of a vertex, ignoring a rich set of nodal attributes (such as user profiles of an online social network, or textual contents of a citation network), which are abundant in all real-life networks. A joint network embedding that takes into account both attributional and relational information captures more complete network information and could further enrich the learned vector representations. In this work, we present Neural-Brane, a novel Neural Bayesian Personalized Ranking based Attributed Network Embedding. For a given network, Neural-Brane extracts latent feature representations of its vertices using a specially designed neural network model that unifies network topological information and nodal attributes. In addition, it utilizes a Bayesian personalized ranking objective, which exploits the proximity ordering between a similar node pair and a dissimilar node pair. We evaluate the quality of the vertex embeddings produced by Neural-Brane by solving node classification and clustering tasks on four real-world datasets. Experimental results demonstrate the superiority of our proposed method over existing state-of-the-art methods.
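As an illustration of the ranking objective mentioned in this abstract, the sketch below computes the standard Bayesian personalized ranking loss, -log(sigmoid(s_pos - s_neg)), for one triple of an anchor node, a similar node, and a dissimilar node. Scoring pairs with a plain dot product, the function names, and the toy vectors are assumptions made here for illustration; the abstract does not specify Neural-Brane's scoring function or network architecture.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bpr_loss(anchor, similar, dissimilar):
    """BPR loss for one (anchor, similar, dissimilar) triple.

    The loss is small when the anchor scores higher with the similar node
    than with the dissimilar node. Dot-product scoring is an assumption.
    """
    margin = dot(anchor, similar) - dot(anchor, dissimilar)
    # -log(sigmoid(margin)) = log(1 + exp(-margin)), computed without overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Toy embeddings (made-up values).
anchor = [1.0, 0.0]
neighbor = [0.9, 0.1]     # treated as the similar node
stranger = [-0.5, 0.2]    # treated as the dissimilar node

loss_correct_order = bpr_loss(anchor, neighbor, stranger)
loss_wrong_order = bpr_loss(anchor, stranger, neighbor)
```

Minimizing this quantity over many sampled triples pushes similar pairs to score above dissimilar ones, which is the proximity ordering the abstract refers to.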
Language in Our Time: An Empirical Analysis of Hashtags
Hashtags in online social networks have gained tremendous popularity during
the past five years. The resulting large quantity of data has provided a new
lens into modern society. Previously, researchers mainly relied on data collected
from Twitter to study either a certain type of hashtag or a certain property
of hashtags. In this paper, we perform the first large-scale empirical analysis
of hashtags shared on Instagram, the major platform for hashtag-sharing. We
study hashtags from three different dimensions including the temporal-spatial
dimension, the semantic dimension, and the social dimension. Extensive
experiments performed on three large-scale datasets with more than 7 million
hashtags in total provide a series of interesting observations. First, we show
that the temporal patterns of hashtags can be categorized into four different
clusters, and people tend to share fewer hashtags at certain places and more
hashtags at others. Second, we observe that a non-negligible proportion of
hashtags exhibit large semantic displacement. We demonstrate that hashtags that are
more uniformly shared among users, as quantified by the proposed hashtag
entropy, are less prone to semantic displacement. In the end, we propose a
bipartite graph embedding model to summarize users' hashtag profiles, and rely
on these profiles to perform friendship prediction. Evaluation results show
that our approach achieves effective prediction, with an AUC (area under the ROC
curve) above 0.8, which demonstrates the strong social signals carried by
hashtags.
Comment: WWW 201
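The abstract does not define hashtag entropy. As a rough sketch, one plausible reading is the Shannon entropy of how a hashtag's uses are distributed across users, so that a hashtag shared evenly by many users scores high and one dominated by a single user scores low. The function name `hashtag_entropy`, the input layout, and this definition are assumptions for illustration and may differ from the paper's formulation.

```python
import math
from collections import Counter


def hashtag_entropy(user_ids):
    """Shannon entropy (in bits) of a hashtag's uses across users.

    user_ids lists one user id per use of the hashtag. This definition is an
    assumption for the sketch, not the paper's stated formula.
    """
    counts = Counter(user_ids)
    total = sum(counts.values())
    # Sum of p * log2(1 / p) over users, with p = count / total.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy data: the same number of uses, spread differently across users.
evenly_shared = ["u1", "u2", "u3", "u4"]
single_user = ["u1", "u1", "u1", "u1"]

entropy_even = hashtag_entropy(evenly_shared)
entropy_single = hashtag_entropy(single_user)
```

Under this reading, the evenly shared hashtag has the higher entropy, which corresponds to the "more uniformly shared" hashtags that the abstract reports as less prone to semantic displacement.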