4,345 research outputs found
Deep Learning for Link Prediction in Dynamic Networks using Weak Estimators
Link prediction is the task of evaluating the probability that an edge exists in a network, and it has useful applications in many domains. Traditional approaches rely on measuring the similarity between two nodes in a static context. Recent research has focused on extending link prediction to a dynamic setting, predicting the creation and destruction of links in networks that evolve over time. Though the task is difficult, deep learning techniques have been shown to notably improve prediction accuracy. To this end, we propose the novel application of weak estimators, alongside traditional similarity metrics, to inexpensively build an effective feature vector for a deep neural network. Weak estimators have been used in a variety of machine learning algorithms to improve model accuracy, owing to their capacity to estimate changing probabilities in dynamic systems. Experiments indicate that our approach increases prediction accuracy on several real-world dynamic networks.
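The feature construction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the weak estimator takes the common exponentially weighted form p ← λp + (1 − λ)x, and the names `weak_estimate`, `jaccard`, `feature_vector`, the snapshot representation (a list of adjacency dictionaries), and the choice of similarity metrics are all hypothetical.

```python
def weak_estimate(observations, lam=0.9, initial=0.5):
    """Track a drifting Bernoulli probability with an exponentially weighted update.

    Assumed update rule: p <- lam * p + (1 - lam) * x, where x is 0 or 1.
    Smaller lam forgets old observations faster.
    """
    p = initial
    for x in observations:
        p = lam * p + (1.0 - lam) * (1.0 if x else 0.0)
    return p


def jaccard(adj, u, v):
    """Jaccard similarity of the neighbor sets of u and v in one snapshot."""
    nu, nv = adj.get(u, set()), adj.get(v, set())
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def feature_vector(snapshots, u, v, lam=0.9):
    """Build a small feature vector for the node pair (u, v).

    snapshots: list of adjacency dicts (node -> set of neighbors), oldest first.
    Returns [weak estimate of edge presence, common neighbors, Jaccard score],
    where the two similarity scores are taken from the latest snapshot.
    """
    history = [v in adj.get(u, set()) for adj in snapshots]
    latest = snapshots[-1]
    common = len(latest.get(u, set()) & latest.get(v, set()))
    return [weak_estimate(history, lam), float(common), jaccard(latest, u, v)]
```

A vector of this kind would be computed per candidate node pair and passed to a classifier; the abstract does not specify the network architecture, so none is shown here.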
Relay-Linking Models for Prominence and Obsolescence in Evolving Networks
The rate at which nodes in evolving social networks acquire links (friends,
citations) shows complex temporal dynamics. Preferential attachment and link
copying models, while enabling elegant analysis, only capture rich-gets-richer
effects, not aging and decline. Recent aging models are complex and heavily
parameterized; most involve estimating 1-3 parameters per node. These
parameters are intrinsic: they explain decline in terms of events in the past
of the same node, and do not explain, using the network, where the linking
attention might go instead. We argue that traditional characterizations of
linking dynamics are insufficient to judge the faithfulness of models. We
propose a new temporal sketch of an evolving graph, and introduce several new
characterizations of a network's temporal dynamics. Then we propose a new
family of frugal aging models with no per-node parameters and only two global
parameters. Our model is based on a surprising inversion or undoing of triangle
completion, where an old node relays a citation to a younger follower in its
immediate vicinity. Despite very few parameters, the new family of models shows
remarkably better fit with real data. Before concluding, we analyze temporal
signatures for various research communities yielding further insights into
their comparative dynamics. To facilitate reproducible research, we shall soon
make all the code and the processed dataset available in the public domain.
A high-level and scalable approach for generating scale-free graphs using active objects
The Barabasi-Albert model (BA) is designed to generate scale-free networks using the preferential attachment mechanism. In the preferential attachment (PA) model, new nodes are sequentially introduced to the network and they attach preferentially to existing nodes. PA is a classical model with a natural intuition, great explanatory power and a simple mechanism. Therefore, PA is widely used for network generation. However, the sequential mechanism used in the PA model makes it an inefficient algorithm. The existing parallel approaches, on the other hand, either change the original model or rely on explicit, complex low-level synchronization mechanisms. In this paper we investigate a high-level Actor-based model of the parallel algorithm of network generation and its scalable multicore implementation in Haskell.
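For orientation, the sequential preferential attachment mechanism that this abstract takes as its starting point can be sketched as below. This is the classical sequential baseline under simple assumptions (m seed nodes, m edges per new node), not the actor-based parallel algorithm the paper investigates; the function name `preferential_attachment` and its parameters are illustrative.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Generate the edge list of an n-node graph by sequential preferential attachment.

    Each new node attaches to m distinct existing nodes, chosen with
    probability proportional to their current degree.
    """
    if m < 1 or n <= m:
        raise ValueError("need 1 <= m < n")
    rng = random.Random(seed)
    edges = []
    # 'repeated' lists every node once per incident edge, so a uniform
    # draw from it is a degree-proportional draw over nodes.
    repeated = []
    # The first new node connects to all m seed nodes (which start with degree 0).
    targets = list(range(m))
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
        repeated.extend(targets)
        repeated.extend([new] * m)
        # Pick m distinct targets for the next node.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = sorted(chosen)
    return edges
```

Each iteration depends on the degrees produced by all earlier iterations, which is the sequential dependency the abstract identifies as the obstacle to parallel generation.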
Asynchronous programming in the abstract behavioural specification language
Chip manufacturers are rapidly moving towards so-called manycore chips with thousands of independent processors on the same silicon real estate. Current programming languages can only leverage the potential power by inserting code with low-level concurrency constructs, sacrificing clarity. Alternatively, a programming language can integrate a thread of execution with a stable notion of identity, e.g., in active objects. Abstract Behavioural Specification (ABS) is a language for designing executable models of parallel and distributed object-oriented systems based on active objects, and is defined in terms of a formal operational semantics which enables a variety of static and dynamic analysis techniques for the ABS models. The overall goal of this thesis is to extend the asynchronous programming model and the corresponding analysis techniques in ABS. Algorithms and the Foundations of Software technolog
A Statistical Mechanical Load Balancer for the Web
The maximum entropy principle from statistical mechanics states that a closed
system attains an equilibrium distribution that maximizes its entropy. We first
show that for graphs with a fixed number of edges one can define a stochastic
edge dynamic that can serve as an effective thermalization scheme, and hence,
the underlying graphs are expected to attain their maximum-entropy states,
which turn out to be Erdos-Renyi (ER) random graphs. We next show that (i) a
rate-equation based analysis of node degree distribution does indeed confirm
the maximum-entropy principle, and (ii) the edge dynamic can be effectively
implemented using short random walks on the underlying graphs, leading to a
local algorithm for the generation of ER random graphs. The resulting
statistical mechanical system can be adapted to provide a distributed and local
(i.e., without any centralized monitoring) mechanism for load balancing, which
can have a significant impact in increasing the efficiency and utilization of
both the Internet (e.g., efficient web mirroring), and large-scale computing
infrastructure (e.g., cluster and grid computing). Comment: 11 Pages, 5 Postscript figures; added references, expanded on
protocol discussio
A Java-Based Distributed Approach for Generating Large-Scale Social Network Graphs
Big Data management is an important topic of research not only in Computer Science, but also in several other domains. A challenging use of Big Data is the generation of large-scale graphs used to model social networks. In this paper, we present an actor-based Java library that eases the use of parallel and distributed programming using actors and s
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics. Comment: 96 pages, 14 figures, 333 reference
Language in Our Time: An Empirical Analysis of Hashtags
Hashtags in online social networks have gained tremendous popularity during
the past five years. The resulting large quantity of data has provided a new
lens into modern society. Previously, researchers mainly relied on data collected
from Twitter to study either a certain type of hashtags or a certain property
of hashtags. In this paper, we perform the first large-scale empirical analysis
of hashtags shared on Instagram, the major platform for hashtag-sharing. We
study hashtags from three different dimensions including the temporal-spatial
dimension, the semantic dimension, and the social dimension. Extensive
experiments performed on three large-scale datasets with more than 7 million
hashtags in total provide a series of interesting observations. First, we show
that the temporal patterns of hashtags can be categorized into four different
clusters, and people tend to share fewer hashtags at certain places and more
hashtags at others. Second, we observe that a non-negligible proportion of
hashtags exhibit large semantic displacement. We demonstrate that hashtags that are
more uniformly shared among users, as quantified by the proposed hashtag
entropy, are less prone to semantic displacement. Finally, we propose a
bipartite graph embedding model to summarize users' hashtag profiles, and rely
on these profiles to perform friendship prediction. Evaluation results show
that our approach achieves effective prediction, with an AUC (area under the ROC
curve) above 0.8, which demonstrates the strong social signals carried by
hashtags. Comment: WWW 201
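The AUC figure quoted in this abstract can be read as the probability that a randomly chosen true friend pair is scored above a randomly chosen non-friend pair. A minimal sketch of that computation follows; the function name `auc` and the example scores are illustrative and are not taken from the paper.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half a win.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))
```

A value of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, so a value above 0.8 indicates that the scores separate the two classes well.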
On futures for streaming data in ABS
Many modern distributed software applications require a continuous interaction between their components exploiting streaming data from the server to the client. The Abstract Behavioral Specification (ABS) language has been developed for the modeling and analysis of distributed systems. In ABS, concurrent objects communicate by calling each other’s methods asynchronously. Return values are communicated asynchronously too via the return statement and so-called futures. In this paper, we extend the basic ABS model of asynchronous method invocation and return in order to support the streaming of data. We introduce the notion of a “Future-based Data Stream” to extend ABS. The application of this notion and its impact on performance are illustrated by means of a case study in the domain of social networks simulation.