On Graph Stream Clustering with Side Information
Graph clustering has become an important problem due to emerging applications
involving the web, social networks and bio-informatics. Recently, many such
applications generate data in the form of streams. Clustering massive, dynamic
graph streams is significantly challenging because of the complex structures of
graphs and computational difficulties of continuous data. Meanwhile, a large
volume of side information is associated with graphs, which can be of various
types. The examples include the properties of users in social network
activities, the meta attributes associated with web click graph streams and the
location information in mobile communication networks. Such attributes contain
extremely useful information and have the potential to improve the clustering
process, but are neglected by most recent graph stream mining techniques. In
this paper, we define a unified distance measure on both link structures and
side attributes for clustering. In addition, we propose a novel optimization
framework DMO, which can dynamically optimize the distance metric and make it
adapt to the newly received stream data. We further introduce carefully
designed statistics, SGS(C), which consume constant storage space as the
stream progresses. We demonstrate that the statistics maintained are
sufficient for the clustering process as well as the distance optimization and
scale to massive graphs with side attributes. We present experimental
results showing the advantages of the approach over the baselines in graph
stream clustering with both links and side information.
Comment: Full version of SIAM SDM 2013 pape
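The idea of a single distance over both link structure and side attributes can be illustrated with a minimal sketch. This is not the paper's DMO framework or its SGS(C) statistics, which the abstract does not specify. The function names, the Jaccard and mismatch-fraction components, and the fixed `link_weight` are all assumptions made for illustration; in the paper the metric is optimized dynamically rather than fixed.

```python
_MISSING = object()  # sentinel so an absent attribute never equals a real value


def jaccard_distance(edges_a, edges_b):
    """1 - |A ∩ B| / |A ∪ B| over two edge sets; 0.0 when both are empty.

    Edges are compared as given, so (1, 2) and (2, 1) count as different.
    """
    a, b = set(edges_a), set(edges_b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def attribute_distance(attrs_a, attrs_b):
    """Fraction of attribute keys (from either dict) whose values differ."""
    keys = set(attrs_a) | set(attrs_b)
    if not keys:
        return 0.0
    differing = sum(
        1 for k in keys
        if attrs_a.get(k, _MISSING) != attrs_b.get(k, _MISSING)
    )
    return differing / len(keys)


def combined_distance(edges_a, attrs_a, edges_b, attrs_b, link_weight=0.5):
    """Weighted sum of link distance and attribute distance (hypothetical)."""
    if not 0.0 <= link_weight <= 1.0:
        raise ValueError("link_weight must lie in [0, 1]")
    link_part = jaccard_distance(edges_a, edges_b)
    attr_part = attribute_distance(attrs_a, attrs_b)
    return link_weight * link_part + (1.0 - link_weight) * attr_part


if __name__ == "__main__":
    d = combined_distance(
        {(1, 2), (2, 3)}, {"city": "NYC", "age": "20s"},
        {(2, 3), (3, 4)}, {"city": "NYC", "age": "30s"},
    )
    print(round(d, 4))
```

A metric-learning approach would adjust `link_weight` (or a richer parameterization) as new stream data arrives; here it is simply a constant.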
Communication-Optimal Distributed Dynamic Graph Clustering
We consider the problem of clustering graph nodes over large-scale dynamic
graphs, such as citation networks, images and web networks, when graph updates
such as node/edge insertions/deletions are observed distributively. We propose
communication-efficient algorithms for two well-established communication
models, namely the message passing and the blackboard models. For a graph whose
nodes are observed at remote sites over time, the communication costs of the
two proposed algorithms almost match their lower bounds (up to a
polylogarithmic factor) in the message passing and the blackboard models,
respectively. More importantly, we prove that at each time point our
algorithms generate clustering quality nearly as
good as that of centralizing all updates up to that time and then applying a
standard centralized clustering algorithm. We conducted extensive experiments
on both synthetic and real-life datasets which confirmed the communication
efficiency of our approach over baseline algorithms while achieving comparable
clustering results.
Comment: Accepted and to appear in AAAI'1
Graph-Level Embedding for Time-Evolving Graphs
Graph representation learning (also known as network embedding) has been
extensively researched with varying levels of granularity, ranging from nodes
to graphs. While most prior work in this area focuses on node-level
representation, limited research has been conducted on graph-level embedding,
particularly for dynamic or temporal networks. However, learning
low-dimensional graph-level representations for dynamic networks is critical
for various downstream graph retrieval tasks such as temporal graph similarity
ranking, temporal graph isomorphism, and anomaly detection. In this paper, we
present a novel method for temporal graph-level embedding that addresses this
gap. Our approach involves constructing a multilayer graph and using a modified
random walk with temporal backtracking to generate temporal contexts for the
graph's nodes. We then train a "document-level" language model on these
contexts to generate graph-level embeddings. We evaluate our proposed model on
five publicly available datasets for the task of temporal graph similarity
ranking, and our model outperforms baseline methods. Our experimental results
demonstrate the effectiveness of our method in generating graph-level
embeddings for dynamic networks.
Comment: In Companion Proceedings of the ACM Web Conference 202
LLM4DyG: Can Large Language Models Solve Problems on Dynamic Graphs?
In an era marked by the increasing adoption of Large Language Models (LLMs)
for various tasks, there is a growing focus on exploring LLMs' capabilities in
handling web data, particularly graph data. Dynamic graphs, which capture
temporal network evolution patterns, are ubiquitous in real-world web data.
Evaluating LLMs' competence in understanding spatial-temporal information on
dynamic graphs is essential for their adoption in web applications, yet it
remains unexplored in the literature. In this paper, we bridge the gap by
evaluating, to the best of our knowledge for the first time, LLMs'
spatial-temporal understanding abilities on dynamic graphs. Specifically, we
propose the LLM4DyG benchmark, which includes nine specially designed tasks
considering the capability evaluation of LLMs from both temporal and spatial
dimensions. Then, we conduct extensive experiments to analyze the impacts of
different data generators, data statistics, prompting techniques, and LLMs on
the model performance. Finally, we propose Disentangled Spatial-Temporal
Thoughts (DST2) for LLMs on dynamic graphs to enhance LLMs' spatial-temporal
understanding abilities. Our main observations are: 1) LLMs have preliminary
spatial-temporal understanding abilities on dynamic graphs, 2) dynamic graph
tasks become increasingly difficult for LLMs as the graph size and density
increase, but are not sensitive to the time span or the data generation
mechanism, and 3) the proposed DST2 prompting method can help to improve LLMs'
spatial-temporal understanding abilities on dynamic graphs for most tasks. The
data and code will be open-sourced at publication time.
DYMOND: DYnamic MOtif-NoDes Network Generative Model
Motifs, which have been established as building blocks for network structure,
move beyond pair-wise connections to capture longer-range correlations in
connections and activity. In spite of this, there are few generative graph
models that consider higher-order network structures and even fewer that focus
on using motifs in models of dynamic graphs. Most existing generative models
for temporal graphs strictly grow the networks via edge addition, and the
models are evaluated using static graph structure metrics -- which do not
adequately capture the temporal behavior of the network. To address these
issues, in this work we propose DYnamic MOtif-NoDes (DYMOND) -- a generative
model that considers (i) the dynamic changes in overall graph structure using
temporal motif activity and (ii) the roles nodes play in motifs (e.g., one node
plays the hub role in a wedge, while the remaining two act as spokes). We
compare DYMOND to three dynamic graph generative model baselines on real-world
networks and show that DYMOND performs better at generating graph structure and
node behavior similar to the observed network. We also propose a new
methodology to adapt graph structure metrics to better evaluate the temporal
aspect of the network. These metrics take into account the changes in overall
graph structure and the individual nodes' behavior over time.
Comment: In Proceedings of the Web Conference 2021 (WWW '21
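The hub and spoke roles mentioned for the wedge motif can be made concrete with a small sketch. It only tallies roles in a static undirected graph and is not the DYMOND generative model. The function name `wedge_roles` and the output layout are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def wedge_roles(edges):
    """Count, per node, how often it is the hub or a spoke of a wedge.

    A wedge is an open two-path a - hub - b in which a and b are not
    themselves connected (so triangles are excluded).
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adjacency[u].add(v)
            adjacency[v].add(u)

    roles = defaultdict(lambda: {"hub": 0, "spoke": 0})
    for hub, neighbours in adjacency.items():
        for a, b in combinations(sorted(neighbours), 2):
            if b in adjacency[a]:
                continue  # a, hub, b form a closed triangle, not a wedge
            roles[hub]["hub"] += 1
            roles[a]["spoke"] += 1
            roles[b]["spoke"] += 1
    return {node: dict(counts) for node, counts in roles.items()}


if __name__ == "__main__":
    # Star with centre 0 and three leaves: every pair of leaves forms a wedge.
    print(wedge_roles([(0, 1), (0, 2), (0, 3)]))
```

Running the same tally on successive snapshots of a temporal graph would give a rough per-node view of how roles change over time, which is the kind of behavior the abstract says static metrics miss.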
Google matrix of business process management
The development of efficient business process models and the determination of
their characteristic properties are the subject of intense interdisciplinary
research.
Here, we consider a business process model as a directed graph. Its nodes
correspond to the units identified by the modeler and the link direction
indicates the causal dependencies between units. It is of primary interest to
obtain the stationary flow on such a directed graph, which corresponds to the
steady-state of a firm during the business process. Following the ideas
developed recently for the World Wide Web, we construct the Google matrix for
our business process model and analyze its spectral properties. The importance
of nodes is characterized by PageRank and by the recently proposed CheiRank and
2DRank. The results show that this two-dimensional ranking gives
significant information about the influence and communication properties of
business model units. We argue that the Google matrix method described here
provides a new, efficient tool that helps companies decide how
to evolve in the exceedingly dynamic global market.
Comment: submitted to European Journal of Physics
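The construction described in this abstract can be sketched with the standard Google matrix definition, G = αS + (1 − α)/N, where S is the column-stochastic link matrix with dangling nodes replaced by uniform columns. PageRank is the stationary vector of G, and CheiRank is the stationary vector of the Google matrix built from the same graph with all links reversed. The damping value 0.85, the function names, and the toy graph are assumptions made for illustration and are not taken from the paper. 2DRank is omitted.

```python
def google_matrix(n, links, alpha=0.85):
    """Build G as a list of rows; G[i][j] is the probability of moving j -> i."""
    out_links = [set() for _ in range(n)]
    for src, dst in links:
        out_links[src].add(dst)

    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            if out_links[j]:
                s = 1.0 / len(out_links[j]) if i in out_links[j] else 0.0
            else:
                s = 1.0 / n  # dangling node: jump anywhere uniformly
            G[i][j] = alpha * s + (1.0 - alpha) / n
    return G


def stationary_vector(G, tol=1e-12, max_iter=1000):
    """Power iteration: repeatedly apply G to a uniform start vector."""
    n = len(G)
    p = [1.0 / n] * n
    for _ in range(max_iter):
        new_p = [sum(G[i][j] * p[j] for j in range(n)) for i in range(n)]
        if sum(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p


def pagerank_and_cheirank(n, links, alpha=0.85):
    """PageRank uses the links as given; CheiRank uses the reversed links."""
    pagerank = stationary_vector(google_matrix(n, links, alpha))
    reversed_links = [(dst, src) for src, dst in links]
    cheirank = stationary_vector(google_matrix(n, reversed_links, alpha))
    return pagerank, cheirank


if __name__ == "__main__":
    # Hypothetical three-unit process: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0.
    pr, ch = pagerank_and_cheirank(3, [(0, 1), (0, 2), (1, 2), (2, 0)])
    print("PageRank:", [round(x, 4) for x in pr])
    print("CheiRank:", [round(x, 4) for x in ch])
```

In this toy graph, PageRank favors the unit that receives the most flow, while CheiRank favors the unit that distributes the most, which is the two-sided view the abstract refers to.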