The Ubiquity of Large Graphs and Surprising Challenges of Graph Processing: Extended Survey
Graph processing is becoming increasingly prevalent across many application
domains. In spite of this prevalence, there is little research about how graphs
are actually used in practice. We performed an extensive study that consisted
of an online survey of 89 users, a review of the mailing lists, source
repositories, and whitepapers of a large suite of graph software products, and
in-person interviews with 6 users and 2 developers of these products. Our
online survey aimed at understanding: (i) the types of graphs users have; (ii)
the graph computations users run; (iii) the types of graph software users use;
and (iv) the major challenges users face when processing their graphs. We
describe the participants' responses to our questions highlighting common
patterns and challenges. Based on our interviews and survey of the rest of our
sources, we were able to answer some new questions that were raised by
participants' responses to our online survey and understand the specific
applications that use graph data and software. Our study revealed surprising
facts about graph processing in practice. In particular, real-world graphs
represent a very diverse range of entities and are often very large,
scalability and visualization are undeniably the most pressing challenges faced
by participants, and data integration, recommendations, and fraud detection are
very popular applications supported by existing graph software. We hope these
findings can guide future research.
Contrasting Views of Complexity and Their Implications For Network-Centric Infrastructures
There exists a widely recognized need to better understand
and manage complex “systems of systems,” ranging from
biology, ecology, and medicine to network-centric technologies.
This is motivating the search for universal laws of highly evolved
systems and driving demand for new mathematics and methods
that are consistent, integrative, and predictive. However, the theoretical
frameworks available today are not merely fragmented
but sometimes contradictory and incompatible. We argue that
complexity arises in highly evolved biological and technological
systems primarily to provide mechanisms to create robustness.
However, this complexity itself can be a source of new fragility,
leading to “robust yet fragile” tradeoffs in system design. We
focus on the role of robustness and architecture in networked
infrastructures, and we highlight recent advances in the theory
of distributed control driven by network technologies. This view
of complexity in highly organized technological and biological systems
is fundamentally different from the dominant perspective in
the mainstream sciences, which downplays function, constraints,
and tradeoffs, and tends to minimize the role of organization and
design.
Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks
Heterogeneous information networks (HINs) are ubiquitous in real-world
applications. In the meantime, network embedding has emerged as a convenient
tool to mine and learn from networked data. As a result, it is of interest to
develop HIN embedding methods. However, the heterogeneity in HINs introduces
not only rich information but also potentially incompatible semantics, which
poses special challenges to embedding learning in HINs. With the intention to
preserve the rich yet potentially incompatible information in HIN embedding, we
propose to study the problem of comprehensive transcription of heterogeneous
information networks. The comprehensive transcription of HINs also provides an
easy-to-use approach to unleash the power of HINs, since it requires no
additional supervision, expertise, or feature engineering. To cope with the
challenges in the comprehensive transcription of HINs, we propose the HEER
algorithm, which embeds HINs via edge representations that are further coupled
with properly-learned heterogeneous metrics. To corroborate the efficacy of
HEER, we conducted experiments on two large-scale real-world datasets with an
edge reconstruction task and multiple case studies. Experiment results
demonstrate the effectiveness of the proposed HEER model and the utility of
edge representations and heterogeneous metrics. The code and data are available
at https://github.com/GentleZhu/HEER.
Comment: 10 pages. In Proceedings of the 24th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, London, United Kingdom,
ACM, 201
Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation
Graph Neural Network (GNN) training and inference involve significant
challenges of scalability with respect to both model sizes and number of
layers, resulting in degradation of efficiency and accuracy for large and deep
GNNs. We present an end-to-end solution that aims to address these challenges
for efficient GNNs in resource constrained environments while avoiding the
oversmoothing problem in deep GNNs. We introduce a quantization based approach
for all stages of GNNs, from message passing in training to node
classification, compressing the model and enabling efficient processing. The
proposed GNN quantizer learns quantization ranges and reduces the model size
with comparable accuracy even under low-bit quantization. To scale with the
number of layers, we devise a message propagation mechanism in training that
controls layer-wise changes of similarities between neighboring nodes. This
objective is incorporated into a Lagrangian function with constraints and a
differential multiplier method is utilized to iteratively find optimal
embeddings. This mitigates oversmoothing and suppresses the quantization error
to a bound. Significant improvements are demonstrated over state-of-the-art
quantization methods and deep GNN approaches in both full-precision and
quantized models. The proposed quantizer demonstrates superior performance in
INT2 configurations across all stages of GNN, achieving a notable level of
accuracy. In contrast, existing quantization approaches fail to generate
satisfactory accuracy levels. Finally, the inference with INT2 and INT4
representations exhibits a speedup of 5.11× and 4.70× compared
to full precision counterparts, respectively.
Comment: To appear in CIKM202
The Future is Big Graphs! A Community View on Graph Processing Systems
Graphs are by nature unifying abstractions that can leverage
interconnectedness to represent, explore, predict, and explain real- and
digital-world phenomena. Although real users and consumers of graph instances
and graph workloads understand these abstractions, future problems will require
new abstractions and systems. What needs to happen in the next decade for big
graph processing to continue to succeed?
Comment: 12 pages, 3 figures, collaboration between the large-scale systems
and data management communities, work started at the Dagstuhl Seminar 19491
on Big Graph Processing Systems, to be published in the Communications of the
AC
The Linked Data Benchmark Council (LDBC): Driving competition and collaboration in the graph data management space
Graph data management is instrumental for several use cases such as
recommendation, root cause analysis, financial fraud detection, and enterprise
knowledge representation. Efficiently supporting these use cases yields a
number of unique requirements, including the need for a concise query language
and graph-aware query optimization techniques. The goal of the Linked Data
Benchmark Council (LDBC) is to design a set of standard benchmarks that capture
representative categories of graph data management problems, making the
performance of systems comparable and facilitating competition among vendors.
LDBC also conducts research on graph schemas and graph query languages. This
paper introduces the LDBC organization and its work over the last decade.
Incremental View Maintenance for Property Graph Queries
This paper discusses the challenges of incremental view maintenance for
property graph queries. We select a subset of property graph queries and
present an approach that uses nested relational algebra to allow incremental
evaluation.