64 research outputs found
Towards Data-centric Graph Machine Learning: Review and Outlook
Data-centric AI, with its primary focus on the collection, management, and
utilization of data to drive AI models and applications, has attracted
increasing attention in recent years. In this article, we conduct an in-depth
and comprehensive review, offering a forward-looking outlook on the current
efforts in data-centric AI pertaining to graph data, the fundamental data
structure for representing and capturing intricate dependencies among massive
and diverse real-life entities. We introduce a systematic framework,
Data-centric Graph Machine Learning (DC-GML), that encompasses all stages of
the graph data lifecycle, including graph data collection, exploration,
improvement, exploitation, and maintenance. A thorough taxonomy of each stage
is presented to answer three critical graph-centric questions: (1) how to
enhance graph data availability and quality; (2) how to learn from graph data
with limited availability and low quality; (3) how to build graph MLOps systems
from the graph data-centric view. Lastly, we pinpoint the future prospects of
the DC-GML domain, providing insights to navigate its advancements and
applications.
Comment: 42 pages, 9 figures
Sampling unknown large networks restricted by low sampling rates
Graph sampling plays an important role in data mining for large networks.
Specifically, larger networks often correspond to lower sampling rates. In this
situation, traditional traversal-based sampling methods for large networks usually
show an excessive preference for densely connected core nodes. To address
this issue, this paper proposes a sampling method for unknown networks at low
sampling rates, called SLSR, which first uses random node sampling to
estimate a degree threshold (used to distinguish the core from the periphery)
and the average degree of the unknown network, and then runs a double-layer
sampling strategy on the core and periphery. SLSR is simple, which results in
high time efficiency, and experimental evaluation confirms that the proposed
method can accurately preserve many critical structures of unknown large
networks at sampling rates not exceeding 10%.
Comment: 19 pages, 14 figures
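The two-phase idea in this abstract (a random pilot sample to estimate a degree threshold and the average degree, followed by sampling that treats core and periphery as separate layers) can be illustrated with a minimal Python sketch. This is not the SLSR algorithm itself: the abstract does not give its details, so the function name `two_layer_sample`, the quantile-based threshold, and the alternating core/periphery expansion are all assumptions made for illustration.

```python
import random


def two_layer_sample(adj, budget, pilot_size, core_quantile=0.9, seed=0):
    """Illustrative two-layer sampler (hypothetical, not the paper's SLSR).

    adj: dict mapping each node to the set of its neighbours.
    Returns (sampled_nodes, degree_threshold, estimated_average_degree).
    """
    rng = random.Random(seed)
    nodes = sorted(adj)

    # Phase 1: random node sampling to estimate degree statistics.
    pilot = rng.sample(nodes, min(pilot_size, len(nodes)))
    degrees = sorted(len(adj[v]) for v in pilot)
    # Assumption: the threshold is a fixed quantile of the pilot degrees.
    idx = min(len(degrees) - 1, int(core_quantile * len(degrees)))
    threshold = degrees[idx]
    avg_degree = sum(degrees) / len(degrees)

    # Phase 2: expand from the pilot nodes, keeping separate frontiers for
    # core nodes (degree >= threshold) and periphery nodes.
    core_frontier, periphery_frontier = [], []

    def push(v):
        target = core_frontier if len(adj[v]) >= threshold else periphery_frontier
        target.append(v)

    for v in pilot:
        push(v)

    sampled = set()
    take_core = True
    while len(sampled) < budget and (core_frontier or periphery_frontier):
        # Assumption: alternate between layers so the core cannot dominate.
        if (take_core and core_frontier) or not periphery_frontier:
            frontier = core_frontier
        else:
            frontier = periphery_frontier
        take_core = not take_core

        v = frontier.pop(rng.randrange(len(frontier)))
        if v in sampled:
            continue
        sampled.add(v)
        for u in sorted(adj[v]):
            if u not in sampled:
                push(u)

    return sampled, threshold, avg_degree
```

Alternating between the two frontiers is only one way to limit the bias toward densely connected nodes; the paper's actual double-layer strategy may allocate the budget differently.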
On relational learning and discovery in social networks: a survey
The social networking scene has evolved tremendously over the years. It has grown in relational complexities that extend a vast presence onto popular social media platforms on the internet. With the advance of sentimental computing and social complexity, relationships which were once thought to be simple have now become multi-dimensional and widespread in the online scene. This explosion in the online social scene has attracted much research attention. The main aims of this work revolve around the knowledge discovery and data mining processes of these feature-rich relations. In this paper, we provide a survey of relational learning and discovery through popular social analysis of different structure types which are integral to applications within the emerging field of sentimental and affective computing. It is hoped that this contribution will add to the clarity of how social networks are analyzed with the latest groundbreaking methods and provide certain directions for future improvements.
Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing
Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France on 21st to 23rd July, 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 Workshop's focus was on combinatorial mathematics and algorithms in high performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks and eight poster presentations. All three invited talks were focused on two interesting fields of research specifically: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007 operated by the French National Research Agency), and by SIAM.
A Comprehensive Survey on Deep Graph Representation Learning
Graph representation learning aims to effectively encode high-dimensional
sparse graph-structured data into low-dimensional dense vectors, which is a
fundamental task that has been widely studied in a range of fields, including
machine learning and data mining. Classic graph embedding methods follow the
basic idea that the embedding vectors of interconnected nodes in the graph can
still maintain a relatively close distance, thereby preserving the structural
information between the nodes in the graph. However, this is sub-optimal for
three reasons: (i) traditional methods have limited model capacity, which limits
learning performance; (ii) existing techniques typically rely on unsupervised
learning strategies and fail to couple with the latest learning paradigms;
(iii) representation learning and downstream tasks depend on each other
and should be jointly enhanced. With the remarkable success of deep learning,
deep graph representation learning has shown great potential and advantages
over shallow (traditional) methods, and a large number of deep graph
representation learning techniques, especially graph neural networks, have been
proposed in the past decade. In this survey, we comprehensively review
current deep graph representation learning algorithms by proposing a
new taxonomy of existing state-of-the-art literature. Specifically, we
systematically summarize the essential components of graph representation
learning and categorize existing approaches by graph neural network
architecture and by the most recent advanced learning paradigms. Moreover, this
survey also presents practical and promising applications of deep graph
representation learning. Last but not least, we state new perspectives and
suggest challenging directions which deserve further investigation in the
future.
A Survey on Influence Maximization: From an ML-Based Combinatorial Optimization
Influence Maximization (IM) is a classical combinatorial optimization
problem, which can be widely used in mobile networks, social computing, and
recommendation systems. It aims to select a small number of users so as to
maximize the influence spread across the online social network. Because of
its potential commercial and academic value, many researchers
have studied the IM problem from different perspectives. The main
challenge comes from the NP-hardness of the IM problem and the #P-hardness of
estimating the influence spread; traditional algorithms for overcoming
them can be categorized into two classes: heuristic algorithms and
approximation algorithms. However, there is no theoretical guarantee for
heuristic algorithms, and the theoretical design is close to the limit.
Therefore, it is almost impossible to further optimize and improve their
performance. With the rapid development of artificial intelligence,
technology based on Machine Learning (ML) has achieved remarkable results
in many fields. In view of this, in recent years, a number of new methods have
emerged to solve combinatorial optimization problems by using ML-based
techniques. These methods have the advantages of fast solving speed and strong
generalization ability to unknown graphs, which provide a brand-new direction
for solving combinatorial optimization problems. Therefore, we abandon the
traditional algorithms based on iterative search and review the recent
development of ML-based methods, especially Deep Reinforcement Learning, to
solve the IM problem and other variants in social networks. We focus on
summarizing the relevant background knowledge, basic principles, common
methods, and applied research. Finally, the challenges that need to be solved
urgently in future IM research are pointed out.
Comment: 45 pages
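For context on the traditional approximation algorithms that this abstract contrasts with ML-based methods, the following Python sketch shows greedy seed selection with Monte Carlo spread estimation. It assumes the independent cascade diffusion model with a uniform activation probability `p`, which the abstract does not specify; the function names and parameters are hypothetical and chosen only for illustration.

```python
import random


def simulate_ic(adj, seeds, p, rng):
    """Run one independent-cascade simulation; return the number of active nodes.

    adj: dict mapping each node to an iterable of its out-neighbours.
    p:   assumed uniform activation probability on every edge.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, ()):
                # Each newly active node gets one chance to activate each neighbour.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(adj, seeds, p, runs, rng):
    """Monte Carlo estimate of the expected spread (exact computation is #P-hard)."""
    return sum(simulate_ic(adj, seeds, p, rng) for _ in range(runs)) / runs


def greedy_im(adj, k, p=0.1, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the best estimated spread."""
    rng = random.Random(seed)
    seeds = []
    for _ in range(k):
        candidates = sorted(set(adj) - set(seeds))
        best = max(
            candidates,
            key=lambda v: estimate_spread(adj, seeds + [v], p, runs, rng),
        )
        seeds.append(best)
    return seeds


# Usage: two disjoint chains; with p=1.0 the cascade is deterministic.
example = {0: [1], 1: [2], 2: [3], 3: [], 4: [5], 5: []}
chosen = greedy_im(example, k=2, p=1.0, runs=1)
```

Each greedy step re-estimates the spread for every remaining candidate, so the cost grows quickly with graph size. That cost is one motivation for the learned alternatives the survey reviews.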
Towards Performance Portable Graph Algorithms
In today's data-driven world, our computational resources have become heterogeneous, making the processing of large-scale graphs in an architecture-agnostic manner crucial. Traditionally, hand-optimized high-performance computing (HPC) solutions have been studied and used to implement highly efficient and scalable graph algorithms. In recent years, several graph processing and management systems have also been proposed. Hand-optimized HPC approaches require high levels of expertise, and graph processing frameworks suffer from limited expressibility and performance. Portability is a major concern for both approaches. The main thesis of this work is that block-based graph algorithms offer a compromise between efficient parallelism and architecture-agnostic algorithm design for a wide class of graph problems. This dissertation seeks to prove this thesis by focusing the work on three pillars: data/computation partitioning, block-based algorithm design, and performance portability.
In this dissertation, we first show how we can partition the computation and the data to design efficient block-based algorithms for solving graph merging and triangle counting problems. Then, generalizing from our experiences, we propose PGAbB, an algorithmic framework for implementing block-based graph algorithms on shared-memory, heterogeneous machines. PGAbB aims to maximally leverage different architectures by implementing a task-based execution on top of a block-based programming model. In this talk we will discuss PGAbB's programming model, algorithmic optimizations for scheduling, and load-balancing strategies for graph problems on real-world and synthetic inputs.
Ph.D.
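To make the notion of a block-based graph algorithm concrete, here is a minimal Python sketch of triangle counting over a 2D block partition of the edge set. It is not PGAbB and does not reproduce the dissertation's algorithms; the contiguous vertex partition, the orientation of each edge from lower to higher vertex id, and the name `block_triangle_count` are assumptions made for illustration. Each `(i, j, k)` block triple is an independent task, which is the property a task-based scheduler could exploit.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def block_triangle_count(edges, num_vertices, num_blocks):
    """Count triangles using a 2D block partition of the oriented edge set.

    edges: iterable of (a, b) pairs with integer ids in [0, num_vertices).
    Vertices are split into num_blocks contiguous ranges (an assumption).
    """
    size = -(-num_vertices // num_blocks)  # ceiling division

    def part(v):
        return v // size

    # blocks[(i, j)][u] is the set of v with u < v, part(u) == i, part(v) == j.
    blocks = defaultdict(lambda: defaultdict(set))
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        u, v = min(a, b), max(a, b)
        blocks[(part(u), part(v))][u].add(v)

    total = 0
    empty = {}
    # A triangle u < v < w has parts i <= j <= k, so it belongs to exactly
    # one task (i, j, k); tasks are independent of each other.
    for i, j, k in combinations_with_replacement(range(num_blocks), 3):
        b_ij = blocks.get((i, j), empty)
        b_jk = blocks.get((j, k), empty)
        b_ik = blocks.get((i, k), empty)
        for u, vs in b_ij.items():
            nbrs_u = b_ik.get(u)
            if not nbrs_u:
                continue
            for v in vs:
                nbrs_v = b_jk.get(v)
                if nbrs_v:
                    # Common higher neighbours w close the triangle (u, v, w).
                    total += len(nbrs_u & nbrs_v)
    return total
```

In this sketch the tasks run sequentially; in a block-based framework they would be handed to a scheduler, and load balancing would depend on how unevenly the edges fall into blocks.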