64 research outputs found
Spectrally approximating large graphs with smaller graphs
How does coarsening affect the spectrum of a general graph? We provide
conditions such that the principal eigenvalues and eigenspaces of the coarsened
and original graph Laplacian matrices are close. The achieved approximation is
shown to depend on standard graph-theoretic properties, such as the degree and
eigenvalue distributions, as well as on the ratio between the coarsened and
actual graph sizes. Our results carry implications for learning methods that
utilize coarsening. For the particular case of spectral clustering, they imply
that coarse eigenvectors can be used to derive good quality assignments even
without refinement; this phenomenon was previously observed, but lacked formal
justification. Comment: 22 pages, 10 figures
A Novel Differentiable Loss Function for Unsupervised Graph Neural Networks in Graph Partitioning
In this paper, we explore the graph partitioning problem, a pivotal
combinatorial optimization challenge with extensive applications in various
fields such as science, technology, and business. Because graph partitioning
is NP-hard, no polynomial-time algorithm is known for solving it exactly.
Recently, there has been a burgeoning interest in leveraging
machine learning, particularly approaches like supervised, unsupervised, and
reinforcement learning, to tackle such NP-hard problems. However, these
methods face significant hurdles: supervised learning is constrained by the
necessity of labeled solution instances, which are often computationally
impractical to obtain; reinforcement learning grapples with instability in the
learning process; and unsupervised learning contends with the absence of a
differentiable loss function, a consequence of the discrete nature of most
combinatorial optimization problems. Addressing these challenges, our research
introduces a novel pipeline employing an unsupervised graph neural network to
solve the graph partitioning problem. The core innovation of this study is the
formulation of a differentiable loss function tailored for this purpose. We
rigorously evaluate our methodology against contemporary state-of-the-art
techniques, focusing on two metrics, cuts and balance, and our findings reveal
that our method is competitive with these leading methods. Comment: 2 Tables, 2 Figures
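The abstract does not spell out the proposed loss. As a rough illustration of how a discrete partitioning objective can be made differentiable, here is a minimal sketch of one common relaxation, which is an assumption and not the paper's formulation: each node carries a probability vector over parts, and the loss combines the expected number of cut edges with a squared balance penalty. The function name `relaxed_partition_loss` and the `balance_weight` parameter are hypothetical.

```python
def relaxed_partition_loss(edges, probs, balance_weight=1.0):
    """Expected cut plus a squared balance penalty for soft assignments.

    Illustrative relaxation only (not the loss from the paper above).
    edges: list of (u, v) node-index pairs.
    probs: probs[u][k] is the probability that node u is in part k;
           each row is assumed to sum to 1.
    """
    num_nodes = len(probs)
    num_parts = len(probs[0])

    # An edge is cut with probability 1 - P(both endpoints share a part).
    expected_cut = sum(
        1.0 - sum(probs[u][k] * probs[v][k] for k in range(num_parts))
        for u, v in edges
    )

    # Penalize each part's expected size for deviating from n / num_parts.
    target = num_nodes / num_parts
    imbalance = sum(
        (sum(row[k] for row in probs) - target) ** 2
        for k in range(num_parts)
    )
    return expected_cut + balance_weight * imbalance


# Path graph 0-1-2-3 split into two parts.
edges = [(0, 1), (1, 2), (2, 3)]
balanced = [[1, 0], [1, 0], [0, 1], [0, 1]]   # one cut edge, perfectly balanced
one_sided = [[1, 0], [1, 0], [1, 0], [1, 0]]  # no cut edges, fully unbalanced
print(relaxed_partition_loss(edges, balanced))
print(relaxed_partition_loss(edges, one_sided))
```

Because the loss is a polynomial in the entries of `probs`, it has gradients everywhere, which is what lets a network that outputs such probabilities be trained without labeled partitions.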
Relaxation-Based Coarsening for Multilevel Hypergraph Partitioning
Multilevel partitioning methods that are inspired by principles of
multiscaling are the most powerful practical hypergraph partitioning solvers.
Hypergraph partitioning has many applications in disciplines ranging from
scientific computing to data science. In this paper we introduce the concept of
algebraic distance on hypergraphs and demonstrate its use as an algorithmic
component in the coarsening stage of multilevel hypergraph partitioning
solvers. The algebraic distance is a vertex distance measure that extends
hyperedge weights for capturing the local connectivity of vertices which is
critical for hypergraph coarsening schemes. The practical effectiveness of the
proposed measure and corresponding coarsening scheme is demonstrated through
extensive computational experiments on a diverse set of problems. Finally, we
propose a benchmark of hypergraph partitioning problems to compare the quality
of other solvers.
Scheduling Storms and Streams in the Cloud
Motivated by emerging big streaming data processing paradigms (e.g., Twitter
Storm, Streaming MapReduce), we investigate the problem of scheduling graphs
over a large cluster of servers. Each graph is a job, where nodes represent
compute tasks and edges indicate data-flows between these compute tasks. Jobs
(graphs) arrive randomly over time, and upon completion, leave the system. When
a job arrives, the scheduler needs to partition the graph and distribute it
over the servers to satisfy load balancing and cost considerations.
Specifically, neighboring compute tasks in the graph that are mapped to
different servers incur load on the network; thus a mapping of the jobs among
the servers incurs a cost that is proportional to the number of "broken edges".
We propose a low complexity randomized scheduling algorithm that, without
service preemptions, stabilizes the system with graph arrivals/departures; more
importantly, it allows a smooth trade-off between minimizing average
partitioning cost and average queue lengths. Interestingly, to avoid service
preemptions, our approach does not rely on a Gibbs sampler; instead, we show
that the corresponding limiting invariant measure has an interpretation
stemming from a loss system. Comment: 14 pages
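The "broken edges" cost described in this abstract can be sketched in a few lines. This is only an illustration of the cost definition, not the scheduling algorithm itself; the names `broken_edges`, `partition_cost`, and the `cost_per_edge` constant are hypothetical.

```python
def broken_edges(edges, placement):
    """Count edges whose two tasks are mapped to different servers."""
    return sum(1 for u, v in edges if placement[u] != placement[v])


def partition_cost(edges, placement, cost_per_edge=1.0):
    """Cost proportional to the number of broken edges.

    cost_per_edge is an assumed proportionality constant.
    """
    return cost_per_edge * broken_edges(edges, placement)


# A three-task job whose tasks form a triangle of data-flows.
job_edges = [("a", "b"), ("b", "c"), ("a", "c")]
placement = {"a": 0, "b": 0, "c": 1}  # task -> server index
print(broken_edges(job_edges, placement))
```

Placing all three tasks on one server would break no edges but concentrates load, which is the trade-off between partitioning cost and queue lengths that the abstract refers to.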