Scalable Parallel Factorizations of SDD Matrices and Efficient Sampling for Gaussian Graphical Models
Motivated by a sampling problem basic to computational statistical inference, we develop a nearly optimal algorithm for a fundamental problem in spectral graph theory and numerical analysis. Given an n × n SDDM matrix M and a constant −1 ≤ p ≤ 1, our algorithm gives efficient access to a sparse linear operator C̃ such that M^p ≈ C̃C̃^T. The solution is based on factoring M into a product of simple and sparse matrices using squaring and spectral sparsification. For M with m non-zero entries, our algorithm takes work nearly linear in m and polylogarithmic depth on a parallel machine with m processors. This gives the first sampling algorithm that requires only nearly linear work and n i.i.d. random univariate Gaussian samples to generate i.i.d. random samples for n-dimensional Gaussian random fields with SDDM precision matrices. For sampling this natural subclass of Gaussian random fields, it is optimal in the randomness and nearly optimal in the work and parallel complexity. In addition, our sampling algorithm can be directly extended to Gaussian random fields with SDD precision matrices.
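For intuition, here is a minimal Python sketch of the sampling identity the paper builds on: if L L^T = M for a symmetric positive definite precision matrix M, then solving L^T x = z for an i.i.d. standard Gaussian z yields x ~ N(0, M^{-1}). The dense Cholesky factor and the toy 3 × 3 matrix below are illustrative stand-ins only; the paper's contribution is a sparse factorization computable in nearly linear work.

    # Sketch of the sampling principle: applying the inverse transpose of
    # any factor L with L @ L.T == M to i.i.d. Gaussians gives a sample
    # with precision matrix M. Dense Cholesky is a stand-in here; it does
    # not reproduce the paper's nearly-linear-work sparse factorization.
    import numpy as np
    from scipy.linalg import cholesky, solve_triangular

    def sample_gaussian_with_precision(M, rng):
        """Draw one sample from N(0, M^{-1}) for an SPD precision matrix M."""
        L = cholesky(M, lower=True)          # M = L @ L.T
        z = rng.standard_normal(M.shape[0])  # n i.i.d. univariate Gaussians
        # Solve L.T @ x = z, so cov(x) = L^{-T} @ L^{-1} = M^{-1}.
        return solve_triangular(L.T, z, lower=False)

    rng = np.random.default_rng(0)
    # A tiny SDDM precision matrix (diagonally dominant, non-positive off-diagonals).
    M = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])
    x = sample_gaussian_with_precision(M, rng)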
An Efficient Parallel Algorithm for Spectral Sparsification of Laplacian and SDDM Matrix Polynomials
For "large" class of continuous probability density functions
(p.d.f.), we demonstrate that for every there is mixture of
discrete Binomial distributions (MDBD) with
distinct Binomial distributions that -approximates a
discretized p.d.f. for all , where
. Also, we give two efficient parallel
algorithms to find such MDBD.
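As a rough numerical illustration of the object being constructed (not the paper's algorithm), the following Python sketch approximates a discretized density on {0, …, N} by a non-negative mixture of Binomial(N, p_j) pmfs; the target density, grid size, and least-squares fitting are all choices made for the example.

    # Illustrative sketch: fit a mixture of discrete Binomial distributions
    # (MDBD) to a discretized p.d.f. by non-negative least squares, then
    # report the worst-case pointwise error. Not the paper's construction.
    import numpy as np
    from scipy.stats import binom
    from scipy.optimize import nnls

    N = 64
    grid = np.arange(N + 1)

    # Target: a discretized truncated-Gaussian-like density on {0, ..., N}.
    w = np.exp(-0.5 * ((grid - N / 2) / (N / 8)) ** 2)
    w_hat = w / w.sum()

    # Columns are Binomial(N, p_j) pmfs for T candidate success probabilities.
    T = 16
    ps = np.linspace(0.05, 0.95, T)
    B = np.column_stack([binom.pmf(grid, N, p) for p in ps])

    weights, _ = nnls(B, w_hat)   # non-negative mixture weights (approx. sum to 1)
    mdbd = B @ weights
    print("max pointwise error:", np.abs(mdbd - w_hat).max())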
Moreover, we propose a sequential algorithm that, on input an MDBD with T distinct Binomial distributions inducing a discretized p.d.f., a matrix M that is either Laplacian or SDDM with m non-zero entries, and a precision parameter ε, outputs a spectral sparsifier of the associated matrix polynomial in time nearly linear in m (the Õ notation hides polylogarithmic factors). This improves on the algorithm of Cheng et al. [CCLPT15], whose running time grows quadratically with the polynomial degree N.
Furthermore, our algorithm is parallelizable, running in nearly linear work and polylogarithmic depth. Our main algorithmic contribution is the first efficient parallel algorithm that, on input a continuous p.d.f. w and a matrix M as above, outputs a spectral sparsifier of a matrix polynomial whose coefficients approximate, component-wise, the discretized p.d.f. ŵ.
Our results yield the first efficient parallel algorithm that runs in nearly linear work and polylogarithmic depth and analyzes the long-term behaviour of Markov chains in non-trivial settings. In addition, we strengthen Spielman and Peng's [PS14] parallel SDD solver.
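For concreteness, here is a dense Python sketch of a random-walk matrix polynomial of the kind sparsified in this line of work, assuming the D − Σ βᵢ D (D^{-1}A)^i form used by [CCLPT15]; the graph and coefficients are illustrative, and the point of the paper is to approximate this matrix spectrally without ever forming it explicitly.

    # Dense construction of a random-walk matrix polynomial
    # L_beta = D - sum_i beta[i] * D @ (D^{-1} A)^(i+1), the object that is
    # spectrally sparsified; this sketch forms it explicitly, which the
    # nearly-linear-work algorithms deliberately avoid.
    import numpy as np

    def random_walk_polynomial(A, beta):
        d = A.sum(axis=1)
        D = np.diag(d)
        W = A / d[:, None]                 # one step of the random walk, D^{-1} A
        L = D.copy()
        Wi = np.eye(A.shape[0])
        for b in beta:                     # accumulate powers of the walk matrix
            Wi = Wi @ W
            L -= b * (D @ Wi)
        return L

    # 4-cycle graph; coefficients of a degree-3 polynomial summing to 1.
    A = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    L = random_walk_polynomial(A, beta=[0.5, 0.3, 0.2])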
Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing
Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the École Normale Supérieure de Lyon, France, on 21-23 July 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The CSC14 workshop's focus was on combinatorial mathematics and algorithms in high-performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks, and eight poster presentations. All three invited talks focused on two fields of research: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program "Investissements d'Avenir" ANR-11-IDEX-0007, operated by the French National Research Agency) and by SIAM.
Foundations of Node Representation Learning
Low-dimensional node representations, also called node embeddings, are a cornerstone in the modeling and analysis of complex networks. In recent years, advances in deep learning have spurred the development of novel neural-network-inspired methods for learning node representations, which have largely surpassed classical 'spectral' embeddings in performance. Yet little work asks the central questions of this thesis: why do these novel deep methods outperform their classical predecessors, and what are their limitations?
We pursue several paths to answering these questions. To further our understanding of deep embedding methods, we explore their relationship with spectral methods, which are better understood, and show that some popular deep methods are equivalent to spectral methods in a certain natural limit. We also introduce the problem of inverting node embeddings in order to probe what information they contain. Further, we propose a simple, non-deep method for node representation learning, and find that it is often competitive with modern deep graph networks in downstream performance.
To better understand the limitations of node embeddings, we prove upper and lower bounds on their capabilities. Most notably, we prove that node embeddings are capable of exact low-dimensional representation of networks with bounded maximum degree or arboricity, and we further show that a simple algorithm can find such exact embeddings for real-world networks. By contrast, we also prove inherent limits on the ability of random graph models, including those derived from node embeddings, to capture key structural properties of networks without simply memorizing a given graph.
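For reference, a minimal Python sketch of the classical spectral embedding that such deep methods are compared against, using the bottom nontrivial eigenvectors of the normalized Laplacian; the scaling conventions and the toy graph here are illustrative choices, as the exact recipe varies between methods.

    # Classical spectral node embedding: embed each node by its entries in
    # the eigenvectors of the normalized Laplacian with smallest nonzero
    # eigenvalues (one common convention among several).
    import numpy as np

    def spectral_embedding(A, dim):
        d = A.sum(axis=1)
        d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))  # guard isolated nodes
        L = np.eye(len(d)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
        vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
        return vecs[:, 1:dim + 1]          # skip the trivial bottom eigenvector

    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    X = spectral_embedding(A, dim=2)       # one 2-d vector per node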
On non-linear network embedding methods
As a linear method, spectral clustering is the only network embedding algorithm that offers both provably fast computation and an advanced theoretical understanding. The accuracy of spectral clustering depends on the Cheeger ratio, defined as the ratio between the graph conductance and the second smallest eigenvalue of its normalized Laplacian. On several graph families whose Cheeger ratio reaches its upper bound of Θ(n), spectral clustering is proven to perform poorly. Moreover, recent non-linear network embedding methods have surpassed spectral clustering with state-of-the-art performance, yet with little to no theoretical understanding to back them.
The dissertation includes work that: (1) extends the theory of spectral clustering in order to address its weaknesses and provide grounds for a theoretical understanding of existing non-linear network embedding methods; (2) provides non-linear extensions of spectral clustering with theoretical guarantees, e.g., via different spectral modification algorithms; (3) demonstrates the potential of this approach on different types and sizes of graphs from industrial applications; and (4) makes theory-informed use of artificial networks.
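A small Python sketch of the quantity the dissertation centers on: the Cheeger ratio, conductance divided by the second smallest eigenvalue of the normalized Laplacian, evaluated here on a cycle graph, one of the families where the ratio grows like Θ(n).

    # Compute the Cheeger ratio phi(S) / lambda_2 on a cycle C_n, where
    # phi ~ 1/n and lambda_2 ~ 1/n^2, so the ratio grows linearly in n.
    import numpy as np

    def normalized_laplacian(A):
        d = A.sum(axis=1)
        dis = 1.0 / np.sqrt(d)
        return np.eye(len(d)) - dis[:, None] * A * dis[None, :]

    def conductance(A, S):
        # phi(S) = cut(S, complement) / min(vol(S), vol(complement))
        mask = np.zeros(A.shape[0], dtype=bool)
        mask[np.asarray(S)] = True
        cut = A[mask][:, ~mask].sum()
        vol = A[mask].sum()
        return cut / min(vol, A.sum() - vol)

    n = 32                                  # cycle graph C_n
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0

    lam2 = np.sort(np.linalg.eigvalsh(normalized_laplacian(A)))[1]
    phi = conductance(A, range(n // 2))     # a half-arc, the best cut of a cycle
    print("Cheeger ratio:", phi / lam2)     # roughly n / pi^2 on cycles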
AVATAR - Machine Learning Pipeline Evaluation Using Surrogate Model
The evaluation of machine learning (ML) pipelines is essential during automatic ML pipeline composition and optimisation. Previous methods, such as the Bayesian-based and genetic-based optimisation implemented in Auto-Weka, Auto-sklearn, and TPOT, evaluate pipelines by executing them. Pipeline composition and optimisation with these methods therefore requires a tremendous amount of time, which prevents them from exploring complex pipelines to find better predictive models. To further explore this research challenge, we have conducted experiments showing that many of the generated pipelines are invalid, and that it is unnecessary to execute them to find out whether they are good pipelines. To address this issue, we propose a novel method to evaluate the validity of ML pipelines using a surrogate model (AVATAR). AVATAR accelerates automatic ML pipeline composition and optimisation by quickly discarding invalid pipelines. Our experiments show that AVATAR is more efficient at evaluating complex pipelines than traditional evaluation approaches that require executing them.
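A toy Python sketch of the surrogate idea, with hypothetical component names and data properties: validity is checked by propagating abstract preconditions and postconditions through the pipeline instead of executing it. This illustrates the concept only, not AVATAR's actual surrogate model.

    # Surrogate-style pipeline validation: each component declares which
    # data properties it REQUIRES and which it PROVIDES after running; a
    # pipeline is rejected at the first unmet precondition, with no
    # training or execution. All names below are illustrative.
    REQUIRES = {
        "Imputer":     set(),
        "OneHotEnc":   {"no_missing"},
        "Scaler":      {"no_missing", "numeric_only"},
        "LogisticReg": {"no_missing", "numeric_only"},
    }
    PROVIDES = {
        "Imputer":     {"no_missing"},
        "OneHotEnc":   {"numeric_only"},
        "Scaler":      set(),
        "LogisticReg": set(),
    }

    def is_valid(pipeline, data_props):
        props = set(data_props)
        for step in pipeline:
            if not REQUIRES[step] <= props:   # precondition violated
                return False
            props |= PROVIDES[step]
        return True

    # Valid: impute, encode, then model. Invalid: the model sees missing values.
    print(is_valid(["Imputer", "OneHotEnc", "LogisticReg"], set()))   # True
    print(is_valid(["LogisticReg", "Imputer"], set()))                # False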
LIPIcs, Volume 261, ICALP 2023, Complete Volume