229 research outputs found

Sublinear algorithms for local graph centrality estimation

Authors: Bressan Marco; Peserico Enoch; Pretto Luca
Publication date: 01/01/2018

We study the complexity of local graph centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, that we apply to the PageRank and Heat Kernel centralities, for building a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $m$ arcs, with probability $(1-\delta)$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(m^{2/3} \Delta^{1/3} d^{-2/3},\, m^{4/5} d^{-3/5}))$ nodes/arcs, where $\Delta$ and $d$ are respectively the maximum and average outdegree of the graph (omitting for readability $\operatorname{poly}(\epsilon^{-1})$ and $\operatorname{polylog}(\delta^{-1})$ factors). A similar bound holds for computational complexity. We also prove a lower bound of $\Omega(\min(m^{1/2} \Delta^{1/2} d^{-1/2}, \, m^{2/3} d^{-1/3}))$ for both query complexity and computational complexity. Moreover, our technique yields a $\tilde{O}(n^{2/3})$ query complexity algorithm for the graph access model of [Brautbar et al., 2010], widely used in social network mining; we show this algorithm is optimal up to a sublogarithmic factor. These are the first algorithms yielding worst-case sublinear bounds for general directed graphs and any choice of the target node.

Comment: 29 pages, 1 figure
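To get a feel for the gap between the upper and lower bounds quoted in this abstract, the leading terms can be evaluated numerically. The sketch below is only an arithmetic illustration: the function names are hypothetical, and all hidden factors (polylogarithmic terms, poly(1/epsilon), polylog(1/delta), and constants) are ignored, so the numbers indicate growth rates rather than actual query counts.

```python
def query_upper_bound(m, max_outdeg, avg_outdeg):
    """Leading term of the stated upper bound (hidden factors ignored).

    m          -- number of arcs
    max_outdeg -- maximum outdegree (Delta in the abstract)
    avg_outdeg -- average outdegree (d in the abstract)
    """
    first = m ** (2 / 3) * max_outdeg ** (1 / 3) * avg_outdeg ** (-2 / 3)
    second = m ** (4 / 5) * avg_outdeg ** (-3 / 5)
    return min(first, second)


def query_lower_bound(m, max_outdeg, avg_outdeg):
    """Leading term of the stated lower bound (constants ignored)."""
    first = m ** (1 / 2) * max_outdeg ** (1 / 2) * avg_outdeg ** (-1 / 2)
    second = m ** (2 / 3) * avg_outdeg ** (-1 / 3)
    return min(first, second)


if __name__ == "__main__":
    # Illustrative values only: one million arcs, Delta = 100, d = 10.
    m, max_outdeg, avg_outdeg = 10**6, 100, 10
    print(query_upper_bound(m, max_outdeg, avg_outdeg))
    print(query_lower_bound(m, max_outdeg, avg_outdeg))
```

For these illustrative values the upper-bound term is about 10^4 and the lower-bound term about 3.2 × 10^3, both far below the 10^6 arcs of the graph.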

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Archivio istituzionale della ricerca - Università di Padova

Graph diffusions and matrix functions: fast algorithms and localization results

Author: Kloster Kyle
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2016

Network analysis provides tools for addressing fundamental applications in graphs such as webpage ranking, protein-function prediction, and product categorization and recommendation. As real-world networks grow to have millions of nodes and billions of edges, the scalability of network analysis algorithms becomes increasingly important. Whereas many standard graph algorithms rely on matrix-vector operations that require exploring the entire graph, this thesis is concerned with graph algorithms that are local (that explore only the graph region near the nodes of interest) as well as the localized behavior of global algorithms. We prove that two well-studied matrix functions for graph analysis, PageRank and the matrix exponential, stay localized on networks that have a skewed degree sequence related to the power-law degree distribution common to many real-world networks. Our results give the first theoretical explanation of a localization phenomenon that has long been observed in real-world networks. We prove our novel method for the matrix exponential converges in sublinear work on graphs with the specified degree sequence, and we adapt our method to produce the first deterministic algorithm for computing the related heat kernel diffusion in constant time. Finally, we generalize this framework to compute any graph diffusion in constant time.

Purdue E-Pubs

Two Taylor Algorithms for Computing the Action of the Matrix Exponential on a Vector

Authors: Alonso Abalos José Miguel; Alonso-Jordá Pedro; Defez Candel Emilio; Ibáñez González Jacinto Javier; Sastre Jorge
Publication venue: MDPI AG
Publication date: 01/01/2022

Ibáñez González, JJ.; Alonso Abalos, JM.; Alonso-Jordá, P.; Defez Candel, E.; Sastre, J. (2022). Two Taylor Algorithms for Computing the Action of the Matrix Exponential on a Vector. Algorithms. 15(2):1-48. https://doi.org/10.3390/a15020048
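This entry carries only a citation, so as a rough illustration of what "the action of the matrix exponential on a vector" means, the sketch below approximates exp(tA)b with a plain truncated Taylor sum, never forming exp(tA) itself. It is not either of the two algorithms of the cited paper; the function name, the dense list-of-lists matrix format, and the fixed number of terms are all assumptions made for the example, and a practical method would also choose the truncation order and scaling adaptively.

```python
def expm_action_taylor(A, b, t=1.0, terms=30):
    """Approximate exp(t*A) @ b by summing the first `terms` Taylor terms.

    A is a dense square matrix given as a list of rows, b a list of floats.
    Only matrix-vector products are used: term_k = (t/k) * A @ term_{k-1}.
    """
    n = len(b)
    term = list(b)      # k = 0 term: b itself
    result = list(b)
    for k in range(1, terms):
        # Next Taylor term, built from the previous one.
        term = [t / k * sum(A[i][j] * term[j] for j in range(n))
                for i in range(n)]
        result = [r + x for r, x in zip(result, term)]
    return result


if __name__ == "__main__":
    # For a diagonal matrix the exact answer is exp(a_ii) * b_i.
    print(expm_action_taylor([[1.0, 0.0], [0.0, 2.0]], [1.0, 1.0]))
```

A fixed-length Taylor sum like this is only reliable when t times the norm of A is modest; for larger norms the terms grow before they shrink and cancellation can destroy accuracy.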

Directory of Open Access Journals

RiuNet

Networked Signal and Information Processing

Authors: Kar Soummya; Moura José M. F.; Sayed Ali H.; Vlaski Stefan
Publication date: 18/04/2023

The article reviews significant advances in networked signal and information processing, which over the last 25 years have extended decision making and inference, optimization, control, and learning to the increasingly ubiquitous environments of distributed agents. As these interacting agents cooperate, new collective behaviors emerge from local decisions and actions. Moreover, and significantly, theory and applications show that networked agents, through cooperation and sharing, are able to match the performance of cloud or federated solutions, while offering the potential for improved privacy, increasing resilience, and saving resources.

arXiv.org e-Print Archive

Approximating Properties of Data Streams

Author: Zhou Samson
Publication venue: 'Purdue University (bepress)'
Publication date: 01/01/2018

In this dissertation, we present algorithms that approximate properties in the data stream model, where elements of an underlying data set arrive sequentially, but algorithms must use space sublinear in the size of the underlying data set. We first study the problem of finding all k-periods of a length-n string S, presented as a data stream. S is said to have k-period p if its prefix of length n − p differs from its suffix of length n − p in at most k locations. We give algorithms to compute the k-periods of a string S using poly(k, log n) bits of space and we complement these results with comparable lower bounds. We then study the problem of identifying a longest substring of strings S and T of length n that forms a d-near-alignment under the edit distance, in the simultaneous streaming model. In this model, symbols of strings S and T are streamed at the same time and form a d-near-alignment if the distance between them in some given metric is at most d. We give several algorithms, including an exact one-pass algorithm that uses O(d^2 + d log n) bits of space. We then consider the distinct elements and ℓ_p-heavy hitters problems in the sliding window model, where only the most recent n elements in the data stream form the underlying set. We first introduce the composable histogram, a simple twist on the exponential (Datar et al., SODA 2002) and smooth histograms (Braverman and Ostrovsky, FOCS 2007) that may be of independent interest. We then show that the composable histogram along with a careful combination of existing techniques to track either the identity or frequency of a few specific items suffices to obtain algorithms for both distinct elements and ℓ_p-heavy hitters that are nearly optimal in both n and c. Finally, we consider the problem of estimating the maximum weighted matching of a graph whose edges are revealed in a streaming fashion. We develop a reduction from the maximum weighted matching problem to the maximum cardinality matching problem that only doubles the approximation factor of a streaming algorithm developed for the maximum cardinality matching problem. As an application, we obtain an estimator for the weight of a maximum weighted matching in bounded-arboricity graphs and in particular, a (48 + ε)-approximation estimator for the weight of a maximum weighted matching in planar graphs.
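The k-period definition in this abstract can be made concrete with a brute-force check. The sketch below is an offline reference that stores the whole string, so it does not reflect the sublinear-space streaming algorithms of the dissertation; the function names are hypothetical, and restricting p to 1..n−1 (skipping the trivial case p = n, where both pieces are empty) is a convention assumed here.

```python
def has_k_period(s, p, k):
    """True if the prefix and suffix of length n - p differ in at most k positions."""
    n = len(s)
    if not 0 < p <= n:
        raise ValueError("p must satisfy 0 < p <= len(s)")
    prefix, suffix = s[:n - p], s[p:]
    mismatches = sum(1 for a, b in zip(prefix, suffix) if a != b)
    return mismatches <= k


def k_periods(s, k):
    """All p in 1..n-1 that are k-periods of s (quadratic time, linear space)."""
    return [p for p in range(1, len(s)) if has_k_period(s, p, k)]


if __name__ == "__main__":
    print(k_periods("abcabcabc", 0))       # exact periods
    print(has_k_period("abcabdabc", 3, 2))  # tolerates two mismatches
```

With k = 0 this reduces to the usual notion of an exact period; larger k tolerates that many mismatching positions between the shifted copies.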

Purdue E-Pubs

Asynchronous Approximation of a Single Component of the Solution to a Linear System

Authors: Ozdaglar Asuman; Shah Devavrat; Yu Christina Lee
Publication date: 21/01/2019

We present a distributed asynchronous algorithm for approximating a single component of the solution to a system of linear equations $Ax = b$, where $A$ is a positive definite real matrix, and $b \in \mathbb{R}^n$. This is equivalent to solving for $x_i$ in $x = Gx + z$ for some $G$ and $z$ such that the spectral radius of $G$ is less than 1. Our algorithm relies on the Neumann series characterization of the component $x_i$, and is based on residual updates. We analyze our algorithm within the context of a cloud computation model, in which the computation is split into small update tasks performed by small processors with shared access to a distributed file system. We prove a robust asymptotic convergence result when the spectral radius $\rho(|G|) < 1$, regardless of the precise order and frequency in which the update tasks are performed. We provide convergence rate bounds which depend on the order of update tasks performed, analyzing both deterministic update rules via counting weighted random walks, as well as probabilistic update rules via concentration bounds. The probabilistic analysis requires analyzing the product of random matrices which are drawn from distributions that are time and path dependent. We specifically consider the setting where $n$ is large, yet $G$ is sparse, e.g., each row has at most $d$ nonzero entries. This is motivated by applications in which $G$ is derived from the edge structure of an underlying graph. Our results prove that if the local neighborhood of the graph does not grow too quickly as a function of $n$, our algorithm can provide significant reduction in computation cost as opposed to any algorithm which computes the global solution vector $x$. Our algorithm obtains an $\epsilon \|x\|_2$ additive approximation for $x_i$ in constant time with respect to the size of the matrix when the maximum row sparsity $d = O(1)$ and $1/(1-\|G\|_2) = O(1)$.
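The residual-update idea mentioned in this abstract can be sketched in a few lines. The version below is a sequential simplification, not the authors' distributed asynchronous algorithm: it maintains the invariant x_i = estimate + residual · x, which follows from substituting x_u = (Gx)_u + z_u for one coordinate u at a time. The function name, the list-of-dicts sparse row format, and the fixed round-robin schedule are assumptions made for the example.

```python
def estimate_component(G, z, i, sweeps=200):
    """Approximate x_i where x = G x + z, using coordinate-wise residual updates.

    G is a list of sparse rows: G[u] maps column index v to the entry G[u][v].
    Invariant kept throughout: x_i == estimate + sum_u residual[u] * x_u.
    """
    n = len(z)
    residual = [0.0] * n
    residual[i] = 1.0           # initially x_i = 0 + e_i . x
    estimate = 0.0
    for _ in range(sweeps):     # round-robin order (an assumed schedule)
        for u in range(n):
            r_u = residual[u]
            if r_u == 0.0:
                continue
            # Substitute x_u = sum_v G[u][v] * x_v + z_u into the invariant.
            estimate += r_u * z[u]
            residual[u] = 0.0
            for v, g_uv in G[u].items():
                residual[v] += r_u * g_uv
    return estimate, residual


if __name__ == "__main__":
    # Tiny system: x0 = 0.5*x1 + 1, x1 = 0.25*x0 + 1.
    G = [{1: 0.5}, {0: 0.25}]
    z = [1.0, 1.0]
    print(estimate_component(G, z, 0)[0])
```

Only coordinates with a nonzero residual are ever touched, which is why the work can stay local when G is sparse and the neighborhood of node i grows slowly.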

arXiv.org e-Print Archive

DSpace@MIT