Sublinear algorithms for local graph centrality estimation
We study the complexity of local graph centrality estimation, with the goal
of approximating the centrality score of a given target node while exploring
only a sublinear number of nodes/arcs of the graph and performing a sublinear
number of elementary operations. We develop a technique, that we apply to the
PageRank and Heat Kernel centralities, for building a low-variance score
estimator through a local exploration of the graph. We obtain an algorithm
that, given any node in any graph of arcs, with probability
computes a multiplicative -approximation of its score by
examining only nodes/arcs, where and are respectively the maximum and
average outdegree of the graph (omitting for readability
and
factors). A similar bound holds for computational complexity. We also prove a
lower bound of for both query complexity and computational complexity. Moreover,
our technique yields a query complexity algorithm for the
graph access model of [Brautbar et al., 2010], widely used in social network
mining; we show this algorithm is optimal up to a sublogarithmic factor. These
are the first algorithms yielding worst-case sublinear bounds for general
directed graphs and any choice of the target node.
Comment: 29 pages, 1 figure
Graph diffusions and matrix functions: fast algorithms and localization results
Network analysis provides tools for addressing fundamental applications in graphs such as webpage ranking, protein-function prediction, and product categorization and recommendation. As real-world networks grow to have millions of nodes and billions of edges, the scalability of network analysis algorithms becomes increasingly important. Whereas many standard graph algorithms rely on matrix-vector operations that require exploring the entire graph, this thesis is concerned with graph algorithms that are local (that explore only the graph region near the nodes of interest) as well as the localized behavior of global algorithms. We prove that two well-studied matrix functions for graph analysis, PageRank and the matrix exponential, stay localized on networks that have a skewed degree sequence related to the power-law degree distribution common to many real-world networks. Our results give the first theoretical explanation of a localization phenomenon that has long been observed in real-world networks. We prove that our novel method for the matrix exponential converges in sublinear work on graphs with the specified degree sequence, and we adapt our method to produce the first deterministic algorithm for computing the related heat kernel diffusion in constant time. Finally, we generalize this framework to compute any graph diffusion in constant time.
Two Taylor Algorithms for Computing the Action of the Matrix Exponential on a Vector
Ibáñez González, JJ.; Alonso Abalos, JM.; Alonso-Jordá, P.; Defez Candel, E.; Sastre, J. (2022). Two Taylor Algorithms for Computing the Action of the Matrix Exponential on a Vector. Algorithms. 15(2):1-48. https://doi.org/10.3390/a15020048
Networked Signal and Information Processing
The article reviews significant advances in networked signal and information
processing, which have enabled in the last 25 years extending decision making
and inference, optimization, control, and learning to the increasingly
ubiquitous environments of distributed agents. As these interacting agents
cooperate, new collective behaviors emerge from local decisions and actions.
Moreover, and significantly, theory and applications show that networked
agents, through cooperation and sharing, are able to match the performance of
cloud or federated solutions, while offering the potential for improved
privacy, increasing resilience, and saving resources.
Approximating Properties of Data Streams
In this dissertation, we present algorithms that approximate properties in the data stream model, where elements of an underlying data set arrive sequentially, but algorithms must use space sublinear in the size of the underlying data set. We first study the problem of finding all k-periods of a length-n string S, presented as a data stream. S is said to have k-period p if its prefix of length n − p differs from its suffix of length n − p in at most k locations. We give algorithms to compute the k-periods of a string S using poly(k, log n) bits of space and we complement these results with comparable lower bounds. We then study the problem of identifying a longest substring of strings S and T of length n that forms a d-near-alignment under the edit distance, in the simultaneous streaming model. In this model, symbols of strings S and T are streamed at the same time and form a d-near-alignment if the distance between them in some given metric is at most d. We give several algorithms, including an exact one-pass algorithm that uses O(d^2 + d log n) bits of space. We then consider the distinct elements and ℓ_p-heavy hitters problems in the sliding window model, where only the most recent n elements in the data stream form the underlying set. We first introduce the composable histogram, a simple twist on the exponential (Datar et al., SODA 2002) and smooth histograms (Braverman and Ostrovsky, FOCS 2007) that may be of independent interest. We then show that the composable histogram, along with a careful combination of existing techniques to track either the identity or frequency of a few specific items, suffices to obtain algorithms for both distinct elements and ℓ_p-heavy hitters that are nearly optimal in both n and c. Finally, we consider the problem of estimating the maximum weighted matching of a graph whose edges are revealed in a streaming fashion.
We develop a reduction from the maximum weighted matching problem to the maximum cardinality matching problem that only doubles the approximation factor of a streaming algorithm developed for the maximum cardinality matching problem. As an application, we obtain an estimator for the weight of a maximum weighted matching in bounded-arboricity graphs and, in particular, a (48 + ε)-approximation estimator for the weight of a maximum weighted matching in planar graphs.
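The k-period definition in the abstract above can be made concrete with a small offline check. The sketch below is not the dissertation's streaming algorithm: it stores the whole string and compares the prefix and suffix directly, so it uses linear space. The function names `has_k_period` and `k_periods` are hypothetical, chosen here for illustration.

```python
def has_k_period(s: str, p: int, k: int) -> bool:
    """Return True if s has k-period p, i.e. the prefix of length n - p
    and the suffix of length n - p differ in at most k positions."""
    n = len(s)
    if not 0 < p < n:
        raise ValueError("p must satisfy 0 < p < len(s)")
    # s[:n - p] is the prefix, s[p:] is the suffix; both have length n - p.
    mismatches = sum(1 for a, b in zip(s[: n - p], s[p:]) if a != b)
    return mismatches <= k


def k_periods(s: str, k: int) -> list[int]:
    """List every p in 1..n-1 that is a k-period of s (naive quadratic scan)."""
    return [p for p in range(1, len(s)) if has_k_period(s, p, k)]
```

For example, "abcabcabd" does not have exact period 3 (k = 0), but it does have 1-period 3, because the prefix "abcabc" and the suffix "abcabd" differ in a single position.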
Asynchronous Approximation of a Single Component of the Solution to a Linear System
We present a distributed asynchronous algorithm for approximating a single
component of the solution to a system of linear equations , where
is a positive definite real matrix, and . This is
equivalent to solving for in for some and such that
the spectral radius of is less than 1. Our algorithm relies on the Neumann
series characterization of the component , and is based on residual
updates. We analyze our algorithm within the context of a cloud computation
model, in which the computation is split into small update tasks performed by
small processors with shared access to a distributed file system. We prove a
robust asymptotic convergence result when the spectral radius ,
regardless of the precise order and frequency in which the update tasks are
performed. We provide convergence rate bounds which depend on the order of
update tasks performed, analyzing both deterministic update rules via counting
weighted random walks, as well as probabilistic update rules via concentration
bounds. The probabilistic analysis requires analyzing the product of random
matrices which are drawn from distributions that are time and path dependent.
We specifically consider the setting where is large, yet is sparse,
e.g., each row has at most nonzero entries. This is motivated by
applications in which is derived from the edge structure of an underlying
graph. Our results prove that if the local neighborhood of the graph does not
grow too quickly as a function of , our algorithm can provide significant
reduction in computation cost as opposed to any algorithm which computes the
global solution vector . Our algorithm obtains an
additive approximation for in constant time with respect to the size of
the matrix when the maximum row sparsity and
- …
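The Neumann series characterization mentioned in the abstract above can be illustrated with a short sketch. The symbols are missing from the extracted text, so the names here are assumptions: `G` stands for the iteration matrix with spectral radius below 1, `z` for the constant vector, and the target is component `i` of the solution of x = Gx + z, which equals the i-th entry of the sum of G^k z over k ≥ 0. This is a plain, synchronous, dense truncation of that series, not the asynchronous residual-update algorithm the paper describes, and it touches every row of `G` at each step.

```python
def neumann_component(G, z, i, num_terms=60):
    """Approximate x[i] for x = G x + z by truncating the Neumann series
    x = z + G z + G^2 z + ... after num_terms terms.

    G is a list of rows and z a list of floats (hypothetical names);
    the series converges only when the spectral radius of G is below 1.
    """
    term = [float(v) for v in z]  # current term G^k z, starting at k = 0
    total = 0.0
    for _ in range(num_terms):
        total += term[i]
        # Next term: multiply the current term by G.
        term = [sum(g * t for g, t in zip(row, term)) for row in G]
    return total
```

With G = [[0, 0.5], [0.5, 0]] and z = [1, 1], the exact solution of x = Gx + z is x = [2, 2], and the truncated series approaches 2 geometrically in the number of terms.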