
    Graph diffusions and matrix functions: fast algorithms and localization results

    Network analysis provides tools for addressing fundamental applications in graphs such as webpage ranking, protein-function prediction, and product categorization and recommendation. As real-world networks grow to millions of nodes and billions of edges, the scalability of network analysis algorithms becomes increasingly important. Whereas many standard graph algorithms rely on matrix-vector operations that require exploring the entire graph, this thesis is concerned with graph algorithms that are local (that explore only the graph region near the nodes of interest) as well as with the localized behavior of global algorithms. We prove that two well-studied matrix functions for graph analysis, PageRank and the matrix exponential, stay localized on networks that have a skewed degree sequence related to the power-law degree distribution common to many real-world networks. Our results give the first theoretical explanation of a localization phenomenon that has long been observed in real-world networks. We prove that our novel method for the matrix exponential converges in sublinear work on graphs with the specified degree sequence, and we adapt our method to produce the first deterministic algorithm for computing the related heat kernel diffusion in constant time. Finally, we generalize this framework to compute any graph diffusion in constant time.
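    To make "local" concrete, below is a minimal Python sketch of the classic residual "push" method for approximate personalized PageRank (in the spirit of Andersen, Chung, and Lang), a local algorithm of the kind the thesis builds on: it touches only nodes near the seed rather than the whole graph. The function and parameter names are illustrative, not taken from the thesis.

```python
from collections import deque

def approx_ppr(graph, seed, alpha=0.15, eps=1e-6):
    """Sketch of the Andersen-Chung-Lang push method for approximate
    personalized PageRank (names are illustrative, not from the thesis).

    graph: dict node -> list of out-neighbors
    Returns a sparse dict approximating the PPR vector of `seed`.
    """
    p = {}                   # settled PageRank mass
    r = {seed: 1.0}          # residual (unsettled) mass
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        ru = r.get(u, 0.0)
        deg = len(graph[u])
        if deg == 0 or ru < eps * deg:
            continue         # residual too small to push (or dangling node)
        p[u] = p.get(u, 0.0) + alpha * ru   # settle an alpha fraction
        r[u] = 0.0
        share = (1.0 - alpha) * ru / deg    # spread the rest to neighbors
        for v in graph[u]:
            r[v] = r.get(v, 0.0) + share
            if r[v] >= eps * max(len(graph[v]), 1):
                queue.append(v)
    return p
```

    Each push settles an alpha fraction of one node's residual and spreads the rest to that node's neighbors; on undirected graphs the total number of pushes is bounded by $O(1/(\alpha\epsilon))$ independent of the graph size, which is exactly the kind of local behavior the thesis studies.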

    Sublinear algorithms for local graph centrality estimation

    We study the complexity of local graph centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, which we apply to the PageRank and Heat Kernel centralities, for building a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $m$ arcs, with probability $1-\delta$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(m^{2/3}\Delta^{1/3}d^{-2/3},\, m^{4/5}d^{-3/5}))$ nodes/arcs, where $\Delta$ and $d$ are respectively the maximum and average outdegree of the graph (omitting, for readability, $\operatorname{poly}(\epsilon^{-1})$ and $\operatorname{polylog}(\delta^{-1})$ factors). A similar bound holds for computational complexity. We also prove a lower bound of $\Omega(\min(m^{1/2}\Delta^{1/2}d^{-1/2},\, m^{2/3}d^{-1/3}))$ for both query complexity and computational complexity. Moreover, our technique yields an $\tilde{O}(n^{2/3})$ query complexity algorithm for the graph access model of [Brautbar et al., 2010], widely used in social network mining; we show this algorithm is optimal up to a sublogarithmic factor. These are the first algorithms yielding worst-case sublinear bounds for general directed graphs and any choice of the target node. Comment: 29 pages, 1 figure.
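    For contrast with the paper's low-variance estimator, here is a hedged Python sketch of the standard Monte-Carlo baseline such local estimators are measured against: estimating a target node's global PageRank score by counting how often $\alpha$-terminated random walks from uniformly random sources end at the target. The names are ours and this is not the paper's algorithm, only the simple walk-based estimator.

```python
import random

def mc_pagerank_score(graph, target, alpha=0.15, num_walks=100_000):
    """Unbiased Monte-Carlo estimate of the global PageRank of `target`
    (teleport probability alpha). graph: dict node -> out-neighbor list.
    Illustrative baseline sketch, not the paper's estimator.
    """
    nodes = list(graph)
    hits = 0
    for _ in range(num_walks):
        u = random.choice(nodes)           # uniform random source
        while random.random() > alpha:     # continue walk w.p. 1 - alpha
            nbrs = graph[u]
            # Dangling nodes restart uniformly (one common convention).
            u = random.choice(nbrs) if nbrs else random.choice(nodes)
        hits += (u == target)
    return hits / num_walks
```

    The estimator is unbiased because the endpoint of an $\alpha$-terminated walk from a uniform source is distributed according to the PageRank vector; its drawback, which motivates lower-variance techniques like the paper's, is that the variance is large for low-score targets.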

    Neural Distributed Autoassociative Memories: A Survey

    Introduction. Neural network models of autoassociative, distributed memory allow storage and retrieval of many items (vectors), where the number of stored items can exceed the vector dimension (the number of neurons in the network). This opens the possibility of a sublinear-time search (in the number of stored items) for approximate nearest neighbors among high-dimensional vectors. The purpose of this paper is to review models of autoassociative, distributed memory that can be naturally implemented by neural networks (mainly with local learning rules and iterative dynamics based on information locally available to neurons). Scope. The survey focuses mainly on the networks of Hopfield, Willshaw, and Potts, which have connections between pairs of neurons and operate on sparse binary vectors. We discuss not only autoassociative memory but also the generalization properties of these networks. We also consider neural networks with higher-order connections and networks with a bipartite graph structure for non-binary data with linear constraints. Conclusions. In conclusion, we discuss the relations to similarity search, the advantages and drawbacks of these techniques, and topics for further research. An interesting and still not completely resolved question is whether neural autoassociative memories can search for approximate nearest neighbors faster than other index structures for similarity search, in particular for very high-dimensional vectors. Comment: 31 pages.
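    As a concrete instance of the Hopfield model discussed above, the following minimal Python sketch (our names, assuming bipolar ±1 patterns) stores patterns with the Hebbian outer-product rule and retrieves them by iterated sign updates.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian (outer-product) learning. patterns: (p, n) array in {-1, +1}."""
    _, n = patterns.shape
    W = patterns.T @ patterns / n
    np.fill_diagonal(W, 0.0)          # no self-connections
    return W

def recall(W, probe, steps=20):
    """Retrieve a stored pattern by synchronous sign updates to a fixed point."""
    s = probe.copy()
    for _ in range(steps):
        s_new = np.where(W @ s >= 0, 1, -1)
        if np.array_equal(s_new, s):
            break
        s = s_new
    return s

# Store 3 random patterns in a 100-neuron network, recall from a noisy probe.
rng = np.random.default_rng(0)
pats = rng.choice([-1, 1], size=(3, 100))
W = train_hopfield(pats)
noisy = pats[0].copy()
noisy[:10] *= -1                      # corrupt 10 of 100 bits
print(np.array_equal(recall(W, noisy), pats[0]))  # usually True
```

    Retrieval from a corrupted probe typically succeeds as long as the number of stored patterns stays well below the classical capacity limit of roughly $0.14n$ for $n$ neurons.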

    On Solving Linear Systems in Sublinear Time

    We study sublinear algorithms that solve linear systems locally. In the classical version of this problem the input is a matrix $S \in \mathbb{R}^{n \times n}$ and a vector $b \in \mathbb{R}^n$ in the range of $S$, and the goal is to output $x \in \mathbb{R}^n$ satisfying $Sx = b$. For the case when the matrix $S$ is symmetric diagonally dominant (SDD), the breakthrough algorithm of Spielman and Teng [STOC 2004] approximately solves this problem in near-linear time (in the input size, i.e., the number of non-zeros in $S$), and subsequent papers have further simplified, improved, and generalized the algorithms for this setting. Here we focus on computing one (or a few) coordinates of $x$, which potentially allows for sublinear algorithms. Formally, given an index $u \in [n]$ together with $S$ and $b$ as above, the goal is to output an approximation $\hat{x}_u$ of $x^*_u$, where $x^*$ is a fixed solution to $Sx = b$. Our results show that there is a qualitative gap between SDD matrices and the more general class of positive semidefinite (PSD) matrices. For SDD matrices, we develop an algorithm that approximates a single coordinate $x_u$ in time that is polylogarithmic in $n$, provided that $S$ is sparse and has a small condition number (e.g., the Laplacian of an expander graph). The approximation guarantee is additive, $|\hat{x}_u - x^*_u| \le \epsilon \|x^*\|_\infty$ for accuracy parameter $\epsilon > 0$. We further prove that the condition-number assumption is necessary and tight. In contrast to the SDD case, we prove that for certain PSD matrices $S$, the running time must be at least polynomial in $n$ (for the same additive approximation), even if $S$ has bounded sparsity and condition number.
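    To illustrate the single-coordinate, local flavor of the problem (this is not the paper's algorithm, just a minimal sketch under stated assumptions), the Python code below approximates one coordinate of the solution via the truncated Neumann series $x = \sum_k (I-S)^k b$, which converges when $\|I-S\| < 1$, e.g., for a suitably scaled symmetric SDD matrix with small condition number. It explores only the $K$-hop neighborhood of the queried index, mirroring the sublinear access pattern described above.

```python
def local_coordinate_solve(rows, b, u, num_terms=50):
    """Approximate x_u for S x = b via the truncated Neumann series
    x = sum_k (I - S)^k b, assuming S symmetric with ||I - S|| < 1.
    Illustrative sketch only; names and conventions are ours.

    rows: dict i -> {j: S[i][j]}, the sparse rows of S
    b:    dict i -> b_i (missing entries are treated as 0)
    """
    v = {u: 1.0}          # v_k = (I - S)^k e_u, kept as a sparse dict
    estimate = 0.0
    for _ in range(num_terms):
        # By symmetry of S, ((I - S)^k b)_u = <v_k, b>.
        estimate += sum(w * b.get(i, 0.0) for i, w in v.items())
        nxt = {}
        for i, w in v.items():
            nxt[i] = nxt.get(i, 0.0) + w       # identity part
            for j, s in rows[i].items():       # minus row i of S
                nxt[j] = nxt.get(j, 0.0) - w * s
        v = nxt
    return estimate

# Tiny check: S = [[1, -0.3], [-0.3, 1]], b = e_0; exact x_0 = 1/0.91.
rows = {0: {0: 1.0, 1: -0.3}, 1: {0: -0.3, 1: 1.0}}
print(local_coordinate_solve(rows, {0: 1.0}, 0))  # ~1.0989
```

    Truncating at $K$ terms incurs error on the order of $\|I-S\|^K$, which is why a small condition number (after scaling) is the natural assumption for polylogarithmic-time coordinate approximation.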