
    The Power Of Locality In Network Algorithms

    Over the last decade we have witnessed the rapid proliferation of large-scale complex networks, spanning many social, information, and technological domains. While many of the tasks that users of such networks face are essentially global and involve the network as a whole, these networks are huge and the information available to users is only local. In this dissertation we show that even when faced with stringent locality constraints, one can still effectively solve prominent algorithmic problems on such networks.

    In the first part of the dissertation we present a natural algorithmic framework designed to model the behaviour of an external agent trying to solve a network optimization problem with limited access to the network data. Our study focuses on local information algorithms: sequential algorithms where the network topology is initially unknown and is revealed only within a local neighborhood of vertices that have been irrevocably added to the output set. We address both network coverage problems and network search problems. Our results include local information algorithms for coverage problems whose performance closely matches the best possible even when information about the network structure is unrestricted. We also demonstrate a sharp threshold on the level of visibility required: at a certain visibility level it is possible to design algorithms that nearly match the best approximation achievable with full access to the network structure, but with any less information it is impossible to achieve a reasonable approximation. For preferential attachment networks, we obtain polylogarithmic approximations to the problem of finding the smallest subgraph that connects a subset of nodes and to the problem of finding the highest-degree nodes. This is achieved by resolving a decade-old open question of Bollobás and Riordan on locally finding the root in a preferential attachment process.

    In the second part of the dissertation we focus on designing highly time-efficient local algorithms for central mining problems on complex networks that have been a focus of the research community for over a decade: finding a small set of influential nodes in the network, and fast ranking of nodes. Among our results are an essentially runtime-optimal local algorithm for the influence maximization problem in the standard independent cascade model of information diffusion and an essentially runtime-optimal local algorithm for returning all nodes with PageRank above a given threshold. Our work demonstrates that locality is powerful enough to allow efficient solutions to many central algorithmic problems on complex networks.
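    The independent cascade model named above is easy to state in code. The following is a minimal Monte Carlo sketch of one cascade plus a spread estimate; it illustrates the diffusion model only, not the dissertation's runtime-optimal local algorithm, and the uniform edge probability p is a simplifying assumption (the model allows per-edge probabilities).

```python
import random

def simulate_cascade(graph, seeds, p=0.1):
    """One run of the independent cascade model.

    graph: dict mapping node -> list of out-neighbors
    seeds: initially active nodes
    p: activation probability per edge (assumed uniform here)
    Returns the set of nodes activated by the cascade.
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        newly_active = []
        for u in frontier:
            # each node gets a single chance to activate each
            # still-inactive out-neighbor, independently
            for v in graph.get(u, []):
                if v not in active and random.random() < p:
                    active.add(v)
                    newly_active.append(v)
        frontier = newly_active
    return active

def estimate_spread(graph, seeds, p=0.1, runs=1000):
    """Estimate the expected cascade size by averaging Monte Carlo runs."""
    return sum(len(simulate_cascade(graph, seeds, p)) for _ in range(runs)) / runs
```

    Influence maximization then asks for a small seed set maximizing this expected spread; the dissertation's contribution is computing such a set locally in essentially optimal time.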

    Online Learning a Binary Labeling of a Graph

    We investigate the problem of online learning a binary labeling of the vertices of a given graph. We design an algorithm, Majority, to solve the problem and show its optimality on clique graphs. For general graphs we derive a mistake bound that relates the algorithm's performance to the cut size (the number of edges between vertices with opposite labels) and the maximum independent set of the graph. We then introduce a novel complexity measure of the true labeling, the frontier, and relate the number of mistakes incurred by Majority to this measure. This allows us to show, in contrast to previously known approaches, that our algorithm works well even when the cut size is larger than the number of vertices. A detailed comparison with previous results is given.
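    The abstract does not spell out Majority's prediction rule; a natural reading, sketched below purely as an assumption, is to predict each queried vertex's label by a majority vote over its already-revealed neighbors.

```python
def majority_predict(neighbors, revealed, default=1):
    """Predict a binary label (+1/-1) by majority vote over the
    neighbors whose true labels have already been revealed.

    neighbors: iterable of neighbor vertex ids
    revealed: dict vertex -> label in {+1, -1}
    default: prediction when no neighbor is labeled yet (an assumption)
    """
    vote = sum(revealed[v] for v in neighbors if v in revealed)
    if vote == 0:
        return default
    return 1 if vote > 0 else -1

def count_mistakes(graph, true_labels, order):
    """Online protocol: predict each vertex in `order`, then observe
    its true label; return the number of mistakes incurred."""
    revealed, mistakes = {}, 0
    for v in order:
        if majority_predict(graph[v], revealed) != true_labels[v]:
            mistakes += 1
        revealed[v] = true_labels[v]
    return mistakes
```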

    Multi-Scale Matrix Sampling and Sublinear-Time PageRank Computation

    A fundamental problem arising in many applications in Web science and social network analysis is, given an arbitrary approximation factor $c > 1$, to output a set $S$ of nodes that with high probability contains all nodes of PageRank at least $\Delta$ and no node of PageRank smaller than $\Delta/c$. We call this problem SignificantPageRanks. We develop a nearly optimal local algorithm for the problem with runtime complexity $\tilde{O}(n/\Delta)$ on networks with $n$ nodes. We show that any algorithm for solving this problem must have runtime $\Omega(n/\Delta)$, rendering our algorithm optimal up to logarithmic factors. Our algorithm comes with two main technical contributions. The first is a multi-scale sampling scheme for a basic matrix problem that could be of interest in its own right. In the abstract matrix problem it is assumed that one can access an unknown right-stochastic matrix by querying its rows, where the cost of a query and the accuracy of the answer depend on a precision parameter $\epsilon$. At a cost proportional to $1/\epsilon$, the query returns a list of $O(1/\epsilon)$ entries and their indices that provide an $\epsilon$-precision approximation of the row. The task is to find a set that contains all columns whose sum is at least $\Delta$ and omits every column whose sum is less than $\Delta/c$. Our multi-scale sampling scheme solves this problem at cost $\tilde{O}(n/\Delta)$, while traditional sampling algorithms would take time $\Theta((n/\Delta)^2)$. Our second main technical contribution is a new local algorithm for approximating personalized PageRank, which is more robust than the earlier ones developed in [JehW03, AndersenCL06] and is highly efficient particularly for networks with large in-degrees or out-degrees. Together with the multi-scale sampling scheme, it allows us to solve the SignificantPageRanks problem optimally.

    Comment: Accepted for publication in Internet Mathematics. An extended abstract of this paper appeared in WAW 2012 under the title "A Sublinear Time Algorithm for PageRank Computations".
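    For context, here is a sketch of the classic local push scheme of [AndersenCL06] for approximate personalized PageRank, the baseline this paper improves on; it is not the paper's more robust algorithm, and the parameter names are illustrative.

```python
from collections import deque

def approx_personalized_pagerank(graph, source, alpha=0.15, eps=1e-4):
    """Push-style local approximation of personalized PageRank in the
    spirit of [AndersenCL06] (a baseline sketch, not this paper's
    more robust variant).

    graph: dict node -> list of out-neighbors (assumed non-empty)
    alpha: teleport probability; eps: per-degree residual tolerance
    Only nodes near `source` are ever touched, so the work done is
    independent of the total graph size.
    """
    estimate, residual = {}, {source: 1.0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        deg = len(graph[u])
        if residual.get(u, 0.0) < eps * deg:
            continue  # stale queue entry; residual already pushed
        r_u = residual.pop(u)
        estimate[u] = estimate.get(u, 0.0) + alpha * r_u  # settle alpha fraction
        share = (1.0 - alpha) * r_u / deg                 # push the rest outward
        for v in graph[u]:
            residual[v] = residual.get(v, 0.0) + share
            if residual[v] >= eps * len(graph[v]):
                queue.append(v)
    return estimate
```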

    Local Algorithms for Finding Interesting Individuals in Large Networks

    We initiate the study of local, sublinear-time algorithms for finding vertices with extreme topological properties, such as high degree or high clustering coefficient, in large social or other networks. We introduce a new model, called the Jump and Crawl model, in which algorithms are permitted only two graph operations. The Jump operation returns a randomly chosen vertex, and is meant to model the ability to discover "new" vertices via keyword search on the Web, shared hobbies or interests in social networks such as Facebook, and other mechanisms that may return vertices distant from all those currently known. The Crawl operation permits an algorithm to explore the neighbors of any currently known vertex, and has clear analogues in many modern networks. We give both upper and lower bounds in the Jump and Crawl model for the problems of finding vertices of high degree and of high clustering coefficient. We consider both arbitrary graphs and specializations in which common assumptions are made on the global topology (such as power-law degree distributions or generation via preferential attachment). We also examine local algorithms for some related vertex or graph properties, and discuss areas for future investigation.
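    To make the model concrete, here is a minimal sketch of the two permitted operations plus a naive jump-only baseline for the high-degree problem (all names are illustrative; the paper's actual algorithms interleave Jump and Crawl more carefully).

```python
import random

class JumpCrawlOracle:
    """Exposes a graph only through the Jump and Crawl operations."""

    def __init__(self, adjacency):
        self._adj = adjacency          # dict node -> list of neighbors
        self._nodes = list(adjacency)
        self.queries = 0               # total oracle calls made

    def jump(self):
        """Return a uniformly random vertex of the graph."""
        self.queries += 1
        return random.choice(self._nodes)

    def crawl(self, v):
        """Return the neighbors of an already-discovered vertex."""
        self.queries += 1
        return list(self._adj[v])

def high_degree_by_jumping(oracle, budget):
    """Baseline: spend the whole budget jumping to random vertices and
    keep the one of highest observed degree (one crawl per jump)."""
    best, best_degree = None, -1
    for _ in range(budget // 2):       # each trial costs one jump + one crawl
        v = oracle.jump()
        degree = len(oracle.crawl(v))
        if degree > best_degree:
            best, best_degree = v, degree
    return best, best_degree
```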

    Private and Third-Party Randomization in Risk-Sensitive Equilibrium Concepts

    We consider risk-sensitive generalizations of Nash and correlated equilibria in noncooperative games. We prove that, except for a class of degenerate games, unless a two-player game has a pure Nash equilibrium, it does not have a risk-sensitive Nash equilibrium. We also show that every game has a risk-sensitive correlated equilibrium. The striking contrast between these existence results is due to the different sources of randomization in Nash equilibria (private randomization) and correlated equilibria (third-party randomization).
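    An illustrative instance of the first result (our example, not taken from the paper): Matching Pennies has no pure Nash equilibrium, so, provided it is not among the degenerate exceptions, it has no risk-sensitive Nash equilibrium either, yet the second result still guarantees a risk-sensitive correlated equilibrium.

```latex
% Matching Pennies, payoffs written as (row, column):
\[
\begin{array}{c|cc}
    & H       & T       \\ \hline
  H & (1,-1)  & (-1,1)  \\
  T & (-1,1)  & (1,-1)
\end{array}
\]
% No cell is a mutual best response, so no pure Nash equilibrium exists;
% hence, by the first theorem above, no risk-sensitive Nash equilibrium
% exists (assuming non-degeneracy), while a risk-sensitive correlated
% equilibrium is still guaranteed by the second theorem.
```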

    Impact of Adding a Single Allele in the 9p21 Locus to Traditional Risk Factors on Reclassification of Coronary Heart Disease Risk and Implications for Lipid-Modifying Therapy in the Atherosclerosis Risk in Communities Study

    A single nucleotide polymorphism on chromosome 9p21, rs10757274 (the 9p21 allele), has been shown to predict coronary heart disease (CHD) in whites. We evaluated whether adding the 9p21 allele to traditional risk factors (RF) improved CHD risk prediction in whites from the Atherosclerosis Risk in Communities (ARIC) study, and whether changes in risk prediction would modify lipid therapy recommendations.

    Arid1b haploinsufficient mice reveal neuropsychiatric phenotypes and reversible causes of growth impairment.

    Sequencing studies have implicated haploinsufficiency of ARID1B, a SWI/SNF chromatin-remodeling subunit, in short stature (Yu et al., 2015), autism spectrum disorder (O'Roak et al., 2012), intellectual disability (Deciphering Developmental Disorders Study, 2015), and corpus callosum agenesis (Halgren et al., 2012). In addition, ARID1B is the most common cause of Coffin-Siris syndrome, a developmental delay syndrome characterized by some of the above abnormalities (Santen et al., 2012; Tsurusaki et al., 2012; Wieczorek et al., 2013). We generated Arid1b heterozygous mice, which showed social behavior impairment, altered vocalization, anxiety-like behavior, neuroanatomical abnormalities, and growth impairment. In the brain, Arid1b haploinsufficiency resulted in changes in the expression of SWI/SNF-regulated genes implicated in neuropsychiatric disorders. A focus on reversible mechanisms identified insulin-like growth factor 1 (IGF1) deficiency with inadequate compensation by growth hormone-releasing hormone (GHRH) and growth hormone (GH), underappreciated findings in ARID1B patients. Therapeutically, GH supplementation corrected the growth retardation and muscle weakness. This model functionally validates the involvement of ARID1B in human disorders and allows mechanistic dissection of neurodevelopmental diseases linked to chromatin remodeling.