The Power Of Locality In Network Algorithms
Over the last decade we have witnessed the rapid proliferation of large-scale complex networks, spanning many social, information and technological domains. While many of the tasks which users of such networks face are essentially global and involve the network as a whole, the size of these networks is huge and the information available to users is only local. In this dissertation we show that even when faced with stringent locality constraints, one can still effectively solve prominent algorithmic problems on such networks.
In the first part of the dissertation we present a natural algorithmic framework designed to model the behaviour of an external agent trying to solve a network optimization problem with limited access to the network data. Our study focuses on local information algorithms --- sequential algorithms where the network topology is initially unknown and is revealed only within a local neighborhood of vertices that have been irrevocably added to the output set. We address both network coverage problems as well as network search problems.
Our results include local information algorithms for coverage problems whose performance closely matches the best possible even when information about network structure is unrestricted. We also demonstrate a sharp threshold on the level of visibility required: at a certain visibility level it is possible to design algorithms that nearly match the best approximation possible even with full access to the network structure, but with any less information it is impossible to achieve a reasonable approximation.
For preferential attachment networks, we obtain polylogarithmic approximations to the problem of finding the smallest subgraph that connects a subset of nodes and the problem of finding the highest-degree nodes. This is achieved by addressing a decade-old open question of Bollobás and Riordan on locally finding the root in a preferential attachment process.
In the second part of the dissertation we focus on designing highly time-efficient local algorithms for central mining problems on complex networks that have been a focus of the research community for over a decade: finding a small set of influential nodes in the network, and fast ranking of nodes. Among our results are an essentially runtime-optimal local algorithm for the influence maximization problem in the standard independent cascades model of information diffusion, and an essentially runtime-optimal local algorithm for the problem of returning all nodes with PageRank larger than a given threshold.
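To make the influence maximization setting concrete, here is a minimal sketch of the independent cascades model together with the classical global greedy baseline for seed selection (Kempe-Kleinberg-Tardos style). This is an illustration of the problem the abstract refers to, not the runtime-optimal local algorithm it describes; the graph representation, edge probability `p`, and trial counts are assumptions for the example.

```python
import random

def simulate_cascade(graph, seeds, p=0.1, rng=random):
    """One run of the independent cascades model: each newly activated
    node tries once to activate each inactive out-neighbor with
    probability p. Returns the number of activated nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and rng.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def greedy_influence_max(graph, k, p=0.1, trials=200, seed=0):
    """Classical greedy seed selection: repeatedly add the node with the
    largest estimated marginal spread. This is the standard global
    baseline; the dissertation's contribution is a *local* algorithm
    achieving essentially optimal runtime for the same objective."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        best, best_gain = None, -1.0
        for v in graph:
            if v in chosen:
                continue
            gain = sum(simulate_cascade(graph, chosen + [v], p, rng)
                       for _ in range(trials)) / trials
            if gain > best_gain:
                best, best_gain = v, gain
        chosen.append(best)
    return chosen
```

On a star graph, the greedy baseline correctly picks the hub as the single most influential seed, since its expected cascade dominates any leaf's.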
Our work demonstrates that locality is powerful enough to allow efficient solutions to many central algorithmic problems on complex networks.
Online Learning a Binary Labeling of a Graph
We investigate the problem of online learning of a binary labeling of the vertices of a given graph. We design an algorithm, Majority, to solve the problem and show its optimality on clique graphs. For general graphs we derive a relevant mistake bound that relates the algorithm's performance to the cut size (the number of edges between vertices with opposite labels) and the maximum independent set of the graph. We next introduce a novel complexity measure of the true labeling, the frontier, and relate the number of mistakes incurred by Majority to this measure. This allows us to show, in contrast to previously known approaches, that our algorithm works well even when the cut size is larger than the number of vertices. A detailed comparison with previous results is given.
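The abstract does not spell out the Majority rule; a minimal sketch under one natural reading — predict each arriving vertex's label by majority vote over its already-labeled neighbors, defaulting to +1 on ties or when no neighbor is labeled — looks like this. The tie-breaking convention and protocol details here are assumptions for illustration.

```python
def majority_predict(adjacency, labeled, v):
    """Predict v's binary label (+1/-1) as the majority among v's
    already-labeled neighbors; ties and unlabeled neighborhoods
    default to +1 (a hypothetical tie-breaking choice)."""
    votes = sum(labeled[u] for u in adjacency.get(v, ()) if u in labeled)
    return 1 if votes >= 0 else -1

def run_online(adjacency, true_labels, order):
    """Online protocol: vertices arrive in `order`; we predict, then
    the true label is revealed. Returns the number of mistakes."""
    labeled, mistakes = {}, 0
    for v in order:
        if majority_predict(adjacency, labeled, v) != true_labels[v]:
            mistakes += 1
        labeled[v] = true_labels[v]
    return mistakes
```

On a clique with a constant labeling, this rule makes at most one mistake (on the first vertex), consistent with the optimality on cliques claimed above.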
Multi-Scale Matrix Sampling and Sublinear-Time PageRank Computation
A fundamental problem arising in many applications in Web science and social network analysis is, given an arbitrary approximation factor c > 1, to output a set of nodes that with high probability contains all nodes of PageRank at least Δ, and no node of PageRank smaller than Δ/c. We call this problem {\sc SignificantPageRanks}. We develop a nearly optimal, local algorithm for the problem with runtime complexity Õ(n/Δ) on networks with n nodes. We show that any algorithm for solving this problem must have runtime of Ω(n/Δ), rendering our algorithm optimal up to logarithmic factors.
Our algorithm comes with two main technical contributions. The first is a multi-scale sampling scheme for a basic matrix problem that could be of interest on its own. In the abstract matrix problem it is assumed that one can access an unknown {\em right-stochastic matrix} by querying its rows, where the cost of a query and the accuracy of the answers depend on a precision parameter ε. At a cost proportional to 1/ε, the query will return a list of O(1/ε) entries and their indices that provide an ε-precision approximation of the row. Our task is to find a set that contains all columns whose sum is at least Δ, and omits any column whose sum is less than Δ/c. Our multi-scale sampling scheme solves this problem with cost Õ(1/Δ), while traditional sampling algorithms would take time Ω(1/Δ²).
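To illustrate the abstract matrix problem, here is a sketch of the naive Monte Carlo baseline that the multi-scale scheme improves on: query a uniformly random row, sample one column index from that row's distribution, and use hit frequencies to estimate column sums. The interface (`row_query` returning a full row) and the sample budget are assumptions for the example, not the paper's query model.

```python
import random

def estimate_column_sums(row_query, n, samples, rng=None):
    """Naive baseline: pick a uniformly random row of an n-by-n
    right-stochastic matrix, then sample a column index from that
    row's distribution (each row sums to 1). hits[j]/samples
    estimates s_j / n, where s_j is the j-th column sum. Resolving
    the threshold this way needs on the order of 1/precision^2
    samples, which the paper's multi-scale scheme avoids."""
    rng = rng or random.Random(0)
    hits = {}
    for _ in range(samples):
        row = row_query(rng.randrange(n))   # query one random row
        r = rng.random()                    # sample a column ~ row
        acc = 0.0
        for j, p in enumerate(row):
            acc += p
            if r < acc:
                hits[j] = hits.get(j, 0) + 1
                break
    return {j: n * h / samples for j, h in hits.items()}

def significant_columns(row_query, n, delta, samples=20000):
    """Return all columns whose estimated sum clears the threshold."""
    est = estimate_column_sums(row_query, n, samples)
    return {j for j, s in est.items() if s >= delta}
```

For a small matrix with one heavy column, the estimator reliably separates that column from the rest, at the cost of a sample budget far above what the multi-scale scheme would require.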
Our second main technical contribution is a new local algorithm for approximating personalized PageRank, which is more robust than the earlier ones developed in \cite{JehW03,AndersenCL06} and is highly efficient, particularly for networks with large in-degrees or out-degrees. Together with our multi-scale sampling scheme we are able to optimally solve the {\sc SignificantPageRanks} problem.
Comment: Accepted to Internet Mathematics journal for publication. An extended abstract of this paper appeared in WAW 2012 under the title "A Sublinear Time Algorithm for PageRank Computations".
Local Algorithms for Finding Interesting Individuals in Large Networks
We initiate the study of local, sublinear time algorithms for finding vertices with extreme topological properties — such as high degree or clustering coefficient — in large social or other networks. We introduce a new model, called the Jump and Crawl model, in which algorithms are permitted only two graph operations. The Jump operation returns a randomly chosen vertex, and is meant to model the ability to discover “new” vertices via keyword search in the Web, shared hobbies or interests in social networks such as Facebook, and other mechanisms that may return vertices that are distant from all those currently known. The Crawl operation permits an algorithm to explore the neighbors of any currently known vertex, and has clear analogues in many modern networks. We give both upper and lower bounds in the Jump and Crawl model for the problems of finding vertices of high degree and high clustering coefficient. We consider both arbitrary graphs, and specializations in which some common assumptions are made on the global topology (such as power law degree distributions or generation via preferential attachment). We also examine local algorithms for some related vertex or graph properties, and discuss areas for future investigation.
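A minimal sketch of the Jump and Crawl model is the simplest jump-only strategy for finding a high-degree vertex: spend the whole budget on Jump queries and keep the best vertex seen. The callback interface (`jump`, `degree`) is an assumption for illustration, and this is the trivial baseline in the model, not one of the paper's algorithms; under a heavy-tailed degree distribution even a modest number of jumps lands in the tail with good probability.

```python
import random

def find_high_degree(jump, degree, budget, rng=None):
    """Jump-only baseline in the Jump and Crawl model: `jump()` returns
    a uniformly random vertex and `degree(v)` probes v's neighborhood
    size (one Crawl-style access). Keep the highest-degree vertex
    seen over `budget` jumps."""
    best, best_deg = None, -1
    for _ in range(budget):
        v = jump()
        d = degree(v)
        if d > best_deg:
            best, best_deg = v, d
    return best, best_deg
```

On a star graph, a sufficiently large jump budget finds the hub with overwhelming probability, since every jump has a constant chance of hitting it.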
Private and Third-Party Randomization in Risk-Sensitive Equilibrium Concepts
We consider risk-sensitive generalizations of Nash and correlated equilibria in noncooperative games. We prove that, except for a class of degenerate games, a two-player game has a risk-sensitive Nash equilibrium only if it has a pure Nash equilibrium. We also show that every game has a risk-sensitive correlated equilibrium. The striking contrast between these existence results is due to the different sources of randomization in Nash equilibria (private randomization) and correlated equilibria (third-party randomization).
Impact of Adding a Single Allele in the 9p21 Locus to Traditional Risk Factors on Reclassification of Coronary Heart Disease Risk and Implications for Lipid-Modifying Therapy in the Atherosclerosis Risk in Communities Study
A single nucleotide polymorphism on chromosome 9p21, rs10757274 (9p21 allele), has been shown to predict coronary heart disease (CHD) in whites. We evaluated whether adding the 9p21 allele to traditional risk factors (RF) improved CHD risk prediction in whites from the Atherosclerosis Risk in Communities (ARIC) study, and whether changes in risk prediction would modify lipid therapy recommendations.
Arid1b haploinsufficient mice reveal neuropsychiatric phenotypes and reversible causes of growth impairment.
Sequencing studies have implicated haploinsufficiency of ARID1B, a SWI/SNF chromatin-remodeling subunit, in short stature (Yu et al., 2015), autism spectrum disorder (O'Roak et al., 2012), intellectual disability (Deciphering Developmental Disorders Study, 2015), and corpus callosum agenesis (Halgren et al., 2012). In addition, ARID1B is the most common cause of Coffin-Siris syndrome, a developmental delay syndrome characterized by some of the above abnormalities (Santen et al., 2012; Tsurusaki et al., 2012; Wieczorek et al., 2013). We generated Arid1b heterozygous mice, which showed social behavior impairment, altered vocalization, anxiety-like behavior, neuroanatomical abnormalities, and growth impairment. In the brain, Arid1b haploinsufficiency resulted in changes in the expression of SWI/SNF-regulated genes implicated in neuropsychiatric disorders. A focus on reversible mechanisms identified insulin-like growth factor 1 (IGF1) deficiency with inadequate compensation by growth hormone-releasing hormone (GHRH) and growth hormone (GH), underappreciated findings in ARID1B patients. Therapeutically, GH supplementation was able to correct growth retardation and muscle weakness. This model functionally validates the involvement of ARID1B in human disorders, and allows mechanistic dissection of neurodevelopmental diseases linked to chromatin remodeling.
ASSOCIATIONS BETWEEN A SINGLE NUCLEOTIDE POLYMORPHISM IN CHROMOSOME 9P21 AND ARTERIAL STIFFNESS IN THE ATHEROSCLEROSIS RISK IN COMMUNITIES STUDY