3,110 research outputs found

    Multiresolution community detection for megascale networks by information-based replica correlations

    Full text link
    We use a Potts model community detection algorithm to accurately and quantitatively evaluate the hierarchical or multiresolution structure of a graph. Our multiresolution algorithm calculates correlations among multiple copies ("replicas") of the same graph over a range of resolutions. Significant multiresolution structures are identified by strongly correlated replicas. The average normalized mutual information, the variation of information, and other measures in principle give a quantitative estimate of the "best" resolutions and indicate the relative strength of the structures in the graph. Because the method is based on information comparisons, it can in principle be used with any community detection model that can examine multiple resolutions. Our approach may be extended to other optimization problems. As a local measure, our Potts model avoids the "resolution limit" that affects other popular models. With this model, our community detection algorithm has an accuracy that ranks among the best of currently available methods. Using it, we can examine graphs over 40 million nodes and more than one billion edges. We further report that the multiresolution variant of our algorithm can solve systems of at least 200000 nodes and 10 million edges on a single processor with exceptionally high accuracy. For typical cases, we find a super-linear scaling, O(L^{1.3}) for community detection and O(L^{1.3} log N) for the multiresolution algorithm where L is the number of edges and N is the number of nodes in the system.Comment: 19 pages, 14 figures, published version with minor change

    Distributed Quantum Proofs for Replicated Data

    Get PDF
    This paper tackles the issue of checking that all copies of a large data set replicated at several nodes of a network are identical. The fact that the replicas may be located at distant nodes prevents the system from verifying their equality locally, i.e., by having each node consult only nodes in its vicinity. On the other hand, it remains possible to assign certificates to the nodes, so that verifying the consistency of the replicas can be achieved locally. However, we show that, as the replicated data is large, classical certification mechanisms, including distributed Merlin-Arthur protocols, cannot guarantee good completeness and soundness simultaneously, unless they use very large certificates. The main result of this paper is a distributed quantum Merlin-Arthur protocol enabling the nodes to collectively check the consistency of the replicas, based on small certificates, and in a single round of message exchange between neighbors, with short messages. In particular, the certificate-size is logarithmic in the size of the data set, which gives an exponential advantage over classical certification mechanisms. We propose yet another usage of a fundamental quantum primitive, called the SWAP test, in order to show our main result

    Service discovery in a peer-to-peer environment for computational grids

    Get PDF
    Les grilles de calcul sont des systÚmes distribués dont l'objectif est l'agrégation et le partage de ressources hétérogÚnes géographiquement réparties pour le calcul haute performance. Les services d'une grille sont l'ensemble des applicatifs que des serveurs mettent à disposition des clients. Une problématique largement soulevée par les utilisateurs de grilles est la découverte de services. Les mécanismes actuels de découverte de services manquent de fonctionnalités et deviennent inefficaces dans des environnements dynamiques à large échelle. Il est donc indispensable de proposer de nouveaux outils pour de tels environnements. Les technologies pair-à-pair émergentes fournissent des algorithmes permettant une décentralisation totale de la construction et de la maintenance de systÚmes distribués performants et tolérants aux pannes. Le problÚme que l'on cherche à résoudre est de permettre une découverte flexible (recherche multicritÚres, complétion automatique) des services dans des grilles prenant place dans un environnement dynamique à large échelle (pair-à-pair) en tenant compte de la topologie du réseau physique sous-jacent

    Local multiresolution order in community detection

    Full text link
    Community detection algorithms attempt to find the best clusters of nodes in an arbitrary complex network. Multi-scale ("multiresolution") community detection extends the problem to identify the best network scale(s) for these clusters. The latter task is generally accomplished by analyzing community stability simultaneously for all clusters in the network. In the current work, we extend this general approach to define local multiresolution methods, which enable the extraction of well-defined local communities even if the global community structure is vaguely defined in an average sense. Toward this end, we propose measures analogous to variation of information and normalized mutual information that are used to quantitatively identify the best resolution(s) at the community level based on correlations between clusters in independently-solved systems. We demonstrate our method on two constructed networks as well as a real network and draw inferences about local community strength. Our approach is independent of the applied community detection algorithm save for the inherent requirement that the method be able to identify communities across different network scales, with appropriate changes to account for how different resolutions are evaluated or defined in a particular community detection method. It should, in principle, easily adapt to alternative community comparison measures.Comment: 19 pages, 11 figure

    Using Permuted States of Validated Simulation to Analyze Conflict Rates in Optimistic Replication

    Get PDF
    Optimistic replication provides high data availability in the presence of network outages. Although widely deployed, this relaxed consistency model introduces concurrent updates, whose behavior is poorly understood due to the vast state space. This paper introduces the notion of permuted states to eliminate system states that are redundant and unreachable, which can constitute the majority of states (4069 out of 4096 for four replicas). With the aid of permuted states, we are for the first time able to construct analytical models beyond the two-replica case. By examining the analysis for 2 to 4 replicas, we can demystify the process of forming identical conflicts—the most common conflict type at high replication factors. Additionally, we have automated and optimized the generation of permuted states, which allows us to explore higher replication factors (up to 10 replicas) using hybrid techniques. It also allows us to validate our results with existing simulations based on actual replication mechanisms, which previously were analytically validated with only one pair of replicas. Finally, we have discovered that update locality and bimodal access patterns are the primary factors contributing to the formation of identical conflicts

    Model-Based Dynamic Resource Management for Service Oriented Clouds

    Get PDF
    Cloud computing is a flexible platform for software as a service, as more and more applications are deployed on cloud. Major challenges in cloud include how to characterize the workload of the applications and how to manage the cloud resources efficiently by sharing them among many applications. The current state of the art considers a simplified model of the system, either ignoring the software components altogether or ignoring the relationship between individual software services. This thesis considers the following resource management problems for cloud-based service providers: (i) how to estimate the parameters of the current workload, (ii) how to meet Quality of Service (QoS) targets while minimizing infrastructure cost, (iii) how to allocate resources considering performance costs of virtual machine reconfigurations. To address the above problems, we propose a model-based feedback loop approach. The cloud infrastructure, the services, and the applications are modelled using Layered Queuing Models (LQM). These models are then optimized. Mathematical techniques are used to reduce the complexity of the models and address the scalability issues. The main contributions of this thesis are: (i) Extended Kalman Filter (EKF) based techniques improved by dynamic clustering for scalable estimation of workload parameters, (ii) combination of adaptive empirical models (tuned during runtime) and stepwise optimizations for improving the overall allocation performance, (iii) dynamic service placement algorithms that consider the cost of virtual machine reconfiguration

    Dynamic Maintenance of Majority Information in Constant Time per Update

    Get PDF
    We show how to maintain information about the existence of amajority colour in a set of elements under insertion and deletion ofsingle elements using O(1) time and at most 4 equality tests on coloursper update. No ordering information is used
    • 

    corecore