56 research outputs found

    A high-level and scalable approach for generating scale-free graphs using active objects

    The Barabási-Albert (BA) model is designed to generate scale-free networks using the preferential attachment (PA) mechanism. In the PA model, new nodes are introduced to the network sequentially, and each attaches preferentially to existing nodes. PA is a classical model with a natural intuition, great explanatory power and a simple mechanism, and is therefore widely used for network generation. However, the sequential mechanism of the PA model makes it an inefficient algorithm. The existing parallel approaches, on the other hand, either change the original model or rely on explicit, complex low-level synchronization mechanisms. In this paper we investigate a high-level Actor-based model of the parallel algorithm of network generation and its scalable multicore implementation in Haskell.
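The sequential preferential attachment step described in this abstract can be sketched as follows. This is a minimal illustration of the classical sequential model only, not the paper's Actor-based parallel algorithm or its Haskell implementation. The function name `preferential_attachment`, the parameters `n`, `m` and `seed`, and the choice of a complete seed graph on `m + 1` nodes are all assumptions made for the example.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Sequentially grow a graph on n nodes; each new node adds m edges.

    Hypothetical helper for illustration: returns a list of (new, old) edges.
    """
    if m < 1 or n <= m:
        raise ValueError("need n > m >= 1")
    rng = random.Random(seed)
    # Assumed seed graph: a complete graph on m + 1 nodes, so that every
    # starting node already has non-zero degree.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `targets` once per incident edge, so a uniform
    # draw from this list selects a node with probability proportional
    # to its current degree.
    targets = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:  # m distinct existing nodes
            chosen.add(rng.choice(targets))
        for old in sorted(chosen):
            edges.append((new, old))
            targets.extend((new, old))
    return edges


# Usage: build a small graph and inspect the largest degree.
graph = preferential_attachment(1000, 2, seed=42)
degree = {}
for a, b in graph:
    degree[a] = degree.get(a, 0) + 1
    degree[b] = degree.get(b, 0) + 1
print(len(graph), max(degree.values()))
```

Every new node depends on the degrees produced by all earlier nodes, which is the sequential dependency the abstract identifies as the source of inefficiency.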

    Asynchronous programming in the abstract behavioural specification language

    Chip manufacturers are rapidly moving towards so-called manycore chips with thousands of independent processors on the same silicon real estate. Current programming languages can only leverage the potential power by inserting code with low-level concurrency constructs, sacrificing clarity. Alternatively, a programming language can integrate a thread of execution with a stable notion of identity, e.g., in active objects. Abstract Behavioural Specification (ABS) is a language for designing executable models of parallel and distributed object-oriented systems based on active objects, and is defined in terms of a formal operational semantics which enables a variety of static and dynamic analysis techniques for the ABS models. The overall goal of this thesis is to extend the asynchronous programming model and the corresponding analysis techniques in ABS. Algorithms and the Foundations of Software technolog

    Topology and dynamics of an artificial genetic regulatory network model

    This thesis presents some of the methods of studying models of regulatory networks using mathematical and computational formalisms. A basic review of the biology behind gene regulation is introduced along with the formalisms used for modelling networks of such regulatory interactions. Topological measures of large-scale complex networks are discussed and then applied to a specific artificial regulatory network model created through a duplication and divergence mechanism. Such networks share topological features with natural transcriptional regulatory networks. Thus, the topologies found in natural networks may be primarily due to their method of creation rather than being exclusively shaped by subsequent evolution under selection. The evolvability of the dynamics of these networks is also examined by evolving networks in simulation to obtain three simple types of output dynamics. The networks obtained from this process show a wide variety of topologies and numbers of genes, indicating that it is relatively easy to evolve these classes of dynamics in this model.

    Analyzing and Processing Big Real Graphs

    As fundamental abstractions of network structures, graphs are everywhere, ranging from biological protein interaction networks and Internet routing networks to emerging online social networks. Studying graphs is critical to understanding the fundamental processes behind the networks, and of practical importance in experimental research. Although many studies on graphs have been carried out over the decades, most of the work focused on small or synthetic graphs. In recent years, because of the unprecedented growth of existing networks and the emergence of new complex networks, more and more big real graphs are becoming available. Compared to the graphs studied in prior work, the graphs from these networks are significantly different in scale, level of dynamics and structure. In this dissertation, we tackle three important graph research problems caused by the significant differences of the big real graphs: efficient node distance computation, graph dynamic analysis and modeling, and graph privacy. First, we target a fundamental graph analysis problem, i.e., node distance computation. As a primitive of graph analysis and network applications, the computation of shortest path or random walk distances is computationally expensive, and difficult to scale with the sheer size of big real graphs. To address the scalability issue, we design a novel node distance computation method, named graph coordinate systems, to efficiently estimate node distances with high accuracy. Our second work is to understand and model the dynamic processes in big real graphs. Specifically, we propose methods to analyze graph dynamics at multiple network scales and explore temporal properties of network growth. Through measurements on the first two years of Renren's dynamic data, we find independent and predictable processes at different network levels, and detect self-similar properties in its edge creation process. Based on these observations, we propose a new dynamic graph model to capture both temporal and spatial properties. Calibrated with the Renren dataset, our model successfully produces synthetic graphs showing similar dynamic properties. Finally, to address the privacy issue in sharing graphs, we design a graph privacy system to guarantee the required level of privacy. The goal of our work is to design a system that can both maintain a meaningful graph structure and provide a strong privacy guarantee. To navigate the tradeoff between the strength of privacy and graph structure utility, we propose a differentially-private graph model. Our rigorous proof shows that the graphs produced by the system can achieve the required level of privacy. By running the system on real graphs collected from Facebook, the Internet, and the Web, we demonstrate that the generated synthetic graphs match the original graphs in terms of graph structural metrics and application-level performance. In summary, to analyze and process the graphs from today's large complex networks, we work on three important problems: efficiently computing node distances in massive graphs, analyzing and modeling the high volume of dynamics in big real graphs, and protecting graph privacy in sharing graphs. We propose novel solutions to address these problems. Through our extensive experiments, we show that our designs perform consistently well on big real graphs.

    Sampling designs and robustness for the analysis of network data

    This manuscript addresses three new practical methodologies for topics on Bayesian analysis regarding sampling designs and robustness on network data: / In the first part of this thesis we propose a general approach for comparing sampling designs. The approach is based on the concept of data compression from information theory. The criterion for comparing sampling designs is formulated so that the results prove to be robust with respect to some of the most widely used loss functions for point estimation and prediction. The rationale behind the proposed approach is to find sampling designs that preserve the largest possible amount of information from the original data generating mechanism. The approach is inspired by the same principle as the reference prior, with the difference that, for the proposed approach, the argument of the optimization is the sampling design rather than the prior. The information contained in the data generating mechanism can be encoded in a distribution defined either in the parameter space (posterior distribution) or in the space of observables (predictive distribution). The results obtained in this part enable us to relate statements about a feature of an observed subgraph to statements about a feature of the full graph. It is proven that such statements cannot be connected by invoking conditional statements only; it is necessary to specify a joint distribution for the random graph model and the sampling design for all values of fully and partially observed random network features. We use this rationale to formulate statements at the level of the sampled graph that help to make non-trivial statements about the full network. The joint distribution of the underlying network and the sampling mechanism enables the statistician to relate both types of conditional statements. Thus, the joint distribution of partially and fully observed random network features is considered, and useful statements for practitioners are provided. / The second general theme of this thesis is robustness on networks. A method for robustness on exchangeable random networks is developed. The approach is inspired by the concept of graphon approximation through a stochastic block model. An exchangeable model is assumed in order to infer a feature of a random network, with the objective of seeing how the quality of that inference degrades if the model is slightly modified. Decision theory methods are considered under model misspecification by quantifying the stability of optimal actions under perturbations to the approximating model within a well-defined neighborhood of model space. The approach is inspired by recent developments on robustness in the robust control, macroeconomics and financial mathematics literature. / In all topics, simulation analysis is complemented with comprehensive experimental studies, which show the benefits of our modeling and estimation methods.

    Analysis of category co-occurrence in Wikipedia networks

    Wikipedia has seen a huge expansion of content since its inception. Pages within this online encyclopedia are organised by assigning them to one or more categories, where Wikipedia maintains a manually constructed taxonomy graph that encodes the semantic relationship between these categories. An alternative, called the category co-occurrence graph, can be produced automatically by linking together categories that have pages in common. Properties of the latter graph and its relationship to the former are the concern of this thesis. The analytic framework, called t-component, is introduced to formalise the graphs and discover category clusters connecting relevant categories together. The m-core, a cohesive subgroup concept used as a clustering model, is used to construct a subgraph depending on the number of shared pages between the categories exceeding a given threshold t. The significance of the clustering result of the m-core is validated using a permutation test. This is compared to the k-core, another clustering model. The Wikipedia category co-occurrence graphs are scale-free with a few category hubs, and the majority of clusters are of size 2. All observed properties for the distribution of the largest clusters of the category graphs obey power laws with decay exponent averages around 1. As the threshold t of the number of shared pages is increased, eventually a critical threshold is reached when the largest cluster shrinks significantly in size. This phenomenon is only exhibited for the m-core but not the k-core. Lastly, the clustering in the category graph is shown to be consistent with the distance between categories in the taxonomy graph.
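The construction of a category co-occurrence graph and its thresholding by shared pages can be illustrated with a short sketch. This is a simplified stand-in that takes connected components of the thresholded graph; it is not the thesis's t-component framework or its m-core definition, which the abstract does not spell out. The function names, the toy `pages` mapping, and the strict comparison `w > t` (read from "exceeding a given threshold t") are assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(page_categories):
    """Count, for every pair of categories, the number of pages they share."""
    weights = Counter()
    for categories in page_categories.values():
        for a, b in combinations(sorted(set(categories)), 2):
            weights[(a, b)] += 1
    return weights


def clusters_above_threshold(weights, t):
    """Connected components after keeping edges with more than t shared pages."""
    adjacency = {}
    for (a, b), w in weights.items():
        if w > t:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    seen, clusters = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical toy data: page -> categories assigned to it.
pages = {
    "p1": ["Graphs", "Networks"],
    "p2": ["Graphs", "Networks", "Algorithms"],
    "p3": ["Graphs", "Algorithms"],
    "p4": ["Biology", "Genetics"],
    "p5": ["Biology", "Genetics"],
}
weights = cooccurrence_weights(pages)
print(clusters_above_threshold(weights, 1))
# → [['Algorithms', 'Graphs', 'Networks'], ['Biology', 'Genetics']]
print(clusters_above_threshold(weights, 2))
# → []
```

Raising `t` from 1 to 2 removes every edge in the toy data, a small-scale analogue of the largest cluster shrinking as the threshold grows.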