
    The Zipf-Polylog distribution: Modeling human interactions through social networks

    The Zipf distribution attracts considerable attention because it helps describe data from natural as well as man-made systems. Nevertheless, in most cases the Zipf distribution fits the data only in the upper tail. This is why it is important to have Zipf extensions that fit the data over their entire range. In this paper, we introduce the Zipf-Polylog family of distributions as a two-parameter generalization of the Zipf. The extended family contains the Zipf, the geometric, the logarithmic series and the shifted negative binomial with two successes as particular cases. We deduce important properties of the new family and demonstrate its suitability by analyzing the degree sequences of two real networks over their entire range. Peer Reviewed. Postprint (author's final draft).
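The special cases named in this abstract can be checked numerically. The sketch below assumes the usual form of the Zipf-Polylog probability mass function, P(X = x) = β^x x^(-α) / Li_α(β) for x = 1, 2, ..., where Li_α is the polylogarithm; the abstract itself does not state the formula, so treat that form, the function names and the truncation length as assumptions. The normalizing constant is approximated by a truncated series, which is adequate for β < 1 but converges slowly in the pure Zipf case β = 1.

```python
import math


def polylog_series(alpha, beta, terms=10_000):
    """Truncated series for Li_alpha(beta) = sum_{k>=1} beta**k / k**alpha.

    Assumption: 0 < beta < 1, so the tail beyond `terms` is negligible.
    """
    return sum(beta**k / k**alpha for k in range(1, terms + 1))


def zipf_polylog_pmf(x, alpha, beta, terms=10_000):
    """Assumed Zipf-Polylog pmf on x = 1, 2, ... (hypothetical helper)."""
    return beta**x / x**alpha / polylog_series(alpha, beta, terms)


def geometric_pmf(x, beta):
    """Geometric distribution on x = 1, 2, ... (the alpha = 0 case)."""
    return (1 - beta) * beta ** (x - 1)


def log_series_pmf(x, beta):
    """Logarithmic series distribution (the alpha = 1 case)."""
    return -(beta**x) / (x * math.log(1 - beta))


def shifted_negbin2_pmf(x, beta):
    """Negative binomial with two successes, shifted to start at 1 (alpha = -1)."""
    return x * (1 - beta) ** 2 * beta ** (x - 1)


if __name__ == "__main__":
    beta = 0.3
    for x in (1, 2, 5):
        print(x,
              zipf_polylog_pmf(x, 0, beta), geometric_pmf(x, beta),
              zipf_polylog_pmf(x, 1, beta), log_series_pmf(x, beta),
              zipf_polylog_pmf(x, -1, beta), shifted_negbin2_pmf(x, beta))
```

Each pair of printed columns should agree to floating-point precision. For serious work a library polylogarithm would be preferable to the truncated sum.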

    Pareto and Zipf laws for city size distribution

    Pareto and Zipf distributions have been used to model distinct phenomena in biology, demography, computer science and economics, among others. This paper presents a short review of applications of these distributions to city sizes.

    A review of power laws in real life phenomena

    Power law distributions, also known as heavy tail distributions, model distinct real life phenomena in the areas of biology, demography, computer science, economics, information theory, language, and astronomy, among others. This paper presents a review of the literature, with a focus on applications and possible explanations for the use of power laws in real phenomena. We also unravel some controversies around power laws.

    MGG: Accelerating Graph Neural Networks with Fine-grained intra-kernel Communication-Computation Pipelining on Multi-GPU Platforms

    The increasing size of input graphs for graph neural networks (GNNs) highlights the demand for using multi-GPU platforms. However, existing multi-GPU GNN systems optimize the computation and communication individually based on the conventional practice of scaling dense DNNs. For irregularly sparse and fine-grained GNN workloads, such solutions miss the opportunity to jointly schedule/optimize the computation and communication operations for high-performance delivery. To this end, we propose MGG, a novel system design to accelerate full-graph GNNs on multi-GPU platforms. The core of MGG is its novel dynamic software pipeline to facilitate fine-grained computation-communication overlapping within a GPU kernel. Specifically, MGG introduces GNN-tailored pipeline construction and GPU-aware pipeline mapping to facilitate workload balancing and operation overlapping. MGG also incorporates an intelligent runtime design with analytical modeling and optimization heuristics to dynamically improve the execution performance. Extensive evaluation reveals that MGG outperforms state-of-the-art full-graph GNN systems across various settings: on average 4.41X, 4.81X, and 10.83X faster than DGL, MGG-UVM, and ROC, respectively.

    A Comparative Study Between Two Discrete Lindley Distributions

    Methods for generating a probability function from a probability density function have been widely used in recent years. In general, the discretization process produces probability functions that can rival the traditional distributions used in the analysis of count data, such as the geometric, the Poisson and the negative binomial distributions. In this paper, using the method based on an infinite series, we study an alternative discrete Lindley distribution to those studied in Gomez (2011) and Bakouch (2014). For both distributions, a simulation study is carried out to examine the bias and mean squared error of the maximum likelihood estimators of the parameters as well as the coverage probability and the width of the confidence intervals. For the discrete Lindley distribution obtained by the infinite series method, we present the analytical expression for bias reduction of the maximum likelihood estimator. Some examples using real data from the literature show the potential of these distributions.
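As a rough illustration of the infinite series method mentioned in this abstract, the sketch below discretizes the continuous Lindley density f(x; θ) = θ²/(1+θ) · (1+x) · e^(-θx) by setting P(X = x) = f(x) / Σ_j f(j) over x = 0, 1, 2, .... The support starting at 0, the parametrization and all function names are assumptions, since the abstract does not give the paper's exact definition. Under these assumptions the geometric sums give the closed form (1 - q)² (1 + x) q^x with q = e^(-θ), which the code compares against the truncated numerical normalization.

```python
import math


def lindley_pdf(x, theta):
    """Continuous Lindley density with parameter theta > 0."""
    return theta**2 / (1 + theta) * (1 + x) * math.exp(-theta * x)


def discrete_lindley_pmf(x, theta, terms=5000):
    """Infinite series discretization: f(x) / sum_{j>=0} f(j).

    Assumption: support is 0, 1, 2, ...; the series is truncated at `terms`.
    """
    norm = sum(lindley_pdf(j, theta) for j in range(terms))
    return lindley_pdf(x, theta) / norm


def closed_form_pmf(x, theta):
    """Closed form under the same assumptions: (1 - q)^2 (1 + x) q^x, q = exp(-theta)."""
    q = math.exp(-theta)
    return (1 - q) ** 2 * (1 + x) * q**x


if __name__ == "__main__":
    theta = 0.5
    for x in range(5):
        print(x, discrete_lindley_pmf(x, theta), closed_form_pmf(x, theta))
```

The two columns should match to floating-point precision, and the probabilities sum to one. Other discretization routes, such as differencing the survival function, generally produce a different probability function from the same density.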

    A Behavior-Based Approach To Securing Email Systems

    The Malicious Email Tracking (MET) system, reported in a prior publication, is a behavior-based security system for email services. The Email Mining Toolkit (EMT) presented in this paper is an offline data mining and analysis system for email archives, designed to assist in computing models of malicious email behavior for deployment in an online MET system. EMT includes a variety of behavior models for email attachments, user accounts and groups of accounts. Each computed model is used to detect anomalous and errant email behaviors. We report on the set of features implemented in the current version of EMT, and describe tests of the system and our plans for extensions to the set of models.