634 research outputs found

    Modelling Citation Networks

    Full text link
    The distribution of the number of academic publications as a function of citation count for a given year is remarkably similar from year to year. We measure this similarity as a width of the distribution and find it to be approximately constant from year to year. We show that simple citation models fail to capture this behaviour. We then provide a simple three parameter citation network model using a mixture of local and global search processes which can reproduce the correct distribution over time. We use the citation network of papers from the hep-th section of arXiv to test our model. For this data, around 20% of citations use global information to reference recently published papers, while the remaining 80% are found using local searches. We note that this is consistent with other studies though our motivation is very different from previous work. Finally, we also find that the fluctuations in the size of an academic publication's bibliography is important for the model. This is not addressed in most models and needs further work.Comment: 29 pages, 22 figure

    A survey of random processes with reinforcement

    Full text link
    The models surveyed include generalized P\'{o}lya urns, reinforced random walks, interacting urn models, and continuous reinforced processes. Emphasis is on methods and results, with sketches provided of some proofs. Applications are discussed in statistics, biology, economics and a number of other areas.Comment: Published at http://dx.doi.org/10.1214/07-PS094 in the Probability Surveys (http://www.i-journals.org/ps/) by the Institute of Mathematical Statistics (http://www.imstat.org

    On the information leakage of differentially-private mechanisms

    Get PDF
    International audienceDifferential privacy aims at protecting the privacy of participants instatistical databases. Roughly, a mechanism satisfies differential privacy ifthe presence or value of a single individual in the database does notsignificantly change the likelihood of obtaining a certain answer to anystatistical query posed by a data analyst. Differentially-private mechanisms areoften oblivious: first the query is processed on the database to produce a trueanswer, and then this answer is adequately randomized before being reported tothe data analyst. Ideally, a mechanism should minimize leakage, i.e., obfuscateas much as possible the link between reported answers and individuals' data,while maximizing utility, i.e., report answers as similar as possible to thetrue ones. These two goals, however, are in conflict with each other, thusimposing a trade-off between privacy and utility.In this paper we use quantitative information flow principles to analyze leakageand utility in oblivious differentially-private mechanisms. We introduce atechnique that exploits graph symmetries of the adjacency relation on databasesto derive bounds on the min-entropy leakage of the mechanism. We consider anotion of utility based on identity gain functions, which is closely related tomin-entropy leakage, and we derive bounds for it. Finally, given some graphsymmetries, we provide a mechanism that maximizes utility while preserving therequired level of differential privacy
    • …
    corecore