Modelling Citation Networks
The distribution of the number of academic publications as a function of
citation count for a given year is remarkably similar from year to year. We
measure this similarity as the width of the distribution and find it to be
approximately constant from year to year. We show that simple citation models
fail to capture this behaviour. We then provide a simple three-parameter
citation network model using a mixture of local and global search processes
which can reproduce the correct distribution over time. We use the citation
network of papers from the hep-th section of arXiv to test our model. For this
data, around 20% of citations use global information to reference recently
published papers, while the remaining 80% are found using local searches. We
note that this is consistent with other studies though our motivation is very
different from previous work. Finally, we also find that the fluctuations in
the size of an academic publication's bibliography are important for the model.
This is not addressed in most models and needs further work.
Comment: 29 pages, 22 figures
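The growth process described above can be sketched in a few lines. This is a minimal illustrative simulation, not the paper's calibrated model: the parameter values, the uniform choice of bibliography size, and the uniform global search are all assumptions made here for concreteness.

```python
import random

def grow_network(n_papers, p_global=0.2, max_bib=10, seed=0):
    """Grow a citation network: each new paper cites earlier papers via a
    mixture of global search (pick any existing paper) and local search
    (follow a reference of a paper already in its bibliography)."""
    rng = random.Random(seed)
    refs = {0: []}  # paper id -> list of cited paper ids
    for new in range(1, n_papers):
        bib_size = rng.randint(1, max_bib)  # fluctuating bibliography size
        cited = set()
        while len(cited) < min(bib_size, new):
            if not cited or rng.random() < p_global:
                # global search: reference any previously published paper
                cited.add(rng.randrange(new))
            else:
                # local search: follow a reference of an already-cited paper
                source = rng.choice(sorted(cited))
                pool = refs[source]
                cited.add(rng.choice(pool) if pool else rng.randrange(new))
        refs[new] = sorted(cited)
    return refs
```

Tallying in-degrees of the resulting network (e.g. with `collections.Counter`) gives the citation-count distribution whose width the abstract discusses; the local-search branch is what generates the heavy tail, since references of well-cited papers are followed disproportionately often.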
A survey of random processes with reinforcement
The models surveyed include generalized P\'{o}lya urns, reinforced random
walks, interacting urn models, and continuous reinforced processes. Emphasis is
on methods and results, with sketches provided of some proofs. Applications are
discussed in statistics, biology, economics and a number of other areas.
Comment: Published at http://dx.doi.org/10.1214/07-PS094 in the Probability Surveys (http://www.i-journals.org/ps/) by the Institute of Mathematical Statistics (http://www.imstat.org)
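The generalized Pólya urn mentioned first in the survey is simple enough to simulate directly. A minimal sketch of the classic two-colour case (reinforcement of one extra ball per draw, an assumption of this example rather than the survey's general setup):

```python
import random

def polya_urn(steps, red=1, blue=1, seed=0):
    """Classic Polya urn: draw a ball with probability proportional to the
    current colour counts, then return it with one extra ball of its colour."""
    rng = random.Random(seed)
    for _ in range(steps):
        if rng.random() < red / (red + blue):
            red += 1   # drew red: reinforce red
        else:
            blue += 1  # drew blue: reinforce blue
    return red, blue
```

Across independent runs the fraction of red balls converges to a random limit with a Beta(red, blue) distribution rather than to a fixed constant, which is the hallmark of reinforcement that the surveyed models generalize.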
On the information leakage of differentially-private mechanisms
Differential privacy aims at protecting the privacy of participants in statistical databases. Roughly, a mechanism satisfies differential privacy if the presence or value of a single individual in the database does not significantly change the likelihood of obtaining a certain answer to any statistical query posed by a data analyst. Differentially-private mechanisms are often oblivious: first the query is processed on the database to produce a true answer, and then this answer is adequately randomized before being reported to the data analyst. Ideally, a mechanism should minimize leakage, i.e., obfuscate as much as possible the link between reported answers and individuals' data, while maximizing utility, i.e., report answers as similar as possible to the true ones. These two goals, however, are in conflict with each other, thus imposing a trade-off between privacy and utility.

In this paper we use quantitative information flow principles to analyze leakage and utility in oblivious differentially-private mechanisms. We introduce a technique that exploits graph symmetries of the adjacency relation on databases to derive bounds on the min-entropy leakage of the mechanism. We consider a notion of utility based on identity gain functions, which is closely related to min-entropy leakage, and we derive bounds for it. Finally, given some graph symmetries, we provide a mechanism that maximizes utility while preserving the required level of differential privacy.
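The oblivious pattern the abstract describes (true answer first, then randomization) is exemplified by the standard Laplace mechanism. The sketch below shows that textbook mechanism, not the paper's graph-symmetry construction; the function names and parameters are chosen here for illustration.

```python
import random

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """Oblivious mechanism: compute the true answer on the database, then
    add Laplace noise with scale sensitivity/epsilon before reporting it."""
    scale = sensitivity / epsilon
    # a Laplace(0, scale) variate is the scaled difference of two
    # independent Exp(1) variates
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_answer + noise
```

The trade-off in the abstract is visible in the scale parameter: a smaller epsilon (stronger privacy) widens the noise, making reported answers less similar to the true one and hence less useful.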