3,883 research outputs found

    Benford's Law, Values of L-functions and the 3x+1 Problem

    We show the leading digits of a variety of systems satisfying certain conditions follow Benford's Law. For each system, proving this involves two main ingredients. One is a structure theorem of the limiting distribution, specific to the system. The other is a general technique of applying Poisson Summation to the limiting distribution. We show the distribution of values of L-functions near the central line and (in some sense) the iterates of the 3x+1 Problem are Benford. Comment: 25 pages, 1 figure; replacement of earlier draft (corrected some typos, added more exposition, added results for characteristic polynomials of unitary matrices)
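As a small illustration of what "follows Benford's Law" means, the sketch below compares the Benford probabilities log10(1 + 1/d) with the empirical leading-digit frequencies of the powers of 2. The powers of 2 are a standard textbook example chosen here for simplicity; they are an assumption of this sketch and not one of the systems treated in the paper, and the function names are hypothetical.

```python
import math
from collections import Counter


def benford_prob(d):
    """Benford probability that the leading decimal digit equals d (1..9)."""
    return math.log10(1 + 1 / d)


def leading_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def leading_digit_freqs(values):
    """Empirical frequency of each leading digit 1..9 in `values`."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Illustrative data set (an assumption of this sketch): 2^1, ..., 2^1000.
powers = [2 ** n for n in range(1, 1001)]
freqs = leading_digit_freqs(powers)

for d in range(1, 10):
    print(d, round(benford_prob(d), 4), round(freqs[d], 4))
```

Each printed row gives the digit, its Benford probability, and the observed frequency, so the two columns can be compared directly.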

    Monte Carlo Methods for Top-k Personalized PageRank Lists and Name Disambiguation

    We study the problem of quickly detecting top-k Personalized PageRank lists. This problem has a number of important applications, such as finding local cuts in large graphs, estimating similarity distance and name disambiguation. In particular, we apply our results to construct efficient algorithms for the person name disambiguation problem. We argue that two observations are important when finding top-k Personalized PageRank lists. Firstly, it is crucial to detect the top-k most important neighbours of a node quickly, while the exact order within the top-k list and the exact PageRank values are far less crucial. Secondly, a small number of wrong elements in a top-k list does not really degrade the quality of the list, while tolerating them can lead to significant computational savings. Based on these two key observations, we propose Monte Carlo methods for fast detection of top-k Personalized PageRank lists. We provide a performance evaluation of the proposed methods and supply stopping criteria. Then, we apply the methods to the person name disambiguation problem. The developed algorithm for the person name disambiguation problem achieved second place in the WePS 2010 competition.
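The general idea of estimating a top-k Personalized PageRank list by simulation can be sketched as follows. This is a generic random-walk estimator written for illustration, not the authors' exact algorithms or stopping criteria: walks start at the seed node, terminate at each step with probability `c`, and nodes are ranked by visit counts. The toy graph, the parameter values and all names are assumptions of this sketch.

```python
import random
from collections import Counter

# Hypothetical undirected toy graph: a triangle a-b-c with a tail c-d-e-f.
GRAPH = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}


def top_k_ppr(graph, seed_node, k, n_walks=2000, c=0.15, rng_seed=0):
    """Rank nodes by visit counts of random walks started at seed_node.

    Each walk stops with probability c before every move, so visit counts
    are proportional (in expectation) to Personalized PageRank scores.
    The seed node itself is excluded from the returned list.
    """
    rng = random.Random(rng_seed)
    visits = Counter()
    for _ in range(n_walks):
        node = seed_node
        visits[node] += 1
        while rng.random() >= c:  # continue with probability 1 - c
            node = rng.choice(graph[node])
            visits[node] += 1
    ranked = [n for n, _ in visits.most_common() if n != seed_node]
    return ranked[:k]


top2 = top_k_ppr(GRAPH, "a", k=2)
print(top2)
```

Because only membership in the top-k list matters, a modest number of walks is often enough to separate clearly important neighbours from distant nodes, even when the individual scores are still noisy.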

    Combining domain knowledge and statistical models in time series analysis

    This paper describes a new approach to time series modeling that combines subject-matter knowledge of the system dynamics with statistical techniques in time series analysis and regression. Applications to American option pricing and the Canadian lynx data are given to illustrate this approach. Comment: Published at http://dx.doi.org/10.1214/074921706000001049 in the IMS Lecture Notes Monograph Series (http://www.imstat.org/publications/lecnotes.htm) by the Institute of Mathematical Statistics (http://www.imstat.org)

    On the variance of the number of occupied boxes

    We consider the occupancy problem where balls are thrown independently at infinitely many boxes with fixed positive frequencies. It is well known that the random number of boxes occupied by the first n balls is asymptotically normal if its variance V_n tends to infinity. In this work, we mainly focus on the opposite case where V_n is bounded, and derive a simple necessary and sufficient condition for convergence of V_n to a finite limit, thus settling a long-standing question raised by Karlin in his seminal paper of 1967. One striking consequence of our result is that the limit, when it exists, can only be a positive integer. Some new conditions for other types of behavior of the variance, such as boundedness or convergence to infinity, are also obtained. The proofs are based on poissonization techniques. Comment: 34 pages
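For readers who want to experiment with V_n numerically, the sketch below evaluates the variance of the number of occupied boxes from the standard indicator decomposition: with q_j = (1 - p_j)^n the probability that box j is empty, V_n is the sum of q_j(1 - q_j) plus twice the sum over i < j of (1 - p_i - p_j)^n - q_i q_j. The geometric frequencies p_j = 2^(-j) and the truncation at 60 boxes are assumptions made only for this illustration; they are not taken from the paper.

```python
def occupancy_variance(n, freqs):
    """Variance of the number of occupied boxes after n independent balls.

    Uses Var(K_n) = sum_j Var(E_j) + 2 * sum_{i<j} Cov(E_i, E_j), where
    E_j indicates that box j is empty (occupied and empty indicators have
    the same variances and covariances).
    """
    q = [(1.0 - p) ** n for p in freqs]
    var = sum(qj * (1.0 - qj) for qj in q)
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            both_empty = (1.0 - freqs[i] - freqs[j]) ** n
            var += 2.0 * (both_empty - q[i] * q[j])
    return var


# Illustrative frequencies (an assumption): p_j = 2^(-j), truncated at j = 60,
# which leaves an unassigned mass of 2^(-60).
FREQS = [2.0 ** (-j) for j in range(1, 61)]

for n in (1, 10, 100, 1000):
    print(n, occupancy_variance(n, FREQS))
```

Printing V_n for increasing n gives a quick way to see whether the variance appears bounded for a chosen frequency sequence, though a numerical table is of course no substitute for the conditions derived in the paper.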

    Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator

    When an unbiased estimator of the likelihood is used within a Metropolis-Hastings chain, it is necessary to trade off the number of Monte Carlo samples used to construct this estimator against the asymptotic variances of averages computed under this chain. Many Monte Carlo samples will typically result in Metropolis-Hastings averages with lower asymptotic variances than the corresponding Metropolis-Hastings averages using fewer samples. However, the computing time required to construct the likelihood estimator increases with the number of Monte Carlo samples. Under the assumption that the distribution of the additive noise introduced by the log-likelihood estimator is Gaussian with variance inversely proportional to the number of Monte Carlo samples and independent of the parameter value at which it is evaluated, we provide guidelines on the number of samples to select. We demonstrate our results by considering a stochastic volatility model applied to stock index returns. Comment: 34 pages, 9 figures, 3 tables
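The setting described in the abstract can be mimicked with a toy pseudo-marginal Metropolis-Hastings sampler. In the sketch below, the exact log-target is perturbed by Gaussian noise with variance `gamma2 / n_samples` and mean minus half that variance, so that the exponentiated estimate is unbiased; this mirrors the stated noise assumption but is not the paper's stochastic volatility model. The standard normal target, the constant `gamma2`, the proposal scale and all function names are assumptions of this illustration.

```python
import math
import random


def log_target(theta):
    """Toy exact log-target (standard normal, up to a constant)."""
    return -0.5 * theta * theta


def noisy_log_target(theta, n_samples, gamma2, rng):
    """Log of an unbiased estimate of the target.

    The additive noise is N(-s2/2, s2) with s2 = gamma2 / n_samples, so
    exp(noise) has expectation 1 and the noise law does not depend on theta.
    """
    s2 = gamma2 / n_samples
    return log_target(theta) + rng.gauss(-0.5 * s2, math.sqrt(s2))


def pseudo_marginal_mh(n_iters, n_samples, gamma2=4.0, step=1.0, seed=0):
    """Random-walk pseudo-marginal MH; returns (chain, acceptance rate)."""
    rng = random.Random(seed)
    theta = 0.0
    log_hat = noisy_log_target(theta, n_samples, gamma2, rng)
    chain, accepted = [], 0
    for _ in range(n_iters):
        proposal = theta + rng.gauss(0.0, step)
        log_hat_prop = noisy_log_target(proposal, n_samples, gamma2, rng)
        # The estimate at the current state is reused, never refreshed.
        accept_prob = math.exp(min(0.0, log_hat_prop - log_hat))
        if rng.random() < accept_prob:
            theta, log_hat = proposal, log_hat_prop
            accepted += 1
        chain.append(theta)
    return chain, accepted / n_iters


for n_samples in (1, 4, 16):
    _, rate = pseudo_marginal_mh(5000, n_samples)
    print(n_samples, round(rate, 3))
```

Running the loop with different values of `n_samples` shows the trade-off in miniature: more samples per estimate reduce the noise in the acceptance ratio, at a cost per iteration that would grow with `n_samples` in a real application.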

    Assessing consistency of fish survey data : uncertainties in the estimation of mackerel icefish (Champsocephalus gunnari) abundance at South Georgia

    Acknowledgments: The authors wish to thank the crews, fishermen and scientists who conducted the various surveys from which data were obtained, and Mark Belchier and Simeon Hill for their contributions. This work was supported by the Government of South Georgia and South Sandwich Islands. Additional logistical support was provided by The South Atlantic Environmental Research Institute, with thanks to Paul Brickle. Thanks to Stephen Smith of Fisheries and Oceans Canada (DFO) for help in constructing bootstrap confidence limits. Paul Fernandes receives funding from the MASTS pooling initiative (The Marine Alliance for Science and Technology for Scotland), and their support is gratefully acknowledged. MASTS is funded by the Scottish Funding Council (grant reference HR09011) and contributing institutions. We also wish to thank two anonymous referees for their helpful suggestions on earlier versions of this manuscript. Peer reviewed. Postprint.