
    A Beta-splitting model for evolutionary trees

    In this article, we construct a generalization of the Blum-François Beta-splitting model for evolutionary trees, itself inspired by Aldous' Beta-splitting model on cladograms. Our approach allows for asymmetric shares of diversification rates (or diversification 'potential') between two sister species in an evolutionarily interpretable manner, and incorporates extinction into the model in a natural way. We describe the incremental evolutionary construction of a tree with n leaves by splitting or freezing extant lineages through the Generating, Organizing and Deleting processes. We then give the probability of any (binary rooted) tree under this model with no extinction, at several resolutions: ranked planar trees, which give asymmetric roles to the first and second offspring species of a given species and keep track of the order of the speciation events occurring during the creation of the tree; unranked planar trees; ranked non-planar trees; and finally (unranked non-planar) trees. We also describe a continuous-time equivalent of the Generating, Organizing and Deleting processes in which tree topology and branch lengths are jointly modelled, and provide code in SageMath/Python for these algorithms.
    Comment: 23 pages, 3 figures, 1 table
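    As a point of reference, the split rule of Aldous' original Beta-splitting model (which this paper generalizes) is easy to sketch: a clade of n leaves is divided into left and right subclades of sizes i and n-i with probability proportional to Γ(β+1+i)Γ(β+1+n-i) / (Γ(i+1)Γ(n-i+1)). The recursive top-down sampler below is a minimal illustration of that classical rule only, not the paper's incremental Generating/Organizing/Deleting construction (the authors provide their own SageMath/Python code).

    ```python
    import math
    import random

    def split_weights(n, beta):
        """Aldous beta-splitting: unnormalized weight of placing i of n
        leaves in the left subtree, for i = 1..n-1."""
        lg = math.lgamma
        return [math.exp(lg(beta + 1 + i) + lg(beta + 1 + n - i)
                         - lg(i + 1) - lg(n - i + 1))
                for i in range(1, n)]

    def beta_splitting_tree(n, beta, rng=random):
        """Recursively grow a random (unranked, non-planar) cladogram
        on n leaves under the classical beta-splitting rule."""
        if n == 1:
            return "leaf"
        i = rng.choices(range(1, n), weights=split_weights(n, beta), k=1)[0]
        return (beta_splitting_tree(i, beta, rng),
                beta_splitting_tree(n - i, beta, rng))

    # beta = 0 recovers the Yule regime; beta = -3/2 the uniform (PDA) model.
    print(beta_splitting_tree(8, beta=0.0))
    ```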

    Scalable Probabilistic Similarity Ranking in Uncertain Databases (Technical Report)

    This paper introduces a scalable approach for probabilistic top-k similarity ranking on uncertain vector data. Each uncertain object is represented by a set of vector instances that are assumed to be mutually exclusive. The objective is to rank the uncertain data according to their distance to a reference object. We propose a framework that incrementally computes, for each object instance and ranking position, the probability of the object falling at that ranking position. The resulting rank probability distribution can serve as input for several state-of-the-art probabilistic ranking models. Existing approaches compute this probability distribution by applying a dynamic programming approach of quadratic complexity. In this paper we show, both theoretically and experimentally, that our framework reduces this to linear-time complexity with the same memory requirements, facilitated by accessing the uncertain vector instances incrementally in increasing order of their distance to the reference object. Furthermore, we show how the output of our method can be used to apply probabilistic top-k ranking for the objects according to different state-of-the-art definitions. We conduct an experimental evaluation on synthetic and real data, which demonstrates the efficiency of our approach.
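    The quadratic-time baseline that the paper improves on is the standard Poisson-binomial recurrence: given, for a fixed instance, the probability that each competing object lies closer to the reference object, the distribution over ranking positions follows by scanning the competitors one at a time. The sketch below illustrates only that baseline; the linear-time incremental bookkeeping is the paper's contribution and is not reproduced here, and `closer_probs` is a hypothetical precomputed input.

    ```python
    def rank_probabilities(closer_probs):
        """Poisson-binomial recurrence. closer_probs[j] is the probability
        that the j-th competing object lies closer to the reference object
        than the instance under consideration. Returns P, where P[k] is the
        probability that exactly k competitors rank ahead of the instance,
        i.e. that it occupies ranking position k + 1."""
        P = [1.0]  # with no competitors scanned, the instance is rank 1
        for p in closer_probs:
            nxt = [0.0] * (len(P) + 1)
            for k, q in enumerate(P):
                nxt[k]     += (1.0 - p) * q  # this competitor ranks behind
                nxt[k + 1] += p * q          # this competitor ranks ahead
            P = nxt
        return P

    # Three competitors closer with probabilities 0.9, 0.3, 0.5:
    print(rank_probabilities([0.9, 0.3, 0.5]))
    ```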

    The Life-Cycle Income Analysis Model (LIAM): a study of a flexible dynamic microsimulation modelling computing framework

    This paper describes a flexible computing framework designed to create a dynamic microsimulation model, the Life-cycle Income Analysis Model (LIAM). The principal computing characteristics include the degree of modularisation, parameterisation, generalisation and robustness. The paper describes the decisions taken with regard to the type of dynamic model used. The LIAM framework has been used to create a number of different microsimulation models, including an Irish dynamic cohort model, a spatial dynamic microsimulation model for Ireland, an indirect tax and consumption model for the EU15 as part of EUROMOD, and a prototype EU dynamic population microsimulation model for five EU countries. Particular consideration is given to issues of parameterisation, alignment and computational efficiency.
    Keywords: flexible; modular; dynamic; alignment; parameterisation; computational efficiency
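    To make the modularisation and parameterisation ideas concrete, the sketch below shows one plausible shape for such a framework: each demographic or economic process is an interchangeable module driven by externally supplied parameters rather than hard-coded values. All names here are hypothetical illustrations, not the LIAM API.

    ```python
    import random
    from dataclasses import dataclass

    @dataclass
    class Person:
        age: int
        employed: bool = False

    class Module:
        """One demographic/economic process; modules are the unit of reuse."""
        def apply(self, population, params, year):
            raise NotImplementedError

    class Ageing(Module):
        def apply(self, population, params, year):
            for p in population:
                p.age += 1

    class Employment(Module):
        def apply(self, population, params, year):
            rate = params["employment_rate"]  # parameterised, not hard-coded
            for p in population:
                p.employed = random.random() < rate

    def simulate(population, modules, params, years):
        """Apply each parameterised module to the population, period by period."""
        for year in years:
            for module in modules:
                module.apply(population, params, year)
        return population

    pop = [Person(age=a) for a in range(20, 40)]
    simulate(pop, [Ageing(), Employment()],
             {"employment_rate": 0.7}, range(2000, 2010))
    ```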

    Gamma-based clustering via ordered means with application to gene-expression analysis

    Discrete mixture models provide a well-known basis for effective clustering algorithms, although technical challenges have limited their scope. In the context of gene-expression data analysis, a model is presented that mixes over a finite catalog of structures, each one representing equality and inequality constraints among latent expected values. Computations depend on the probability that independent gamma-distributed variables attain each of their possible orderings. Each ordering event is equivalent to an event in independent negative-binomial random variables, and this finding guides a dynamic-programming calculation. The structuring of mixture-model components according to constraints among latent means leads to strict concavity of the mixture log likelihood. In addition to its beneficial numerical properties, the clustering method shows promising results in an empirical study.
    Comment: Published at http://dx.doi.org/10.1214/10-AOS805 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
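    The central quantity here, the probability that independent gamma variables fall in a given order, is computed exactly in the paper via the negative-binomial dynamic program. As a plain illustration of what that quantity is, a Monte Carlo estimate makes a handy sanity check; the sampler below is only a sketch, not the paper's algorithm.

    ```python
    import random

    def ordering_probability(shapes, rates, n_samples=200_000, rng=random):
        """Monte Carlo estimate of P(X_1 < X_2 < ... < X_m) for independent
        X_i ~ Gamma(shape_i, rate_i). Note random.gammavariate takes a
        scale parameter, hence the 1/rate conversion."""
        hits = 0
        for _ in range(n_samples):
            xs = [rng.gammavariate(a, 1.0 / r) for a, r in zip(shapes, rates)]
            hits += all(xs[i] < xs[i + 1] for i in range(len(xs) - 1))
        return hits / n_samples

    # Equal shapes with decreasing rates give increasing means, so the
    # ascending ordering should occur well above the uniform 1/3! chance:
    print(ordering_probability([2.0, 2.0, 2.0], [3.0, 2.0, 1.0]))
    ```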

    A Common Protocol for Agent-Based Social Simulation

    Traditional (i.e. analytical) modelling practices in the social sciences rely on a very well established, although implicit, methodological protocol, both with respect to the way models are presented and to the kinds of analysis that are performed. Unfortunately, computer-simulated models often lack such a reference to an accepted methodological standard. This is one of the main reasons for the scepticism among mainstream social scientists, which results in low acceptance of papers with agent-based methodology in the top journals. We identify some methodological pitfalls that, in our view, are common in papers employing agent-based simulations, and propose appropriate solutions. We discuss each issue with reference to a general characterization of dynamic micro models, which encompasses both analytical and simulation models. Along the way, we also clarify some confusing terminology. We then propose a three-stage process that could lead to the establishment of methodological standards in social and economic simulations.
    Keywords: agent-based, simulations, methodology, calibration, validation, sensitivity analysis
