
    Estimation of Distribution Overlap of Urn Models

    A classical problem in statistics is estimating the expected coverage of a sample, which has had applications in gene expression, microbial ecology, optimization, and even numismatics. Here we consider a related extension of this problem to random samples of two discrete distributions. Specifically, we estimate what we call the dissimilarity probability of a sample, i.e., the probability of a draw from one distribution not being observed in k draws from another distribution. We show our estimator of dissimilarity to be a U-statistic and a uniformly minimum variance unbiased estimator of dissimilarity over the largest appropriate range of k. Furthermore, despite the non-Markovian nature of our estimator when applied sequentially over k, we show it converges uniformly in probability to the dissimilarity parameter, and we present criteria under which it is approximately normally distributed and admits a consistent jackknife estimator of its variance. As proof of concept, we analyze V35 16S rRNA data to discern between various microbial environments. Other potential applications concern any situation where dissimilarity of two discrete distributions may be of interest. For instance, in SELEX experiments, each urn could represent a random RNA pool and each draw a possible solution to a particular binding site problem over that pool. The dissimilarity of these pools is then related to the probability of finding binding site solutions in one pool that are absent in the other. Comment: 27 pages, 4 figures
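As a rough illustration of the quantity being estimated (not the paper's estimator), the dissimilarity probability can be computed directly when both distributions are known. The sketch below assumes the draws are independent, so that the probability of a category x drawn from the first urn being absent from k draws of the second is p_x (1 - q_x)^k; the function name and the dictionary representation are hypothetical.

```python
def dissimilarity_probability(p, q, k):
    """Probability that one draw from p is not observed in k draws from q.

    p, q: dicts mapping category -> probability (assumed to each sum to 1).
    k:    number of independent draws taken from q.
    Illustrative only: this evaluates the parameter from known
    distributions; it is not the U-statistic estimator from the abstract.
    """
    # A category missing from q has probability 0 there, so it is never observed.
    return sum(px * (1.0 - q.get(x, 0.0)) ** k for x, px in p.items())


if __name__ == "__main__":
    urn_a = {"a": 0.5, "b": 0.5}
    urn_b = {"a": 1.0}
    # "a" is always seen in urn_b; "b" never is, so half the mass is dissimilar.
    print(dissimilarity_probability(urn_a, urn_b, k=3))
```

In practice the distributions are unknown, which is why an estimator built from the samples themselves is needed.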

    Species distribution models

    Species distribution models are a group of methods often used to estimate consequences of global change, to assess ecological status and for other ecological applications. The main idea behind species distribution models is that the geographical distributions of species can, to a large part, be explained by environmental factors and that species distributions therefore can be predicted in time or space. For robust and reliable applications, models need to be based on sound ecological principles, predictions need to be as accurate as possible, and model uncertainties need to be understood. Two approaches are available for modelling entire species communities: (1) each species can be modelled individually and independently of other species or (2) community information can be incorporated into the models. The first study in this thesis compares these two modelling approaches for predicting phytoplankton assemblages in lakes. The results showed that predictive accuracy was higher when species were modelled individually. The results also showed that phytoplankton can be used for model-based assessment of ecological status. This finding is important because phytoplankton is required for assessing the ecological status of European water bodies according to the European Water Framework Directive. Dispersal barriers in the landscape or limited dispersal ability of species might be a reason for species being absent from suitable habitats, and these factors might therefore affect model accuracy. The second study in this thesis examines the influence of dispersal and the spatial configuration of ecosystems on prediction accuracy of benthic invertebrate and phytoplankton distribution and assemblage composition. The results showed only a minor influence of spatial configuration and no effect of flight ability of invertebrates on model accuracy. 
However, the models used may partly account for dispersal constraints, since dispersal-related factors, such as lake surface area, are included as predictor variables. The results also showed that the composition of littoral invertebrate assemblages was easier to predict at sites located in well-connected lake systems, possibly because the relatively unstable littoral zone requires species to re-colonize disturbed habitats from source populations.

    Bounding the Equilibrium Distribution of Markov Population Models

    Reasoning about the equilibrium distribution of continuous-time Markov chains can be vital for showing properties of the underlying systems. For example, in biological systems, bistability of a chemical reaction network can hint at its function as a biological switch. Unfortunately, the state space of these systems is infinite in most cases, preventing the use of traditional steady state solution techniques. In this paper we develop a new approach to tackle this problem by first retrieving geometric bounds enclosing a major part of the steady state probability mass, followed by a more detailed analysis revealing state-wise bounds. Comment: 4 pages

    Distribution-free specification tests of conditional models

    This article proposes a class of asymptotically distribution-free specification tests for parametric conditional distributions. These tests are based on a martingale transform of a proper sequential empirical process of conditionally transformed data. Standard continuous functionals of this martingale provide omnibus tests while linear combinations of the orthogonal components in its spectral representation form a basis for directional tests. Finally, Neyman-type smooth tests, a compromise between directional and omnibus tests, are discussed. As a special example we study in detail the construction of directional tests for the null hypothesis of conditional normality versus heteroskedastic contiguous alternatives. A small Monte Carlo study shows that our tests attain the nominal level already for small sample sizes.

    Time series models with an EGB2 conditional distribution

    A time series model in which the signal is buried in noise that is non-Gaussian may throw up observations that, when judged by the Gaussian yardstick, are outliers. We describe an observation-driven model, based on an exponential generalized beta distribution of the second kind (EGB2), in which the signal is a linear function of past values of the score of the conditional distribution. This specification produces a model that is not only easy to implement, but which also facilitates the development of a comprehensive and relatively straightforward theory for the asymptotic distribution of the maximum likelihood estimator. The model is fitted to US macroeconomic time series and compared with Gaussian and Student-t models. A theory is then developed for an EGARCH model based on the EGB2 distribution and the model is fitted to exchange rate data. Finally, dynamic location and scale models are combined and applied to data on the UK rate of inflation.

    Models for Light-Cone Meson Distribution Amplitudes

    Leading-twist distribution amplitudes (DAs) of light mesons like pi, rho, etc. describe the leading nonperturbative hadronic contributions to exclusive QCD reactions at large energy transfer, for instance electromagnetic form factors. They also enter B decay amplitudes described in QCD factorisation, in particular nonleptonic two-body decays. Being nonperturbative quantities, DAs cannot be calculated from first principles, but have to be described by models. Most models for DAs rely on a fixed order conformal expansion, which is strictly valid for large factorisation scales, but not always sufficient in phenomenological applications. We derive models for DAs that are valid to all orders in the conformal expansion and characterised by a small number of parameters which are related to experimental observables. Comment: 19 pages, 10 figures

    Systematic comparison of trip distribution laws and models

    Trip distribution laws are basic for the travel demand characterization needed in transport and urban planning. Several approaches have been considered in recent years. One of them is the so-called gravity law, in which the number of trips is assumed to be related to the population at origin and destination and to decrease with the distance. The mathematical expression of this law resembles Newton's law of gravity, which explains its name. Another popular approach is inspired by the theory of intervening opportunities, which argues that the distance has no effect on the destination choice, playing only the role of a surrogate for the number of intervening opportunities between them. In this paper, we perform a thorough comparison between these two approaches in their ability to estimate commuting flows by testing them against empirical trip data at different scales and coming from different countries. Different versions of the gravity and the intervening opportunities laws, including the recently proposed radiation law, are used to estimate the probability that an individual has to commute from one unit to another, called the trip distribution law. Based on these probability distribution laws, the commuting networks are simulated with different trip distribution models. We show that the gravity law performs better than the intervening opportunities laws at estimating the commuting flows, preserving the structure of the network and fitting the commuting distance distribution, although it fails at predicting commuting flows at large distances. Finally, we show that the different approaches can be used in the absence of detailed data for calibration since their only parameter depends only on the scale of the geographic unit. Comment: 15 pages, 10 figures
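To make the gravity law concrete, here is a minimal sketch of one common normalized form, in which the probability of commuting from an origin to a destination is proportional to the destination population times a power-law decay of distance. The power-law deterrence function, the exponent `beta`, and all names below are assumptions chosen for illustration; the abstract does not specify which versions of the law the paper actually uses.

```python
def gravity_trip_probabilities(populations, distances, origin, beta=2.0):
    """Gravity-law probabilities of commuting from `origin` to each other unit.

    populations: dict mapping unit -> population.
    distances:   dict mapping (origin, destination) -> positive distance.
    beta:        decay exponent (hypothetical default, not from the paper).
    Assumes p(origin -> j) is proportional to population[j] * d(origin, j) ** -beta.
    """
    weights = {}
    for dest, pop in populations.items():
        if dest == origin:
            continue  # trips within the origin unit are ignored in this sketch
        weights[dest] = pop * distances[(origin, dest)] ** (-beta)
    total = sum(weights.values())
    # Normalize so the probabilities over destinations sum to one.
    return {dest: w / total for dest, w in weights.items()}


if __name__ == "__main__":
    pops = {"A": 100, "B": 100, "C": 100}
    dists = {("A", "B"): 1.0, ("A", "C"): 2.0}
    print(gravity_trip_probabilities(pops, dists, origin="A"))
```

With equal populations and `beta=2.0`, a destination twice as far away receives a quarter of the weight, which shows how trips decrease with distance under this form.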

    Statistical distribution of components of energy eigenfunctions: from nearly-integrable to chaotic

    We study the statistical distribution of components in the non-perturbative parts of energy eigenfunctions (EFs), in which the main bodies of the EFs lie. Our numerical simulations in five models show that the deviation of the distribution from the prediction of random matrix theory (RMT) is useful in characterizing the transition from nearly-integrable to chaotic, in a way somewhat similar to the nearest-level-spacing distribution. However, the statistics of EFs reveal some further properties, as described below. (i) In the process of approaching quantum chaos, the distribution of components shows a delay feature compared with the nearest-level-spacing distribution in most of the models studied. (ii) In the quantum chaotic regime, the distribution of components always shows a small but notable deviation from the prediction of RMT in models possessing classical counterparts, while the deviation can be almost negligible in models not possessing classical counterparts. (iii) In models whose Hamiltonian matrices possess a clear band structure, tails of EFs show statistical behaviors clearly different from those in the main bodies, while the difference is smaller for Hamiltonian matrices without a clear band structure. Comment: 10 pages, 10 figures