Estimation of Distribution Overlap of Urn Models
A classical problem in statistics is estimating the expected coverage of a
sample, which has had applications in gene expression, microbial ecology,
optimization, and even numismatics. Here we consider a related extension of
this problem to random samples of two discrete distributions. Specifically, we
estimate what we call the dissimilarity probability of a sample, i.e., the
probability of a draw from one distribution not being observed in k draws from
another distribution. We show our estimator of dissimilarity to be a
U-statistic and a uniformly minimum variance unbiased estimator of
dissimilarity over the largest appropriate range of k. Furthermore, despite the
non-Markovian nature of our estimator when applied sequentially over k, we show
it converges uniformly in probability to the dissimilarity parameter, and we
present criteria when it is approximately normally distributed and admits a
consistent jackknife estimator of its variance. As proof of concept, we analyze
V35 16S rRNA data to discern between various microbial environments. Other
potential applications concern any situation where dissimilarity of two
discrete distributions may be of interest. For instance, in SELEX experiments,
each urn could represent a random RNA pool and each draw a possible solution to
a particular binding site problem over that pool. The dissimilarity of these
pools is then related to the probability of finding binding site solutions in
one pool that are absent in the other. Comment: 27 pages, 4 figures
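The dissimilarity probability described in the abstract above can be illustrated with a small sketch. It assumes both distributions are fully known and that the k draws are independent with replacement, so the quantity reduces to the sum over outcomes x of p_x (1 - q_x)^k. The function name `dissimilarity` and the dictionary representation are hypothetical choices for illustration; this computes the parameter itself, not the authors' estimator from samples.

```python
def dissimilarity(p, q, k):
    """Probability that one draw from p is NOT seen in k independent draws from q.

    p, q: dicts mapping outcome -> probability (assumed to each sum to 1).
    k:    number of draws from q (non-negative integer).
    Illustrative only: assumes the true distributions are known.
    """
    total = 0.0
    for outcome, p_x in p.items():
        q_x = q.get(outcome, 0.0)          # outcomes absent from q have probability 0
        total += p_x * (1.0 - q_x) ** k    # P(draw x from p) * P(x missed k times in q)
    return total


# Example: outcome "b" never appears in q, so it always counts as dissimilar.
p = {"a": 0.5, "b": 0.5}
q = {"a": 1.0}
print(dissimilarity(p, q, 3))
```

With k = 0 the function returns 1, since nothing from the second distribution has been observed yet.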
Species distribution models
Species distribution models are a group of methods often used to estimate
consequences of global change, to assess ecological status and for other ecological
applications. The main idea behind species distribution models is that the
geographical distributions of species can, to a large part, be explained by
environmental factors and that species distributions therefore can be predicted in
time or space. For robust and reliable applications, models need to be based on
sound ecological principles, predictions need to be as accurate as possible, and
model uncertainties need to be understood.
Two approaches are available for modelling entire species communities: (1) each
species can be modelled individually and independently of other species or (2)
community information can be incorporated into the models. The first study in this
thesis compares these two modelling approaches for predicting phytoplankton
assemblages in lakes. The results showed that predictive accuracy was higher when
species were modelled individually. The results also showed that phytoplankton can
be used for model-based assessment of ecological status. This finding is important
because phytoplankton is required for assessing the ecological status of European
water bodies according to the European Water Framework Directive.
Dispersal barriers in the landscape or limited dispersal ability of species might be a
reason for species being absent from suitable habitats, and these factors might
therefore affect model accuracy. The second study in this thesis examines the
influence of dispersal and the spatial configuration of ecosystems on prediction
accuracy of benthic invertebrate and phytoplankton distribution and assemblage
composition. The results showed only a minor influence of spatial configuration and
no effect of flight ability of invertebrates on model accuracy. However, the models
used may partly account for dispersal constraints, since dispersal-related factors, such
as lake surface area, are included as predictor variables. The results also showed that
composition of littoral invertebrate assemblages was easier to predict at sites located
in well-connected lake systems, possibly because the relatively unstable littoral zone
requires species to re-colonize disturbed habitats from source
populations.
Bounding the Equilibrium Distribution of Markov Population Models
Arguing about the equilibrium distribution of continuous-time Markov chains
can be vital for showing properties about the underlying systems. For example
in biological systems, bistability of a chemical reaction network can hint at
its function as a biological switch. Unfortunately, the state space of these
systems is infinite in most cases, preventing the use of traditional steady
state solution techniques. In this paper we develop a new approach to tackle
this problem by first retrieving geometric bounds enclosing a major part of the
steady state probability mass, followed by a more detailed analysis revealing
state-wise bounds. Comment: 4 pages
Distribution-free specification tests of conditional models
This article proposes a class of asymptotically distribution-free specification tests for parametric conditional distributions. These tests are based on a martingale transform of a proper sequential empirical process of conditionally transformed data. Standard continuous functionals of this martingale provide omnibus tests while linear combinations of the orthogonal components in its spectral representation form a basis for directional tests. Finally, Neyman-type smooth tests, a compromise between directional and omnibus tests, are discussed. As a special example we study in detail the construction of directional tests for the null hypothesis of conditional normality versus heteroskedastic contiguous alternatives. A small Monte Carlo study shows that our tests attain the nominal level already for small sample sizes.
Time series models with an EGB2 conditional distribution
A time series model in which the signal is buried in noise that is non-Gaussian may throw up observations that, when judged by the Gaussian yardstick, are outliers. We describe an observation-driven model, based on an exponential generalized beta distribution of the second kind (EGB2), in which the signal is a linear function of past values of the score of the conditional distribution. This specification produces a model that is not only easy to implement, but which also facilitates the development of a comprehensive and relatively straightforward theory for the asymptotic distribution of the maximum likelihood estimator. The model is fitted to US macroeconomic time series and compared with Gaussian and Student-t models. A theory is then developed for an EGARCH model based on the EGB2 distribution and the model is fitted to exchange rate data. Finally, dynamic location and scale models are combined and applied to data on the UK rate of inflation.
Models for Light-Cone Meson Distribution Amplitudes
Leading-twist distribution amplitudes (DAs) of light mesons like pi, rho, etc.
describe the leading nonperturbative hadronic contributions to exclusive QCD
reactions at large energy transfer, for instance electromagnetic form factors.
They also enter B decay amplitudes described in QCD factorisation, in
particular nonleptonic two-body decays. Being nonperturbative quantities, DAs
cannot be calculated from first principles, but have to be described by models.
Most models for DAs rely on a fixed order conformal expansion, which is
strictly valid for large factorisation scales, but not always sufficient in
phenomenological applications. We derive models for DAs that are valid to all
orders in the conformal expansion and characterised by a small number of
parameters which are related to experimental observables. Comment: 19 pages, 10 figures
Systematic comparison of trip distribution laws and models
Trip distribution laws are fundamental to the travel demand characterization
needed in transport and urban planning. Several approaches have been considered
in recent years. One of them is the so-called gravity law, in which the
number of trips is assumed to be related to the population at origin and
destination and to decrease with the distance. The mathematical expression of
this law resembles Newton's law of gravity, which explains its name. Another
popular approach is inspired by the theory of intervening opportunities, which
argues that the distance has no effect on the destination choice, playing only
the role of a surrogate for the number of intervening opportunities between
them. In this paper, we perform a thorough comparison between these two
approaches in their ability to estimate commuting flows by testing them
against empirical trip data at different scales and coming from different
countries. Different versions of the gravity and the intervening opportunities
laws, including the recently proposed radiation law, are used to estimate the
probability that an individual commutes from one unit to another, called the
trip distribution law. Based on these probability distribution laws, the
commuting networks are simulated with different trip distribution models. We
show that the gravity law performs better than the intervening opportunities
laws at estimating the commuting flows, preserving the structure of the network,
and fitting the commuting distance distribution, although it fails at predicting
commuting flows at large distances. Finally, we show that the different
approaches can be used in the absence of detailed data for calibration since
their single parameter depends only on the scale of the geographic unit. Comment: 15 pages, 10 figures
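The gravity law described in the abstract above can be sketched as follows. The abstract states only that trips are related to the populations at origin and destination and decrease with distance; the power-law deterrence function d^(-beta), the function name `gravity_probabilities`, and the parameter `beta` are assumptions made for illustration, not the specific form used in the paper.

```python
def gravity_probabilities(populations, distances, beta=2.0):
    """Normalized trip probabilities under a simple gravity law.

    populations: list of unit populations.
    distances:   square matrix (list of lists) of distances between units.
    beta:        exponent of the assumed power-law distance decay.
    Weight for i -> j (i != j) is pop_i * pop_j / d_ij ** beta.
    """
    n = len(populations)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:  # no trips from a unit to itself in this sketch
                weights[i][j] = (
                    populations[i] * populations[j] / distances[i][j] ** beta
                )
    total = sum(sum(row) for row in weights)
    # Normalize so that all origin-destination probabilities sum to 1.
    return [[w / total for w in row] for row in weights]


# Three units on a line, one distance unit apart.
pops = [10, 20, 30]
dists = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
probs = gravity_probabilities(pops, dists, beta=1.0)
```

In this toy setup the pair of larger, closer units (indices 1 and 2) receives a higher probability than the smaller, more distant pair (indices 0 and 2).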
Statistical distribution of components of energy eigenfunctions: from nearly-integrable to chaotic
We study the statistical distribution of components in the non-perturbative
parts of energy eigenfunctions (EFs), in which main bodies of the EFs lie. Our
numerical simulations in five models show that deviation of the distribution
from the prediction of random matrix theory (RMT) is useful in characterizing
the process from nearly-integrable to chaotic, in a way somewhat similar to the
nearest-level-spacing distribution. However, the statistics of EFs reveal some
further properties, as described below. (i) In the process of approaching quantum
chaos, the distribution of components shows a delay feature compared with the
nearest-level-spacing distribution in most of the models studied. (ii) In the
quantum chaotic regime, the distribution of components always shows small but
notable deviation from the prediction of RMT in models possessing classical
counterparts, while the deviation can be almost negligible in models not
possessing classical counterparts. (iii) In models whose Hamiltonian matrices
possess a clear band structure, tails of EFs show statistical behaviors
obviously different from those in the main bodies, while the difference is
smaller for Hamiltonian matrices without a clear band structure. Comment: 10 pages, 10 figures
