1,995 research outputs found
Discrete probability models to assess spatial distribution patterns in natural populations and an algorithm for likelihood ratio goodness of fit test
Population spatial distribution analysis allows environmental researchers to describe and understand how individuals (study subjects) grow and interact in a given study site. This information can be used in numerous applications, from classical ecology, pest management, sample design optimization, and particle dispersion patterns to epidemiology and public health. Discrete probability models (Poisson, binomial and negative binomial) are used to assess the three principal spatial patterns (random, uniform and aggregated distributions, respectively). In this paper a MATLAB algorithm is presented to perform spatial pattern analysis through the evaluation of probability models. The likelihood ratio goodness-of-fit test (G-test) was used to test for agreement between observed and expected density data for the three probability distributions, and two sets of random count data (m = 100 and 2229) were simulated for the three probability distributions in order to test the algorithm. Results showed that the algorithm was sensitive in assessing agreement for randomly generated counts for the three discrete probability models, but less so for the contagious distribution when m = 2229 (p > 0.05 for Poisson and binomial models, and p < 0.05 for the negative binomial model in both cases). The likelihood ratio test reported a significant difference from the negative binomial when in fact it was the population distribution for m = 2229, although graphical distribution analysis showed agreement between observed and expected negative binomial counts.
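To make the G-test idea in this abstract concrete, here is a minimal Python sketch for the Poisson (random pattern) case only. It is not the paper's MATLAB algorithm: the function names, the pooling of the upper tail into one class, and the degrees-of-freedom convention (classes minus one, minus one estimated parameter) are assumptions made for illustration, and the data are assumed to contain at least one non-zero count.

```python
import math
from collections import Counter

def poisson_pmf(k, lam):
    """Poisson probability of exactly k individuals in one sampling unit."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def g_test_poisson(counts):
    """Return (G, df) for the fit of per-unit counts to a Poisson model.

    Assumptions of this sketch: the Poisson mean is estimated from the
    data, and everything at or above the largest observed count is pooled
    into a single tail class so that expected frequencies sum to n.
    """
    n = len(counts)
    lam = sum(counts) / n          # sample mean as the Poisson parameter
    observed = Counter(counts)
    k_max = max(counts)

    # Expected frequencies for classes 0..k_max-1, then the pooled tail.
    expected = [n * poisson_pmf(k, lam) for k in range(k_max)]
    expected.append(n - sum(expected))

    # G = 2 * sum(O * ln(O / E)); empty classes contribute nothing.
    g = 0.0
    for k, e in enumerate(expected):
        o = observed.get(k, 0)
        if o > 0:
            g += o * math.log(o / e)
    g *= 2.0

    df = len(expected) - 2         # classes - 1 - one estimated parameter
    return g, df
```

The returned statistic would be compared with a chi-square distribution on `df` degrees of freedom; a large G relative to that reference suggests the counts depart from a random (Poisson) pattern.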
The statistics of multi-planet systems
We describe statistical methods for measuring the exoplanet multiplicity
function - the fraction of host stars containing a given number of planets -
from transit and radial-velocity surveys. The analysis is based on the
approximation of separability - that the distribution of planetary parameters
in an n-planet system is the product of identical 1-planet distributions. We
review the evidence that separability is a valid approximation for exoplanets.
We show how to relate the observable multiplicity function in surveys with
similar host-star populations but different sensitivities. We also show how to
correct for geometrical selection effects to derive the multiplicity function
from transit surveys if the distribution of relative inclinations is known.
Applying these tools to the Kepler transit survey and radial-velocity surveys,
we find that (i) the Kepler data alone do not constrain the mean inclination of
multi-planet systems; even spherical distributions are allowed by the data but
only if a small fraction of host stars contain large planet populations (> 30);
(ii) comparing the Kepler and radial-velocity surveys shows that the mean
inclination of multi-planet systems lies in the range 0-5 degrees; (iii) the
multiplicity function of the Kepler planets is not well-determined by the
present data. Comment: 34 pages, 10 figures
Robustness in sparse linear models: relative efficiency based on robust approximate message passing
Understanding efficiency in high dimensional linear models is a longstanding
problem of interest. Classical work with smaller dimensional problems dating
back to Huber and Bickel has illustrated the benefits of efficient loss
functions. When the number of parameters p is of the same order as the sample
size n, p ≈ n, an efficiency pattern different from the one of Huber
was recently established. In this work, we consider the effects of model
selection on the estimation efficiency of penalized methods. In particular, we
explore whether sparsity results in new efficiency patterns when p > n. In
the interest of deriving the asymptotic mean squared error for regularized
M-estimators, we use the powerful framework of approximate message passing. We
propose a novel, robust and sparse approximate message passing algorithm
(RAMP), that is adaptive to the error distribution. Our algorithm includes many
non-quadratic and non-differentiable loss functions. We derive its asymptotic
mean squared error and show its convergence, while allowing p, n, s → ∞, with n/p in (0,1) and n/s in (1,∞). We identify new
patterns of relative efficiency regarding a number of penalized estimators,
when p is much larger than n. We show that the classical information bound
is no longer reachable, even for light-tailed error distributions. We show
that the penalized least absolute deviation estimator dominates the penalized
least squares estimator in cases of heavy-tailed distributions. We observe
this pattern for all choices of the number of non-zero parameters s, both s ≤ n and s ≈ n. In non-penalized problems where s = p ≈ n,
the opposite regime holds. Therefore, we discover that the presence of model
selection significantly changes the efficiency patterns. Comment: 49 pages, 10 figures
Discrete generalized half-normal distribution and its applications in quantile regression
A new discrete two-parameter distribution is introduced by discretizing a generalized half-normal distribution. The model is useful for fitting overdispersed as well as underdispersed data. The failure function can be decreasing, bathtub shaped or increasing. A reparameterization of the distribution is introduced for use in a regression model based on the median. The behaviour of the maximum likelihood estimates is studied numerically, showing good performance in finite samples. Applications to three real data sets show that the new model can provide a better explanation than some competing models.
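As an illustration of what discretizing a continuous distribution can look like, the following Python sketch builds a probability mass function from differences of a continuous CDF. Two things are assumed here and may differ from the paper: the generalized half-normal CDF is taken in the form F(x) = 2Φ((x/θ)^α) − 1, and the discretization rule is P(X = k) = F(k + 1) − F(k). The function names and the truncation point `k_max` are hypothetical.

```python
import math

def ghn_cdf(x, alpha, theta):
    """Assumed generalized half-normal CDF: 2*Phi((x/theta)**alpha) - 1."""
    if x <= 0:
        return 0.0
    # 2*Phi(z) - 1 equals erf(z / sqrt(2)) for the standard normal CDF Phi.
    return math.erf((x / theta) ** alpha / math.sqrt(2.0))

def discrete_ghn_pmf(k, alpha, theta):
    """P(X = k) for k = 0, 1, 2, ... via differences of the continuous CDF."""
    return ghn_cdf(k + 1, alpha, theta) - ghn_cdf(k, alpha, theta)

def dispersion_index(alpha, theta, k_max=2000):
    """Variance-to-mean ratio, summing the pmf over 0..k_max-1.

    k_max is a truncation point chosen for this sketch; it must be large
    enough that the neglected tail mass is negligible.
    """
    probs = [discrete_ghn_pmf(k, alpha, theta) for k in range(k_max)]
    mean = sum(k * p for k, p in enumerate(probs))
    second = sum(k * k * p for k, p in enumerate(probs))
    return (second - mean ** 2) / mean
```

Under these assumptions, a small shape value such as `dispersion_index(0.5, 2.0)` gives a ratio above 1 (overdispersion), while a large one such as `dispersion_index(5.0, 10.0)` gives a ratio below 1 (underdispersion), in line with the flexibility the abstract describes.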
Optimal Experimental Design for Partially Observable Pure Birth Processes
We develop an efficient algorithm to find optimal observation times by
maximizing the Fisher information for the birth rate of a partially observable
pure birth process involving n observations. Partially observable implies
that at each of the n observation time points for counting the number of
individuals present in the pure birth process, each individual is observed
independently with a fixed probability p, modeling detection difficulties or
constraints on resources. We apply concepts and techniques from generating
functions, using a combination of symbolic and numeric computation, to
establish a recursion for evaluating and optimizing the Fisher information. Our
numerical results reveal the efficacy of this new method. An implementation of
the algorithm is publicly available.
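The observation model described in this abstract can be sketched with a short simulation. The Python code below is not the paper's algorithm and does not compute or optimize the Fisher information; it only generates data of the kind the abstract describes. It assumes a linear (simple) birth process in which each individual gives birth at the same rate, and the function name, parameter names, and initial population size are hypothetical.

```python
import random

def simulate_partial_observations(birth_rate, detect_prob, obs_times,
                                  n0=1, seed=None):
    """Simulate a linear pure birth process with binomial detection.

    Returns (true_sizes, observed): the actual population size and the
    number of individuals detected at each observation time.
    Assumes obs_times are non-negative and n0 >= 1.
    """
    rng = random.Random(seed)
    t, pop = 0.0, n0
    true_sizes, observed = [], []

    for t_obs in sorted(obs_times):
        # Advance birth by birth; with pop individuals the waiting time to
        # the next birth is exponential with rate birth_rate * pop.
        while True:
            wait = rng.expovariate(birth_rate * pop)
            if t + wait > t_obs:
                break
            t += wait
            pop += 1
        # Memorylessness lets the clock restart at the observation time.
        t = t_obs

        true_sizes.append(pop)
        # Each individual is detected independently with prob detect_prob.
        observed.append(sum(rng.random() < detect_prob for _ in range(pop)))

    return true_sizes, observed
```

A call such as `simulate_partial_observations(1.0, 0.5, [0.5, 1.0, 2.0], n0=5, seed=1)` returns the hidden population sizes alongside the thinned counts an observer would record, which is the kind of data from which the birth rate would be estimated.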