1,995 research outputs found

    Discrete probability models to assess spatial distribution patterns in natural populations and an algorithm for likelihood ratio goodness of fit test

    Population spatial distribution analysis allows environmental researchers to describe and understand how individuals (study subjects) grow and interact in a given study site. This information can be used in numerous applications, from classical ecology, pest management, sample design optimization, and particle dispersion patterns to epidemiology and public health. Discrete probability models (Poisson, binomial, and negative binomial) are used to assess the three principal spatial patterns (random, uniform, and aggregated distributions, respectively). In this paper, a MATLAB algorithm is presented to perform spatial pattern analysis through the evaluation of probability models. The likelihood ratio goodness-of-fit test (G-test) was used to test for agreement between observed and expected density data for the three probability distributions, and two sets of random count data (m = 100 and 2229) were simulated for the three probability distributions in order to test the algorithm. Results showed that the algorithm was sensitive in assessing agreement for randomly generated counts under the three discrete probability models, but less so for the contagious distribution when m = 2229 (p > 0.05 for the Poisson and binomial models, and p < 0.05 for the negative binomial model in both cases). The likelihood ratio test reported a significant difference from the negative binomial when in fact it was the population distribution for m = 2229, although graphical distribution analysis showed agreement between observed and expected negative binomial counts.
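
The G statistic mentioned in this abstract can be illustrated with a minimal Python sketch. This is not the authors' MATLAB algorithm: it covers only the Poisson (random pattern) case, the function name `poisson_g_test` is hypothetical, and the degrees-of-freedom convention (classes minus one, minus one estimated parameter) and the lumping of the upper tail into the last class are assumptions. Pooling of sparse classes, which is often recommended in practice, is omitted.

```python
import math
from collections import Counter


def poisson_g_test(counts):
    """Return (G, df) for a likelihood-ratio fit of quadrat counts to a Poisson model.

    Assumes at least one nonzero count, so the estimated mean is positive.
    """
    m = len(counts)
    mean = sum(counts) / m          # maximum likelihood estimate of the Poisson mean
    observed = Counter(counts)      # observed frequency of each count value
    max_k = max(counts)

    g = 0.0
    tail_prob = 1.0
    for k in range(max_k + 1):
        if k < max_k:
            prob = math.exp(-mean) * mean ** k / math.factorial(k)
            tail_prob -= prob
        else:
            # Assumption: lump P(X >= max_k) into the last class so that
            # expected frequencies sum to m.
            prob = tail_prob
        expected = max(m * prob, 1e-12)   # guard against floating-point underflow
        obs = observed.get(k, 0)
        if obs > 0:                       # empty classes contribute nothing to G
            g += 2.0 * obs * math.log(obs / expected)

    df = (max_k + 1) - 1 - 1        # classes - 1 - one estimated parameter
    return g, df


# Hypothetical data: counts per quadrat for a roughly random and a clumped pattern.
g_random, df_random = poisson_g_test([0, 1, 1, 2, 0, 3, 1, 2, 1, 0])
g_clumped, df_clumped = poisson_g_test([0, 0, 0, 0, 0, 0, 0, 0, 5, 6])
```

G is then compared against a chi-square distribution with `df` degrees of freedom. A large G, as for the clumped data, indicates departure from the Poisson model.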

    The statistics of multi-planet systems

    We describe statistical methods for measuring the exoplanet multiplicity function - the fraction of host stars containing a given number of planets - from transit and radial-velocity surveys. The analysis is based on the approximation of separability - that the distribution of planetary parameters in an n-planet system is the product of identical 1-planet distributions. We review the evidence that separability is a valid approximation for exoplanets. We show how to relate the observable multiplicity function in surveys with similar host-star populations but different sensitivities. We also show how to correct for geometrical selection effects to derive the multiplicity function from transit surveys if the distribution of relative inclinations is known. Applying these tools to the Kepler transit survey and radial-velocity surveys, we find that (i) the Kepler data alone do not constrain the mean inclination of multi-planet systems; even spherical distributions are allowed by the data but only if a small fraction of host stars contain large planet populations (> 30); (ii) comparing the Kepler and radial-velocity surveys shows that the mean inclination of multi-planet systems lies in the range 0-5 degrees; (iii) the multiplicity function of the Kepler planets is not well-determined by the present data. Comment: 34 pages, 10 figures

    Robustness in sparse linear models: relative efficiency based on robust approximate message passing

    Understanding efficiency in high dimensional linear models is a longstanding problem of interest. Classical work with smaller dimensional problems dating back to Huber and Bickel has illustrated the benefits of efficient loss functions. When the number of parameters $p$ is of the same order as the sample size $n$, $p \approx n$, an efficiency pattern different from the one of Huber was recently established. In this work, we consider the effects of model selection on the estimation efficiency of penalized methods. In particular, we explore whether sparsity results in new efficiency patterns when $p > n$. In the interest of deriving the asymptotic mean squared error for regularized M-estimators, we use the powerful framework of approximate message passing. We propose a novel, robust and sparse approximate message passing algorithm (RAMP) that is adaptive to the error distribution. Our algorithm includes many non-quadratic and non-differentiable loss functions. We derive its asymptotic mean squared error and show its convergence, while allowing $p, n, s \to \infty$, with $n/p \in (0,1)$ and $n/s \in (1,\infty)$. We identify new patterns of relative efficiency regarding a number of penalized $M$-estimators, when $p$ is much larger than $n$. We show that the classical information bound is no longer reachable, even for light-tailed error distributions. We show that the penalized least absolute deviation estimator dominates the penalized least squares estimator in cases of heavy-tailed distributions. We observe this pattern for all choices of the number of non-zero parameters $s$, both $s \leq n$ and $s \approx n$. In non-penalized problems where $s = p \approx n$, the opposite regime holds. Therefore, we discover that the presence of model selection significantly changes the efficiency patterns. Comment: 49 pages, 10 figures

    Discrete generalized half-normal distribution and its applications in quantile regression

    A new discrete two-parameter distribution is introduced by discretizing a generalized half-normal distribution. The model is useful for fitting overdispersed as well as underdispersed data. The failure function can be decreasing, bathtub shaped or increasing. A reparameterization of the distribution is introduced for use in a regression model based on the median. The behaviour of the maximum likelihood estimates is studied numerically, showing good performance in finite samples. Applications to three real data sets show that the new model can fit the data better than some competing models.
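
One common way to discretize a continuous lifetime distribution is to assign each integer k the probability mass that the continuous variable places on [k, k+1). The Python sketch below illustrates that construction under stated assumptions: the generalized half-normal CDF is taken to be F(x) = 2Φ((x/θ)^α) − 1 for x > 0, and the names `ghn_cdf` and `discrete_ghn_pmf` are hypothetical. The abstract does not state the parameterization or the discretization scheme, so the paper's own definitions may differ.

```python
import math


def ghn_cdf(x, alpha, theta):
    """Assumed generalized half-normal CDF: 2*Phi((x/theta)**alpha) - 1 for x > 0."""
    if x <= 0:
        return 0.0
    # 2*Phi(z) - 1 equals erf(z / sqrt(2)) for the standard normal CDF Phi.
    return math.erf((x / theta) ** alpha / math.sqrt(2.0))


def discrete_ghn_pmf(k, alpha, theta):
    """P(X = k) = F(k + 1) - F(k) for k = 0, 1, 2, ... (assumed construction)."""
    return ghn_cdf(k + 1, alpha, theta) - ghn_cdf(k, alpha, theta)


# Illustrative parameter values, not taken from the paper.
pmf_values = [discrete_ghn_pmf(k, alpha=1.5, theta=3.0) for k in range(200)]
total_mass = sum(pmf_values)
```

Because the sum telescopes to F(200) − F(0), `total_mass` should be numerically indistinguishable from 1 for these parameter values.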

    Optimal Experimental Design for Partially Observable Pure Birth Processes

    We develop an efficient algorithm to find optimal observation times by maximizing the Fisher information for the birth rate of a partially observable pure birth process involving $n$ observations. Partially observable implies that at each of the $n$ observation time points for counting the number of individuals present in the pure birth process, each individual is observed independently with a fixed probability $p$, modeling detection difficulties or constraints on resources. We apply concepts and techniques from generating functions, using a combination of symbolic and numeric computation, to establish a recursion for evaluating and optimizing the Fisher information. Our numerical results reveal the efficacy of this new method. An implementation of the algorithm is available publicly.
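
The observation model described in this abstract can be illustrated with a short simulation. The Python sketch below is not the authors' algorithm and does not compute or optimize the Fisher information. It only generates partially observed counts, assuming a linear (Yule) pure birth process in which each individual gives birth at a constant rate. The function name `simulate_observed_counts` and all parameter values are hypothetical.

```python
import random


def simulate_observed_counts(birth_rate, detect_prob, obs_times, n0=1, rng=None):
    """Simulate detected counts of a Yule process at the given observation times.

    Assumption: each of the current individuals gives birth at rate birth_rate,
    so the waiting time to the next birth is exponential with rate
    birth_rate * size.
    """
    rng = rng or random.Random()
    t, size = 0.0, n0
    observed = []
    for t_obs in sorted(obs_times):
        # Advance the true population up to the observation time.
        while True:
            wait = rng.expovariate(birth_rate * size)
            if t + wait > t_obs:
                break
            t += wait
            size += 1
        # Exponential waiting times are memoryless, so restarting the clock
        # at the observation time does not change the process law.
        t = t_obs
        # Each individual is detected independently with probability detect_prob.
        detected = sum(1 for _ in range(size) if rng.random() < detect_prob)
        observed.append(detected)
    return observed


rng = random.Random(0)
partial = simulate_observed_counts(0.5, 0.7, [1.0, 2.0, 3.0], rng=rng)
full = simulate_observed_counts(0.5, 1.0, [1.0, 2.0, 3.0], rng=rng)
none = simulate_observed_counts(0.5, 0.0, [1.0, 2.0, 3.0], rng=rng)
```

With `detect_prob=1.0` the observed counts equal the true population sizes and so never decrease. With smaller detection probabilities, consecutive observed counts can drop even though the underlying process only grows.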