632 research outputs found
Differential Evolution Markov Chain with snooker updater and fewer chains
Differential Evolution Markov Chain (DE-MC) is an adaptive MCMC algorithm in which multiple chains are run in parallel. Standard DE-MC requires at least N=2d chains to be run in parallel, where d is the dimensionality of the posterior. This paper extends DE-MC with a snooker updater and shows by simulation and real examples that DE-MC can work for d up to 50–100 with fewer parallel chains (e.g. N=3) by exploiting information from the chains' past: jumps are generated from differences of pairs of past states. This approach extends the practical applicability of DE-MC and is shown to be about 5–26 times more efficient than the optimal Normal random walk Metropolis sampler for the 97.5% point of a variable from a 25–50 dimensional Student t_3 distribution. In a nonlinear mixed effects model example the approach outperformed a block-updater geared to the specific features of the model.
Genetic algorithms and Markov Chain Monte Carlo: Differential Evolution Markov Chain makes Bayesian computing easy
Differential Evolution (DE) is a simple genetic algorithm for numerical optimization in real parameter spaces. In a statistical context one would not just want the optimum but also its uncertainty. The uncertainty distribution can be obtained by a Bayesian analysis (after specifying prior and likelihood) using Markov Chain Monte Carlo (MCMC) simulation. In this paper the essential ideas of DE and MCMC are integrated into Differential Evolution Markov Chain (DE-MC). DE-MC is a population MCMC algorithm, in which multiple chains are run in parallel. DE-MC solves an important problem in MCMC, namely that of choosing an appropriate scale and orientation for the jumping distribution. In DE-MC the jumps are simply a multiple of the differences of two random parameter vectors that are currently in the population. Simulations and examples illustrate the potential of DE-MC. The advantages of DE-MC over conventional MCMC are simplicity, speed of calculation and convergence, even for nearly collinear parameters and multimodal densities.
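The jump rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `demc_sweep`, the small Gaussian noise term, and the scale gamma = 2.38/sqrt(2d) used in the usage lines are assumptions chosen for illustration, and the target is a toy standard normal density.

```python
import math
import random


def demc_sweep(population, log_density, gamma, noise_sd, rng):
    """Update every chain once with a difference-vector proposal.

    population  : list of N parameter vectors (lists of floats), N >= 3
    log_density : function returning the log of the (unnormalised) target
    gamma       : multiple applied to the difference vector (assumed value)
    noise_sd    : std. dev. of a small Gaussian perturbation (assumption)
    """
    n = len(population)
    d = len(population[0])
    for i in range(n):
        # Choose two distinct chains, both different from chain i.
        others = [j for j in range(n) if j != i]
        r1, r2 = rng.sample(others, 2)
        # Jump = multiple of the difference of two current population members.
        proposal = [
            population[i][k]
            + gamma * (population[r1][k] - population[r2][k])
            + rng.gauss(0.0, noise_sd)
            for k in range(d)
        ]
        # Metropolis acceptance step on the log scale.
        log_ratio = log_density(proposal) - log_density(population[i])
        if rng.random() < math.exp(min(0.0, log_ratio)):
            population[i] = proposal
    return population


def log_std_normal(x):
    """Toy target: independent standard normal in every coordinate."""
    return -0.5 * sum(v * v for v in x)


rng = random.Random(1)
d, n_chains = 2, 6
chains = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n_chains)]
gamma = 2.38 / math.sqrt(2 * d)  # commonly quoted default, assumed here
for _ in range(200):
    chains = demc_sweep(chains, log_std_normal, gamma, 1e-4, rng)
```

Because the difference of two population members already reflects the scale and orientation of the target, no proposal covariance has to be tuned by hand, which is the point the abstract makes.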
Iteratio: Calculating environmental indicator values for species and relevés.
Question: Is it possible to translate vegetation maps into reliable thematic maps of site conditions? Method: This paper presents a new method, called Iteratio, by which a coherent spatial overview of specific environmental conditions can be obtained from a comprehensive vegetation survey of a specific area. Iteratio is a database application which calculates environmental indicator values for vegetation samples (relevés) on the basis of known indicator values of a limited number of plant species. The outcome is then linked to a digitized vegetation map (map of plant communities), which results in a spatial overview of site conditions. Iteratio requires the indicator values of a minimum of 10–20% of the species occurring. The species are given a relative weight according to their amplitudes: species with a narrow range are weighted more heavily, species with a broad range less heavily. Conclusion: The method presented here enables a coherent assessment of site conditions on the basis of a vegetation survey and the indicator values of a limited number of plant species.
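The weighting idea in this abstract can be sketched as a weighted average over the species in one relevé. This is only an illustration, not the Iteratio implementation: the function name, the data layout, and in particular the choice of weight = 1 / amplitude are assumptions, since the abstract states only that narrow-range species count more than broad-range ones.

```python
def releve_indicator_value(species_present, indicator_value, amplitude):
    """Weighted mean indicator value for one relevé.

    species_present : iterable of species names found in the relevé
    indicator_value : dict mapping species to a known indicator value
    amplitude       : dict mapping species to the width of its range (> 0)

    Species without a known indicator value are skipped. The weight
    1 / amplitude is a hypothetical choice made for this sketch.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for species in species_present:
        if species not in indicator_value:
            continue  # no indicator value known for this species
        weight = 1.0 / amplitude[species]
        weighted_sum += weight * indicator_value[species]
        weight_total += weight
    if weight_total == 0.0:
        return None  # no species with a known indicator value
    return weighted_sum / weight_total


# Hypothetical data: species "A" has a narrow range, species "B" a broad one.
values = {"A": 2.0, "B": 6.0}
ranges = {"A": 1.0, "B": 4.0}
estimate = releve_indicator_value(["A", "B", "C"], values, ranges)
```

In this hypothetical example the estimate lies closer to the value of the narrow-range species "A" than an unweighted mean would.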
Co-correspondence analysis: a new ordination method to relate two community compositions
A new ordination method, called co-correspondence analysis, is developed to relate two types of communities (e.g., a plant community and an animal community) sampled at a common set of sites in a direct way. The method improves on the simple, indirect approach of applying correspondence analysis (reciprocal averaging) to the separate species data sets and correlating the resulting ordination axes. Co-correspondence analysis maximizes the weighted covariance between weighted averaged species scores of one community and weighted averaged species scores of the other community. It thus attempts to identify the patterns that are common to both communities. Both a symmetric descriptive and an asymmetric predictive form are developed. The symmetric form relates to co-inertia analysis and the asymmetric, predictive form to partial least-squares regression. In two examples the predictive power of co-correspondence analysis is compared with that of canonical correspondence analyses on syntaxonomic and environmental data. In the first example, carabid beetles in roadside verges are shown to be more closely related to plant species composition than to vegetation structure (biomass, height, roughness, among others), and, in the second example, bryophytes in spring meadows are shown to be more closely related to the species composition of the vascular plants than to the measured water chemistry.
A theory of gradient analysis
The theory of gradient analysis is presented in this chapter, in which the heuristic techniques are integrated with regression, calibration, ordination and constrained ordination as distinct, well-defined statistical problems. The various techniques used for each type of problem are classified into families according to their implicit response model and the method used to estimate the parameters of the model. Three such families are considered. First, the family of standard statistical techniques based on the linear response model is dealt with, because these techniques are conceptually the simplest and provide a basis for what follows, even though their ecological application is restricted. Second, a family of somewhat more complex statistical techniques is outlined; these are formal extensions of the standard linear techniques and incorporate unimodal (Gaussian-like) response models explicitly. Finally, the family of heuristic techniques based on weighted averaging is considered. These are no more complex than the standard linear techniques, but implicitly fit a simple unimodal response model rather than a linear one. Ordination diagrams and their interpretation as biplots and joint plots are also covered. The chapter also discusses which response model to choose, whether to use direct or indirect gradient analysis, and, within direct gradient analysis, whether to use regression or constrained ordination.
Approximating a similarity matrix by a latent class model: A reappraisal of additive fuzzy clustering
Let Q be a given n×n symmetric matrix of similarities, with elements between 0 and 1. Fuzzy clustering results in a fuzzy assignment of individuals to K clusters. In additive fuzzy clustering, the n×K fuzzy membership matrix P is found by least-squares approximation of the off-diagonal elements of Q by inner products of rows of P. By contrast, kernelized fuzzy c-means is not least-squares and requires an additional fuzziness parameter. The aim is to popularize additive fuzzy clustering by interpreting it as a latent class model, whereby the elements of Q are modeled as the probability that two individuals share the same class on the basis of the assignment probability matrix P. Two new algorithms are provided, a brute force genetic algorithm (differential evolution) and an iterative row-wise quadratic programming algorithm, of which the latter is the more effective. Simulations showed that (1) the method usually has a unique solution, except in special cases, (2) both algorithms reached this solution from random restarts and (3) the number of clusters can be well estimated by AIC. Additive fuzzy clustering is computationally efficient and combines attractive features of both the vector model and the cluster model.
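The least-squares criterion described in this abstract can be written down in a few lines. The sketch below only evaluates the criterion for a given P; it does not implement either of the paper's fitting algorithms, and the function names are hypothetical.

```python
def shared_class_probability(P):
    """Model matrix: entry (i, j) is the inner product of rows i and j of P,
    read as the probability that individuals i and j share the same class."""
    n = len(P)
    return [
        [sum(a * b for a, b in zip(P[i], P[j])) for j in range(n)]
        for i in range(n)
    ]


def off_diagonal_loss(Q, P):
    """Sum of squared differences between Q and the model, skipping i == j."""
    model = shared_class_probability(P)
    n = len(Q)
    return sum(
        (Q[i][j] - model[i][j]) ** 2
        for i in range(n)
        for j in range(n)
        if i != j
    )


# Three individuals, two classes; each row of P is a membership
# distribution (nonnegative entries summing to 1).
P = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
Q = [[1.0, 1.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss = off_diagonal_loss(Q, P)
```

For this crisp example the similarities are reproduced exactly, so the loss is zero. With fuzzy rows such as [0.5, 0.5] the modelled similarity between two individuals falls between 0 and 1.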