8,196 research outputs found

    A Multiscale Approach for Statistical Characterization of Functional Images

    Increasingly, scientific studies yield functional image data, in which the observed data consist of sets of curves recorded on the pixels of the image. Examples include temporal brain response intensities measured by fMRI and NMR frequency spectra measured at each pixel. This article presents a new methodology for improving the characterization of pixels in functional imaging, formulated as a spatial curve clustering problem. Our method operates on curves as a unit. It is nonparametric and involves multiple stages: (i) wavelet thresholding, aggregation, and Neyman truncation to effectively reduce dimensionality; (ii) clustering based on an extended EM algorithm; and (iii) multiscale penalized dyadic partitioning to create a spatial segmentation. We motivate the different stages with theoretical considerations and arguments, and illustrate the overall procedure on simulated and real datasets. Our method appears to offer substantial improvements over monoscale pixel-wise methods. An appendix giving some theoretical justifications of the methodology, together with computer code, documentation, and the dataset, is available in the online supplements.

    The effects of weather and climate change on dengue

    There is much uncertainty about the future impact of climate change on vector-borne diseases. Such uncertainty reflects the difficulties in modelling the complex interactions between disease, climatic and socioeconomic determinants. We used a comprehensive panel dataset from Mexico covering 23 years of province-specific dengue reports across nine climatic regions to estimate the impact of weather on dengue, accounting for the effects of non-climatic factors.

    Challenges of Big Data Analysis

    Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that cannot be detected with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity, and measurement errors. These challenges are distinctive and require a new computational and statistical paradigm. This article gives an overview of the salient features of Big Data and of how these features drive a paradigm change in statistical and computational methods as well as in computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.

    Diversification of myco-heterotrophic angiosperms: evidence from Burmanniaceae.

    Background - Myco-heterotrophy evolved independently several times during angiosperm evolution. Although many species of myco-heterotrophic plants are highly endemic and long-distance dispersal seems unlikely, some genera are widely dispersed and have pantropical distributions, often with large disjunctions. Traditionally this has been interpreted as evidence for an old age of these taxa. However, due to their scarcity and highly reduced plastid genomes, our understanding of the evolutionary histories of the angiosperm myco-heterotrophic groups is poor. Results - We provide a hypothesis for the diversification of the myco-heterotrophic family Burmanniaceae. Phylogenetic inference, combined with biogeographical analyses, molecular divergence time estimates, and diversification analyses, suggests that Burmanniaceae originated in West Gondwana and started to diversify during the Late Cretaceous. Diversification and migration of the species-rich pantropical genera Burmannia and Gymnosiphon display congruent patterns. Diversification began during the Eocene, when global temperatures peaked and tropical forests occurred at low latitudes. Simultaneous migration from the New to the Old World in Burmannia and Gymnosiphon occurred via boreotropical migration routes. Subsequent Oligocene cooling and breakup of boreotropical flora ended New-Old World migration and caused a gradual decrease in diversification rate in Burmanniaceae. Conclusion - Our results indicate that the extant diversity and pantropical distribution of myco-heterotrophic Burmanniaceae are the result of diversification and boreotropical migration during the Eocene, when tropical rain forest expanded dramatically.

    Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection

    A number of variable selection methods have been proposed involving nonconvex penalty functions. These methods, which include the smoothly clipped absolute deviation (SCAD) penalty and the minimax concave penalty (MCP), have been demonstrated to have attractive theoretical properties, but model fitting is not a straightforward task, and the resulting solutions may be unstable. Here, we demonstrate the potential of coordinate descent algorithms for fitting these models, establishing theoretical convergence properties and demonstrating that they are significantly faster than competing approaches. In addition, we demonstrate the utility of convexity diagnostics to determine regions of the parameter space in which the objective function is locally convex, even though the penalty is not. Our simulation study and data examples indicate that nonconvex penalties like MCP and SCAD are worthwhile alternatives to the lasso in many applications. In particular, our numerical results suggest that MCP is the preferred approach among the three methods. Comment: Published at http://dx.doi.org/10.1214/10-AOAS388 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Locally adaptive smoothing with Markov random fields and shrinkage priors

    We present a locally adaptive nonparametric curve fitting method that operates within a fully Bayesian framework. This method uses shrinkage priors to induce sparsity in order-k differences in the latent trend function, providing a combination of local adaptation and global control. Using a scale mixture of normals representation of shrinkage priors, we make explicit connections between our method and kth order Gaussian Markov random field smoothing. We call the resulting processes shrinkage prior Markov random fields (SPMRFs). We use Hamiltonian Monte Carlo to approximate the posterior distribution of model parameters because this method provides superior performance in the presence of the high dimensionality and strong parameter correlations exhibited by our models. We compare the performance of three prior formulations using simulated data and find the horseshoe prior provides the best compromise between bias and precision. We apply SPMRF models to two benchmark data examples frequently used to test nonparametric methods. We find that this method is flexible enough to accommodate a variety of data generating models and offers the adaptive properties and computational tractability to make it a useful addition to the Bayesian nonparametric toolbox. Comment: 38 pages, to appear in Bayesian Analysis