A Multiscale Approach for Statistical Characterization of Functional Images
Increasingly, scientific studies yield functional image data, in which the observed data consist of sets of curves recorded on the pixels of the image. Examples include temporal brain response intensities measured by fMRI and NMR frequency spectra measured at each pixel. This article presents a new methodology for improving the characterization of pixels in functional imaging, formulated as a spatial curve clustering problem. Our method operates on curves as a unit. It is nonparametric and involves multiple stages: (i) wavelet thresholding, aggregation, and Neyman truncation to effectively reduce dimensionality; (ii) clustering based on an extended EM algorithm; and (iii) multiscale penalized dyadic partitioning to create a spatial segmentation. We motivate the different stages with theoretical considerations and arguments, and illustrate the overall procedure on simulated and real datasets. Our method appears to offer substantial improvements over monoscale pixel-wise methods. An appendix giving some theoretical justifications of the methodology, along with computer code, documentation, and the dataset, is available in the online supplements.
The effects of weather and climate change on dengue
There is much uncertainty about the future impact of climate change on vector-borne diseases. Such uncertainty reflects the difficulties in modelling the complex interactions between disease, climatic and socioeconomic determinants. We used a comprehensive panel dataset from Mexico covering 23 years of province-specific dengue reports across nine climatic regions to estimate the impact of weather on dengue, accounting for the effects of non-climatic factors.
Challenges of Big Data Analysis
Big Data bring new opportunities to modern society and challenges to data
scientists. On one hand, Big Data hold great promises for discovering subtle
population patterns and heterogeneities that are not possible with small-scale
data. On the other hand, the massive sample size and high dimensionality of Big
Data introduce unique computational and statistical challenges, including
scalability and storage bottleneck, noise accumulation, spurious correlation,
incidental endogeneity, and measurement errors. These challenges are
distinctive and require new computational and statistical paradigms. This
article gives an overview of the salient features of Big Data and how these
features drive paradigm change in statistical and computational methods as
well as computing architectures. We also provide various new perspectives on
Big Data analysis and computation. In particular, we emphasize the
viability of the sparsest solution in a high-confidence set and point out that
the exogeneity assumptions in most statistical methods for Big Data cannot be
validated due to incidental endogeneity. Violating them can lead to wrong
statistical inferences and consequently wrong scientific conclusions.
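The spurious correlation mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the article: the sample size, the dimensions, and the helper name `max_abs_correlation` are assumptions chosen purely for illustration. The response is drawn independently of every predictor, yet the largest absolute sample correlation grows as more predictors are considered.

```python
import numpy as np

def max_abs_correlation(X, y):
    """Largest absolute sample correlation between y and any column of X.

    Hypothetical helper for illustration only.
    """
    Xc = X - X.mean(axis=0)          # center each predictor
    yc = y - y.mean()                # center the response
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return float(np.max(np.abs(corr)))

# Assumed settings: n = 50 observations, up to 5000 independent predictors.
rng = np.random.default_rng(0)
n, p_max = 50, 5000
X = rng.standard_normal((n, p_max))
y = rng.standard_normal(n)           # independent of every column of X

# Use nested subsets of the same predictors so the comparison is like for like.
results = {p: max_abs_correlation(X[:, :p], y) for p in (10, 100, 1000, 5000)}
for p, r in results.items():
    print(f"p = {p:5d}  max |corr| = {r:.3f}")
```

Because each predictor set contains the previous one, the reported maximum can only stay the same or increase with the number of predictors, even though no predictor is truly related to the response.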
Diversification of myco-heterotrophic angiosperms: evidence from Burmanniaceae.
Background - Myco-heterotrophy evolved independently several times during angiosperm evolution. Although many species of myco-heterotrophic plants are highly endemic and long-distance dispersal seems unlikely, some genera are widely dispersed and have pantropical distributions, often with large disjunctions. Traditionally this has been interpreted as evidence for an old age of these taxa. However, due to their scarcity and highly reduced plastid genomes our understanding of the evolutionary histories of the angiosperm myco-heterotrophic groups is poor. Results - We provide a hypothesis for the diversification of the myco-heterotrophic family Burmanniaceae. Phylogenetic inference, combined with biogeographical analyses, molecular divergence time estimates, and diversification analyses suggest that Burmanniaceae originated in West Gondwana and started to diversify during the Late Cretaceous. Diversification and migration of the species-rich pantropical genera Burmannia and Gymnosiphon display congruent patterns. Diversification began during the Eocene, when global temperatures peaked and tropical forests occurred at low latitudes. Simultaneous migration from the New to the Old World in Burmannia and Gymnosiphon occurred via boreotropical migration routes. Subsequent Oligocene cooling and breakup of boreotropical flora ended New-Old World migration and caused a gradual decrease in diversification rate in Burmanniaceae. Conclusion - Our results indicate that extant diversity and pantropical distribution of myco-heterotrophic Burmanniaceae are the result of diversification and boreotropical migration during the Eocene when tropical rain forest expanded dramatically.
Coordinate descent algorithms for nonconvex penalized regression, with applications to biological feature selection
A number of variable selection methods have been proposed involving nonconvex
penalty functions. These methods, which include the smoothly clipped absolute
deviation (SCAD) penalty and the minimax concave penalty (MCP), have been
demonstrated to have attractive theoretical properties, but model fitting is
not a straightforward task, and the resulting solutions may be unstable. Here,
we demonstrate the potential of coordinate descent algorithms for fitting these
models, establishing theoretical convergence properties and demonstrating that
they are significantly faster than competing approaches. In addition, we
demonstrate the utility of convexity diagnostics to determine regions of the
parameter space in which the objective function is locally convex, even though
the penalty is not. Our simulation study and data examples indicate that
nonconvex penalties like MCP and SCAD are worthwhile alternatives to the lasso
in many applications. In particular, our numerical results suggest that MCP is
the preferred approach among the three methods.
Comment: Published in the Annals of Applied Statistics
(http://www.imstat.org/aoas/) at http://dx.doi.org/10.1214/10-AOAS388 by the
Institute of Mathematical Statistics (http://www.imstat.org).
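To make the idea in the abstract above concrete, here is a minimal Python sketch of cyclic coordinate descent for MCP-penalized least squares. It is an illustration under stated assumptions, not the authors' implementation: it assumes no intercept, columns of `X` scaled so that the sum of squares of each column divided by `n` equals 1, and `gamma > 1`. The function names and default values are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    return np.sign(z) * max(abs(z) - lam, 0.0)

def mcp_update(z, lam, gamma):
    """Univariate MCP solution for a standardized predictor (assumes gamma > 1)."""
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z  # beyond gamma * lam the penalty is flat, so no shrinkage

def mcp_coordinate_descent(X, y, lam, gamma=3.0, n_iter=100, tol=1e-8):
    """Cyclic coordinate descent; assumes each column satisfies x_j'x_j / n = 1."""
    n, p = X.shape
    beta = np.zeros(p)
    r = np.asarray(y, dtype=float).copy()    # residual y - X @ beta
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            z = X[:, j] @ r / n + beta[j]     # unpenalized univariate solution
            b_new = mcp_update(z, lam, gamma)
            delta = b_new - beta[j]
            if delta != 0.0:
                r -= delta * X[:, j]          # keep the residual in sync
                beta[j] = b_new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta

# Small orthogonal design: a strong first effect and a weak second effect.
X = np.array([[1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [1.0, -1.0]])
y = 5.0 * X[:, 0] + 0.2 * X[:, 1]
beta_hat = mcp_coordinate_descent(X, y, lam=1.0, gamma=3.0)
print(beta_hat)
```

In this toy design the large coefficient is left unshrunk while the small one is set to zero, which is the behaviour that distinguishes MCP from the lasso's uniform shrinkage.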
Locally adaptive smoothing with Markov random fields and shrinkage priors
We present a locally adaptive nonparametric curve fitting method that
operates within a fully Bayesian framework. This method uses shrinkage priors
to induce sparsity in order-k differences in the latent trend function,
providing a combination of local adaptation and global control. Using a scale
mixture of normals representation of shrinkage priors, we make explicit
connections between our method and kth order Gaussian Markov random field
smoothing. We call the resulting processes shrinkage prior Markov random fields
(SPMRFs). We use Hamiltonian Monte Carlo to approximate the posterior
distribution of model parameters because this method provides superior
performance in the presence of the high dimensionality and strong parameter
correlations exhibited by our models. We compare the performance of three prior
formulations using simulated data and find the horseshoe prior provides the
best compromise between bias and precision. We apply SPMRF models to two
benchmark data examples frequently used to test nonparametric methods. We find
that this method is flexible enough to accommodate a variety of data generating
models and offers the adaptive properties and computational tractability to
make it a useful addition to the Bayesian nonparametric toolbox.
Comment: 38 pages, to appear in Bayesian Analysis
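The notion of sparsity in order-k differences from the abstract above can be illustrated with a short Python sketch. It shows only the differencing step, not the SPMRF model or its priors, and the example trend values are assumptions made up for illustration.

```python
import numpy as np

# Hypothetical latent trend: linear growth that flattens after one kink.
theta = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0])

# Order-k differences for k = 1 and k = 2, computed with numpy.diff.
d1 = np.diff(theta, n=1)
d2 = np.diff(theta, n=2)

print("first differences: ", d1)
print("second differences:", d2)
print("nonzero second differences:", np.count_nonzero(d2))
```

A piecewise-linear trend has second differences that are zero everywhere except at the kink, so a prior that favours sparse order-2 differences can stay smooth elsewhere while still allowing an abrupt local change.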