Modeling Binary Time Series Using Gaussian Processes with Application to Predicting Sleep States
Motivated by the problem of predicting sleep states, we develop a mixed
effects model for binary time series with a stochastic component represented by
a Gaussian process. The fixed component captures the effects of covariates on
the binary-valued response. The Gaussian process captures the residual
variations in the binary response that are not explained by covariates and past
realizations. We develop a frequentist modeling framework that provides
efficient inference and more accurate predictions. Results demonstrate improved
prediction rates over existing approaches such as logistic regression,
generalized additive mixed models, models for ordinal data, gradient boosting,
decision trees and random forests. Using our proposed model,
we show that previous sleep state and heart rates are significant predictors
for future sleep states. Simulation studies also show that our proposed method
is promising and robust. To handle computational complexity, we utilize Laplace
approximation, golden section search and successive parabolic interpolation.
With this paper, we also submit an R-package (HIBITS) that implements the
proposed procedure.
Comment: Journal of Classification (2018)
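The abstract above names golden section search as one of the one-dimensional optimizers used to keep computation manageable. As a rough illustration only, here is a minimal Python sketch of that technique on a toy objective. It is not the HIBITS implementation; the function name `golden_section_search`, its parameters, and the quadratic objective are assumptions made for this example.

```python
import math


def golden_section_search(f, a, b, tol=1e-6):
    """Approximate the minimizer of a unimodal function f on [a, b].

    Illustrative sketch only; assumes f has a single minimum in [a, b].
    """
    inv_phi = (math.sqrt(5) - 1) / 2  # 1/phi, about 0.618

    # Two interior points that split [a, b] in golden-ratio proportions.
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)

    while abs(b - a) > tol:
        if fc < fd:
            # The minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)

    return (a + b) / 2


def toy_objective(x):
    # Hypothetical stand-in for a one-parameter criterion such as a
    # negative log-likelihood; its minimum is at x = 2.
    return (x - 2.0) ** 2 + 1.0


x_min = golden_section_search(toy_objective, 0.0, 5.0)
```

Each iteration reuses one of the two previous function evaluations, so only one new evaluation of the objective is needed per step. That matters when every evaluation is expensive, as it would be for a likelihood computed through an approximation.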
Used-habitat calibration plots: a new procedure for validating species distribution, resource selection, and step-selection models
“Species distribution modeling” was recently ranked as one of the top five “research fronts” in ecology and the environmental sciences by ISI's Essential Science Indicators (Renner and Warton 2013), reflecting the importance of predicting how species distributions will respond to anthropogenic change. Unfortunately, species distribution models (SDMs) often perform poorly when applied to novel environments. Compounding this problem is the shortage of methods for evaluating SDMs (hence, we may be getting our predictions wrong and not even know it). Traditional methods for validating SDMs quantify a model's ability to classify locations as used or unused. Instead, we propose to focus on how well SDMs can predict the characteristics of used locations. This subtle shift in viewpoint leads to a more natural and informative evaluation and validation of models across the entire spectrum of SDMs. Through a series of examples, we show how simple graphical methods can help with three fundamental challenges of habitat modeling: identifying missing covariates, non-linearity, and multicollinearity. Identifying habitat characteristics that are not well predicted by the model can provide insights into variables affecting the distribution of species, suggest appropriate model modifications, and ultimately improve the reliability and generality of conservation and management recommendations.
Bayesian semiparametric analysis for two-phase studies of gene-environment interaction
The two-phase sampling design is a cost-efficient way of collecting expensive
covariate information on a judiciously selected subsample. It is natural to
apply such a strategy for collecting genetic data in a subsample enriched for
exposure to environmental factors for gene-environment interaction (G x E)
analysis. In this paper, we consider two-phase studies of G x E interaction
where phase I data are available on exposure, covariates and disease status.
Stratified sampling is done to prioritize individuals for genotyping at phase
II conditional on disease and exposure. We consider a Bayesian analysis based
on the joint retrospective likelihood of phases I and II data. We address
several important statistical issues: (i) we consider a model with multiple
genes, environmental factors and their pairwise interactions. We employ a
Bayesian variable selection algorithm to reduce the dimensionality of this
potentially high-dimensional model; (ii) we use the assumption of gene-gene and
gene-environment independence to trade off between bias and efficiency for
estimating the interaction parameters through use of hierarchical priors
reflecting this assumption; (iii) we posit a flexible model for the joint
distribution of the phase I categorical variables using the nonparametric Bayes
construction of Dunson and Xing [J. Amer. Statist. Assoc. 104 (2009)
1042-1051].
Comment: Published at http://dx.doi.org/10.1214/12-AOAS599 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)