Machine Learning Methods with Time Series Dependence
We introduce the PrAGMaTiSt: Prediction and Analysis for Generalized Markov Time Series of States, a methodology that enhances classification algorithms so that they can accommodate sequential data. The PrAGMaTiSt can model a wide variety of time series structures, including arbitrary-order Markov chains, generalized and transition-dependent generalized Markov chains, and variable-length Markov chains. We subject our method, as well as competitor methods, to a rigorous set of simulations in order to understand its properties. We find that, for very low or high levels of noise, complexity of the underlying classification problem, or complexity of the time series structure, simple methods that either ignore the time series structure or model it as first-order Markov can perform as well as or better than more complicated models even when the latter are true; in moderate settings, however, the more complicated models tend to dominate. Furthermore, even with little training data, the more complicated models perform about as well as the simple ones when the latter are true. We also apply the PrAGMaTiSt to the important problem of sleep scoring of mice based on video data. Our procedure provides more accurate differentiation of the NREM and REM sleep states than any previous method in the field. The improvements in REM classification are particularly beneficial, as the dynamics of REM sleep are of special interest to sleep scientists. Furthermore, our procedure provides substantial improvements in capturing the sleep-state bout duration distributions relative to other methods.
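The abstract's simplest baseline, modeling the label sequence as a first-order Markov chain on top of an arbitrary classifier, can be sketched as follows. This is not the PrAGMaTiSt itself, only an illustration of the first-order special case: estimate a transition matrix from labeled training sequences, then combine the classifier's per-observation state probabilities with those transitions via Viterbi decoding. All function names here are hypothetical.

```python
import numpy as np

def fit_transitions(state_seqs, n_states):
    """Estimate a first-order Markov transition matrix from labeled
    state sequences, with add-one smoothing."""
    counts = np.ones((n_states, n_states))
    for seq in state_seqs:
        for a, b in zip(seq[:-1], seq[1:]):
            counts[a, b] += 1
    return counts / counts.sum(axis=1, keepdims=True)

def viterbi(emission_probs, trans, init):
    """Most likely state path given per-observation class probabilities
    (e.g., from any probabilistic classifier) and Markov transitions."""
    T, K = emission_probs.shape
    logp = np.log(emission_probs + 1e-12)
    logt = np.log(trans)
    score = np.log(init) + logp[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + logt      # K x K: prev state -> next state
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + logp[t]
    path = [score.argmax()]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return path[::-1]
```

With "sticky" transitions (high self-transition probability, as in sleep bouts), the decoder smooths away isolated frames that the classifier weakly mislabels, which is one mechanism behind the improved bout duration distributions mentioned above.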
A statistical analysis of multiple temperature proxies: Are reconstructions of surface temperatures over the last 1000 years reliable?
Predicting historic temperatures based on tree rings, ice cores, and other
natural proxies is a difficult endeavor. The relationship between proxies and
temperature is weak and the number of proxies is far larger than the number of
target data points. Furthermore, the data contain complex spatial and temporal
dependence structures which are not easily captured with simple models. In this
paper, we assess the reliability of such reconstructions and their statistical
significance against various null models. We find that the proxies do not
predict temperature significantly better than random series generated
independently of temperature. Furthermore, various model specifications that
perform similarly at predicting temperature produce extremely different
historical backcasts. Finally, the proxies seem unable to forecast the high
levels of and sharp run-up in temperature in the 1990s either in-sample or from
contiguous holdout blocks, thus casting doubt on their ability to predict such
phenomena if in fact they occurred several hundred years ago. We propose our
own reconstruction of Northern Hemisphere average annual land temperature over
the last millennium, assess its reliability, and compare it to those from the
climate science literature. Our model provides a similar reconstruction but has
much wider standard errors, reflecting the weak signal and large uncertainty
encountered in this setting.

Comment: Published at http://dx.doi.org/10.1214/10-AOAS398 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org).
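The null-model comparison described above can be sketched in miniature: fit a regression of temperature on the proxies over a training period, score it on a contiguous holdout block, and compare against many sets of pseudo-proxies (here AR(1) noise) generated independently of temperature. Everything below is simulated and illustrative; the actual paper uses real proxy networks and a richer set of null models.

```python
import numpy as np

rng = np.random.default_rng(0)

def holdout_rmse(X, y, train):
    """OLS fit on the training block, RMSE on the contiguous holdout block."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd[train], y[train], rcond=None)
    resid = y[~train] - Xd[~train] @ beta
    return float(np.sqrt(np.mean(resid ** 2)))

n, p = 150, 10
temp = np.cumsum(rng.normal(size=n)) * 0.1               # smooth "temperature" target
proxies = 0.2 * temp[:, None] + rng.normal(size=(n, p))  # weak real signal
train = np.arange(n) < 100                               # contiguous holdout block

rmse_real = holdout_rmse(proxies, temp, train)

# Null benchmark: AR(1) pseudo-proxies generated independently of temperature
null_rmses = []
for _ in range(200):
    z = np.zeros((n, p))
    for t in range(1, n):
        z[t] = 0.8 * z[t - 1] + rng.normal(size=p)
    null_rmses.append(holdout_rmse(z, temp, train))

# Fraction of null draws that predict at least as well as the real proxies;
# a large fraction means the proxies add little over noise
p_value = float(np.mean(np.array(null_rmses) <= rmse_real))
```

The paper's finding corresponds to this fraction being uncomfortably large: random series with realistic autocorrelation "predict" held-out temperature about as well as the proxies do.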
A Bayesian Variable Selection Approach to Major League Baseball Hitting Metrics
Numerous statistics have been proposed for the measure of offensive ability
in major league baseball. While some of these measures may offer moderate
predictive power in certain situations, it is unclear which simple offensive
metrics are the most reliable or consistent. We address this issue with a
Bayesian hierarchical model for variable selection to capture which offensive
metrics are most predictive within players across time. Our sophisticated
methodology allows for full estimation of the posterior distributions for our
parameters and automatically adjusts for multiple testing, providing a distinct
advantage over alternative approaches. We implement our model on a set of 50
different offensive metrics and discuss our results in the context of
comparison to other variable selection techniques. We find that 33/50 metrics
demonstrate signal. However, these metrics are highly correlated with one
another and related to traditional notions of performance (e.g., plate
discipline, power, and ability to make contact).
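The paper's hierarchical model is more elaborate, but the core idea of Bayesian variable selection with automatic multiplicity adjustment can be illustrated with a simpler, closed-form device: enumerate all 2^p models under Zellner's g-prior (which gives an analytic Bayes factor against the null model) and average the results into posterior inclusion probabilities. This is a sketch of the general technique, not the authors' method.

```python
import itertools
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit with intercept."""
    Xd = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def inclusion_probs(X, y, g=None):
    """Posterior inclusion probabilities by enumerating all 2^p models
    under Zellner's g-prior: BF(M : null) =
    (1+g)^((n-1-k)/2) / (1 + g(1-R^2))^((n-1)/2), uniform prior on models."""
    n, p = X.shape
    g = g if g is not None else float(n)   # unit-information prior
    weights, models = [], []
    for k in range(p + 1):
        for idx in itertools.combinations(range(p), k):
            if k == 0:
                bf = 1.0
            else:
                r2 = r_squared(X[:, idx], y)
                bf = ((1 + g) ** ((n - 1 - k) / 2)
                      / (1 + g * (1 - r2)) ** ((n - 1) / 2))
            weights.append(bf)
            models.append(set(idx))
    w = np.array(weights) / sum(weights)
    return np.array([sum(w[i] for i, m in enumerate(models) if j in m)
                     for j in range(p)])
```

Predictors with real signal accumulate inclusion probability near one, while noise predictors are penalized automatically, which is the multiple-testing adjustment the abstract alludes to (in the paper, handled hierarchically across players and metrics rather than by enumeration).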
Abandon Statistical Significance
We discuss problems the null hypothesis significance testing (NHST) paradigm
poses for replication and more broadly in the biomedical and social sciences as
well as how these problems remain unresolved by proposals involving modified
p-value thresholds, confidence intervals, and Bayes factors. We then discuss
our own proposal, which is to abandon statistical significance. We recommend
dropping the NHST paradigm--and the p-value thresholds intrinsic to it--as the
default statistical paradigm for research, publication, and discovery in the
biomedical and social sciences. Specifically, we propose that the p-value be
demoted from its threshold screening role and instead, treated continuously, be
considered along with currently subordinate factors (e.g., related prior
evidence, plausibility of mechanism, study design and data quality, real world
costs and benefits, novelty of finding, and other factors that vary by research
domain) as just one among many pieces of evidence. We have no desire to "ban"
p-values or other purely statistical measures. Rather, we believe that such
measures should not be thresholded and that, thresholded or not, they should
not take priority over the currently subordinate factors. We also argue that it
seldom makes sense to calibrate evidence as a function of p-values or other
purely statistical measures. We offer recommendations for how our proposal can
be implemented in the scientific publication process as well as in statistical
decision making more broadly.
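One standard numerical illustration of why threshold screening misleads (the numbers below are purely illustrative): two estimates can fall on opposite sides of the p = 0.05 line while being statistically indistinguishable from each other.

```python
from math import erfc, sqrt

def two_sided_p(est, se):
    """Two-sided normal-theory p-value for an estimate against zero."""
    return erfc(abs(est) / se / sqrt(2))

# Two hypothetical studies of the same effect
p1 = two_sided_p(25.0, 10.0)   # z = 2.5, p ~ 0.012: "significant"
p2 = two_sided_p(10.0, 10.0)   # z = 1.0, p ~ 0.317: "not significant"

# Yet the *difference* between the two estimates is itself unremarkable
p_diff = two_sided_p(25.0 - 10.0, sqrt(10.0 ** 2 + 10.0 ** 2))  # p ~ 0.29
```

Treated continuously, the two studies provide broadly similar evidence; only the threshold manufactures a qualitative disagreement between them.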
Decision Stages and Asymmetries in Regular Retail Price Pass-Through
We study the pass-through of wholesale price changes onto regular retail prices using an unusually detailed data set obtained from a major retailer. We model pass-through as a two-stage decision process that reflects both whether and how much to change the regular retail price. We show that pass-through is strongly asymmetric with respect to wholesale price increases versus decreases. Wholesale price increases are passed through to regular retail prices 70% of the time, while wholesale price decreases are passed through only 9% of the time. Pass-through is also asymmetric with respect to the magnitude of the wholesale price change, with the magnitude affecting the response to wholesale price increases but not decreases. Finally, we show that covariates such as private label versus national brand, 99-cent price endings, and the time since the last wholesale price change have a much stronger impact on the first stage of the decision process (i.e., whether to change the regular retail price) than on the second stage (i.e., how much to change the regular retail price).
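The two-stage structure can be sketched as a minimal simulation. The stage-one pass-through rates (70% for increases, 9% for decreases) come from the abstract; the stage-two magnitude rule and all other numbers are hypothetical stand-ins for the paper's estimated model.

```python
import numpy as np

rng = np.random.default_rng(1)

def retail_response(dw):
    """Two-stage decision: (1) whether to change the regular retail price,
    (2) how much, conditional on changing.  Stage-one rates follow the
    abstract's asymmetry; the stage-two magnitudes are purely illustrative."""
    if dw > 0:
        change = rng.random() < 0.70   # increases passed through 70% of the time
    else:
        change = rng.random() < 0.09   # decreases passed through 9% of the time
    if not change:
        return 0.0
    # Stage 2: magnitude tracks the size of increases but not of decreases
    return dw if dw > 0 else -0.10     # flat markdown for decreases (hypothetical)

dws = rng.normal(0.0, 0.25, size=20000)          # simulated wholesale changes
resp = np.array([retail_response(d) for d in dws])

rate_up = float(np.mean(resp[dws > 0] != 0))     # empirical pass-through, increases
rate_down = float(np.mean(resp[dws < 0] != 0))   # empirical pass-through, decreases
```

Separating the "whether" and "how much" decisions in this way is what lets covariates act differently on the two stages, the paper's final finding.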