bsamGP: An R Package for Bayesian Spectral Analysis Models Using Gaussian Process Priors
The Bayesian spectral analysis model (BSAM) is a powerful tool for semiparametric regression and density estimation based on the spectral representation of Gaussian process priors. The bsamGP package for R provides a comprehensive set of programs implementing fully Bayesian semiparametric methods based on BSAM. Currently, bsamGP includes semiparametric additive models for regression, generalized models, and density estimation. In particular, bsamGP handles constrained regression models with monotone, convex/concave, S-shaped, and U-shaped functions by modeling the derivatives of regression functions as squared Gaussian processes. bsamGP also contains Bayesian model selection procedures for testing the adequacy of a parametric model relative to a nonspecific semiparametric alternative and for testing the existence of a shape restriction. To maximize computational efficiency, the posterior sampling algorithms of all models are carried out in compiled Fortran code. The package is illustrated through Bayesian semiparametric analyses of synthetic and benchmark data sets.
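The shape-restriction device mentioned above can be illustrated outside the package: if the derivative of a regression function is modeled as a squared Gaussian process, the function itself is monotone by construction. A minimal NumPy sketch of this idea (not using bsamGP; the kernel choice and all names are illustrative):

```python
import numpy as np

def rbf_kernel(x, lengthscale=0.3, var=1.0):
    # Squared-exponential covariance matrix on a 1-D grid.
    d = x[:, None] - x[None, :]
    return var * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
K = rbf_kernel(x) + 1e-8 * np.eye(x.size)  # jitter for numerical stability

# Draw g ~ GP(0, K); its square is a nonnegative "derivative" process.
g = rng.multivariate_normal(np.zeros(x.size), K)
deriv = g ** 2

# f(x) = integral_0^x g(u)^2 du, approximated with the trapezoid rule;
# squaring forces deriv >= 0, so f is monotone nondecreasing.
dx = x[1] - x[0]
f = np.concatenate([[0.0], np.cumsum(0.5 * (deriv[1:] + deriv[:-1]) * dx)])
```

Convexity and S-shape constraints follow the same pattern by placing the squared process on a higher derivative.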
Quality vs. Quantity of Data in Contextual Decision-Making: Exact Analysis under Newsvendor Loss
When building datasets, one needs to invest time, money and energy to either
aggregate more data or to improve their quality. The most common practice
favors quantity over quality without necessarily quantifying the trade-off that
emerges. In this work, we study data-driven contextual decision-making and the
performance implications of quality and quantity of data. We focus on
contextual decision-making with a Newsvendor loss. This loss is that of a
central capacity planning problem in Operations Research, but also that
associated with quantile regression. We consider a model in which outcomes
observed in similar contexts have similar distributions and analyze the
performance of a classical class of kernel policies that weight data according
to their similarity in a contextual space. We develop a series of results that
lead to an exact characterization of the worst-case expected regret of these
policies. This exact characterization applies to any sample size and any
observed contexts. The model we develop is flexible and captures the case of
partially observed contexts. This exact analysis unveils new
structural insights into the learning behavior of uniform kernel methods: i) the
specialized analysis yields very large improvements in the quantification of
performance compared to state-of-the-art general-purpose bounds; ii) we show an
important non-monotonicity of performance as a function of data size that is not
captured by previous bounds; and iii) we show that in some regimes, a small
increase in the quality of the data can dramatically reduce the number of
samples required to reach a performance target. All in all, our work
demonstrates that it is possible to quantify precisely the interplay
of data quality, data quantity, and performance in a central problem class. It
also highlights the need for problem-specific bounds in order to understand the
trade-offs at play.
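A concrete instance of the policy class studied here: a uniform-kernel newsvendor decision keeps only observations whose context lies within a bandwidth of the target context and orders the critical-ratio quantile of their outcomes, which minimizes the empirical newsvendor loss on the selected sample. A hedged Python sketch (the bandwidth, the fallback rule, and all names are illustrative, not from the paper):

```python
import numpy as np

def uniform_kernel_newsvendor(contexts, demands, x, bandwidth, underage, overage):
    """Order quantity at context x under a uniform (box) kernel policy.

    Only samples with ||context - x|| <= bandwidth receive positive weight;
    the decision is the empirical critical-ratio quantile of their demands.
    """
    contexts = np.atleast_2d(contexts)
    demands = np.asarray(demands)
    critical_ratio = underage / (underage + overage)  # target quantile level
    mask = np.linalg.norm(contexts - np.asarray(x), axis=1) <= bandwidth
    if not mask.any():
        # Illustrative fallback: use all data when no context is close enough.
        return float(np.quantile(demands, critical_ratio))
    return float(np.quantile(demands[mask], critical_ratio))

# Toy usage: demand grows with a 1-D context, so the local quantile
# near x = 0.8 sits well above the unconditional one.
rng = np.random.default_rng(1)
ctx = rng.uniform(0.0, 1.0, size=(500, 1))
dem = 10 * ctx[:, 0] + rng.normal(0.0, 1.0, size=500)
q = uniform_kernel_newsvendor(ctx, dem, x=[0.8], bandwidth=0.1,
                              underage=3.0, overage=1.0)
```

Shrinking the bandwidth trades bias (contexts become more similar) against variance (fewer samples receive weight), which is the quality/quantity tension the abstract analyzes.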
Estimation of an Order Book Dependent Hawkes Process for Large Datasets
A point process for event arrivals in high frequency trading is presented.
The intensity is the product of a Hawkes process and high dimensional functions
of covariates derived from the order book. Conditions for stationarity of the
process are stated. An algorithm is presented to estimate the model even in the
presence of billions of data points, possibly mapping covariates into a high
dimensional space. The large sample size can be common for high frequency data
applications using multiple liquid instruments. Convergence of the algorithm is
shown, consistency results are established under weak conditions, and a test
statistic to assess out-of-sample performance of different model specifications
is suggested. The methodology is applied to the study of four stocks that trade
on the New York Stock Exchange (NYSE). The out-of-sample testing procedure
suggests that capturing the nonlinearity of the order book information adds
value to the self-exciting nature of high frequency trading events.
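The intensity structure described above can be sketched as a self-exciting Hawkes term multiplied by a function of order-book covariates. A small Python sketch, assuming an exponential excitation kernel and a queue-imbalance covariate (both are illustrative choices, not the paper's specification); for this kernel, a branching ratio alpha/beta < 1 is the usual stationarity condition for the Hawkes part:

```python
import numpy as np

def hawkes_intensity(t, event_times, mu, alpha, beta):
    # Baseline plus exponentially decaying excitation from past events:
    # lambda_H(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i)).
    past = event_times[event_times < t]
    return mu + alpha * np.exp(-beta * (t - past)).sum()

def covariate_factor(bid_size, ask_size):
    # Illustrative order-book covariate: a positive function of the
    # queue imbalance (bid - ask) / (bid + ask); more bid pressure
    # scales the arrival intensity up.
    imbalance = (bid_size - ask_size) / (bid_size + ask_size)
    return np.exp(imbalance)

mu, alpha, beta = 0.2, 0.8, 1.5
assert alpha / beta < 1  # branching ratio below one: Hawkes part is stationary

events = np.array([0.1, 0.4, 0.9])
lam = hawkes_intensity(1.0, events, mu, alpha, beta) * covariate_factor(120, 80)
```

For billions of events, the exponential kernel matters computationally: the excitation sum admits the usual recursive update per event, so the intensity can be evaluated in a single pass rather than by re-summing history.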