54 research outputs found

    Bootstrap schemes for time series (in Russian)

    We review and compare block, sieve and local bootstraps for time series and thereby illuminate theoretical aspects of the procedures as well as their performance on finite-sample data. Our view is selective, with the intention of providing a new and fair picture of some particular aspects of bootstrapping time series. The generality of the block bootstrap is contrasted with the sieve bootstrap. We discuss implementational advantages and disadvantages, and argue that the sieve often outperforms the block method. Local bootstraps, designed for nonparametric smoothing problems, are easy to use and implement but in some cases exhibit low performance.
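
    To make the block method concrete, here is a minimal moving-block bootstrap sketch for the sample mean (not taken from the paper); the block length, replicate count and AR(1) example data are illustrative choices.

        import numpy as np

        def block_bootstrap_mean(x, block_len=20, n_boot=1000, rng=None):
            """Moving-block bootstrap distribution of the sample mean."""
            rng = np.random.default_rng(rng)
            x = np.asarray(x, dtype=float)
            n = len(x)
            n_blocks = int(np.ceil(n / block_len))
            starts = np.arange(n - block_len + 1)   # all overlapping block start points
            means = np.empty(n_boot)
            for b in range(n_boot):
                picks = rng.choice(starts, size=n_blocks, replace=True)
                series = np.concatenate([x[s:s + block_len] for s in picks])[:n]
                means[b] = series.mean()
            return means

        # usage: bootstrap standard error of the mean of a simulated AR(1) series
        rng = np.random.default_rng(0)
        x = np.zeros(500)
        for t in range(1, 500):
            x[t] = 0.6 * x[t - 1] + rng.standard_normal()
        print(block_bootstrap_mean(x, block_len=25).std())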

    Model Selection over Partially Ordered Sets

    In problems such as variable selection and graph estimation, models are characterized by Boolean logical structure such as the presence or absence of a variable or an edge. Consequently, false positive and false negative errors can be specified as the number of variables or edges that are incorrectly included or excluded in an estimated model. However, there are several other problems, such as ranking, clustering, and causal inference, in which the associated model classes do not admit transparent notions of false positive and false negative errors due to the lack of an underlying Boolean logical structure. In this paper, we present a generic approach to endow a collection of models with partial order structure, which leads to a hierarchical organization of model classes as well as natural analogs of false positive and false negative errors. We describe model selection procedures that provide false positive error control in our general setting, and we illustrate their utility with numerical experiments.
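
    For orientation only, a small sketch of the classical special case that the paper generalizes: in variable selection the models are subsets of variables ordered by inclusion, and false positives/negatives are counted by set differences. The function and example sets below are illustrative, not the paper's general construction.

        # Classical special case: models = subsets of variables, ordered by inclusion.
        # False positives / negatives are the set differences with the true model.
        def fp_fn(selected, true):
            selected, true = set(selected), set(true)
            fp = len(selected - true)   # variables included but not in the true model
            fn = len(true - selected)   # true variables that were missed
            return fp, fn

        print(fp_fn(selected={1, 2, 5}, true={1, 3}))  # (2, 1)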

    PENALIZED LIKELIHOOD AND BAYESIAN METHODS FOR SPARSE CONTINGENCY TABLES: AN ANALYSIS OF ALTERNATIVE SPLICING IN FULL-LENGTH cDNA LIBRARIES

    We develop methods to perform model selection and parameter estimation in log-linear models for the analysis of sparse contingency tables, to study the interaction of two or more factors. Typically, datasets arising from so-called full-length cDNA libraries, in the context of alternatively spliced genes, lead to such sparse contingency tables. Maximum likelihood estimation of log-linear model coefficients breaks down because of zero cell entries. Therefore new methods are required to estimate the coefficients and to perform model selection. Our suggestions include computationally efficient penalization (Lasso-type) approaches as well as Bayesian methods using MCMC. We compare these procedures in a simulation study, and we apply the proposed methods to full-length cDNA libraries, yielding valuable insight into the biological process of alternative splicing.
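
    A rough sketch of the Lasso-type idea, not the authors' implementation: an L1-penalized Poisson log-linear model fitted by proximal gradient descent on a toy 2x2 table; the design, penalty level, step size and iteration count are placeholder choices.

        import numpy as np

        def soft_threshold(z, t):
            return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

        def lasso_poisson(X, y, lam=1.0, step=5e-3, n_iter=10000):
            """L1-penalized Poisson log-linear regression via proximal gradient."""
            beta = np.zeros(X.shape[1])
            for _ in range(n_iter):
                mu = np.exp(X @ beta)          # fitted cell means
                grad = X.T @ (mu - y)          # gradient of the Poisson negative log-likelihood
                beta = soft_threshold(beta - step * grad, step * lam)
            return beta

        # toy 2x2 contingency table: intercept, two main effects, one interaction
        levels = [(a, b) for a in (0, 1) for b in (0, 1)]
        X = np.array([[1.0, a, b, a * b] for a, b in levels])
        rng = np.random.default_rng(1)
        y = rng.poisson(np.exp(X @ np.array([1.5, 0.8, -0.5, 0.0])))
        print(lasso_poisson(X, y, lam=0.5))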

    Sieve bootstrap for time series

    We study a bootstrap method which is based on the method of sieves. A linear process is approximated by a sequence of autoregressive processes of order p = p(n), where p(n) → ∞, p(n) = o(n) as the sample size n → ∞. For given data, we then estimate such an AR(p(n)) model and generate a bootstrap sample by resampling from the residuals. This sieve bootstrap enjoys a nice nonparametric property. We show its consistency for a class of nonlinear estimators and compare the procedure with the blockwise bootstrap, which has been proposed by Künsch (1989). In particular, the sieve bootstrap variance of the mean is shown to have a better rate of convergence if the dependence between separated values of the underlying process decreases sufficiently fast with growing separation. Finally, a simulation study helps to illustrate the advantages and disadvantages of the sieve compared to the blockwise bootstrap.
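
    A minimal numpy sketch of the sieve bootstrap for the sample mean, assuming an AR(p) fitted by least squares on the centred series; the order p, burn-in length and replicate count below are illustrative, not the paper's prescriptions.

        import numpy as np

        def sieve_bootstrap_mean(x, p=5, n_boot=500, rng=None):
            """AR(p) sieve bootstrap distribution of the sample mean (rough sketch)."""
            rng = np.random.default_rng(rng)
            x = np.asarray(x, dtype=float)
            n = len(x)
            xc = x - x.mean()
            # fit AR(p) by least squares: xc[t] on xc[t-1], ..., xc[t-p]
            Z = np.column_stack([xc[p - k - 1:n - k - 1] for k in range(p)])
            phi, *_ = np.linalg.lstsq(Z, xc[p:], rcond=None)
            resid = xc[p:] - Z @ phi
            resid = resid - resid.mean()                 # centred residuals
            means = np.empty(n_boot)
            for b in range(n_boot):
                e = rng.choice(resid, size=n + 100, replace=True)
                xb = np.zeros(n + 100)
                for t in range(p, n + 100):
                    xb[t] = phi @ xb[t - p:t][::-1] + e[t]
                means[b] = x.mean() + xb[100:].mean()    # discard burn-in
            return means

        # usage: bootstrap standard error of the mean of a simulated AR(1) series
        rng = np.random.default_rng(0)
        x = np.zeros(400)
        for t in range(1, 400):
            x[t] = 0.5 * x[t - 1] + rng.standard_normal()
        print(sieve_bootstrap_mean(x, p=4).std())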

    Extreme events from the return-volume process: a discretization approach for complexity reduction

    We propose the discretization of real-valued financial time series into few ordinal values and use sparse Markov chains within the framework of generalized linear models for such categorical time series. The discretization operation causes a large reduction in the complexity of the data. We analyse daily return and volume data and estimate the probability structure of the process of lower extreme, upper extreme and the complementary usual events. Knowing the whole probability law of such ordinal-valued vector processes of extreme events of return and volume allows us to quantify non-linear associations. In particular, we find a new kind of asymmetry in the return-volume relationship. Estimated probabilities are also used to compute the MAP predictor, whose power is found to be remarkably high.
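
    A simplified univariate sketch of the discretization step and the count-based estimate of a first-order Markov chain (the paper works with the joint return-volume process and a sparse GLM parameterization); the 10%/90% thresholds and simulated data are placeholders.

        import numpy as np

        def discretize(x, lower_q=0.1, upper_q=0.9):
            """Map a real-valued series to ordinal states 0/1/2 (lower extreme / usual / upper extreme)."""
            lo, hi = np.quantile(x, [lower_q, upper_q])
            return np.where(x <= lo, 0, np.where(x >= hi, 2, 1))

        def transition_matrix(states, n_states=3):
            """First-order Markov transition probabilities estimated from counts."""
            counts = np.zeros((n_states, n_states))
            for s, t in zip(states[:-1], states[1:]):
                counts[s, t] += 1
            rows = counts.sum(axis=1, keepdims=True)
            return counts / np.where(rows == 0, 1.0, rows)

        # usage with simulated "returns"
        rng = np.random.default_rng(0)
        ret = rng.standard_normal(2000)
        states = discretize(ret)
        print(transition_matrix(states).round(3))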

    Empirical Modeling of Extreme Events from Return-Volume Time Series in Stock Market

    We propose the discretization of real-valued financial time series into few ordinal values and use non-linear likelihood modeling for sparse Markov chains within the framework of generalized linear models for categorical time series. We analyze daily return and volume data and estimate the probability structure of the process of extreme lower, extreme upper and the complementary usual events. Knowing the whole probability law of such ordinal-valued vector processes of extreme events of return and volume allows us to quantify non-linear associations. In particular, we find a (new kind of) asymmetry in the return-volume relationship which is a partial answer to a research issue posed by Karpoff (1987). We also propose a simple prediction algorithm which is based on an empirically selected model.
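
    Continuing the sketch above, and again not the authors' code: given estimated transition probabilities, the MAP predictor simply returns the most probable next state; the numbers in the example matrix are made up.

        import numpy as np

        def map_predict(trans, current_state):
            """MAP one-step-ahead prediction: the most probable next state given the current one."""
            return int(np.argmax(trans[current_state]))

        # usage with a transition matrix as estimated in the previous sketch (illustrative numbers)
        trans = np.array([[0.12, 0.78, 0.10],
                          [0.10, 0.80, 0.10],
                          [0.09, 0.79, 0.12]])
        print(map_predict(trans, current_state=0))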

    Volatility and risk estimation with linear and nonlinear methods based on high frequency data

    Accurate volatility predictions are crucial for the successful implementation of risk management. The use of high frequency data approximately renders volatility from a latent to an observable quantity, and opens new directions to forecast future volatilities. The goals in this paper are: (i) to select an accurate forecasting procedure for predicting volatilities based on high frequency data from various standard models and modern prediction tools; (ii) to evaluate the predictive potential of those volatility forecasts for both the realized and the true latent volatility; and (iii) to quantify the differences between volatility forecasts based on high frequency data and those from a GARCH model for low frequency (e.g. daily) data, and to study the implications for risk management with two widely used risk measures. The pay-off from using high frequency data for the true latent volatility is empirically found to be still present, but of smaller magnitude than suggested by simple analysis.
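
    As a rough illustration of the high-frequency ingredient only, not the paper's forecasting models: realized variance as the sum of squared intraday returns, followed by a naive AR(1) forecast of log realized variance; the simulated five-minute returns and all parameters are placeholders.

        import numpy as np

        def realized_variance(intraday_returns):
            """Daily realized variance: sum of squared intraday returns."""
            return np.sum(np.asarray(intraday_returns) ** 2, axis=-1)

        def ar1_forecast(y):
            """One-step forecast from an AR(1) fitted by least squares (with intercept)."""
            Z = np.column_stack([np.ones(len(y) - 1), y[:-1]])
            coef, *_ = np.linalg.lstsq(Z, y[1:], rcond=None)
            return coef[0] + coef[1] * y[-1]

        # usage: simulate 250 days x 78 five-minute returns, forecast next day's realized variance
        rng = np.random.default_rng(0)
        intraday = 0.001 * rng.standard_normal((250, 78))
        log_rv = np.log(realized_variance(intraday))
        print(np.exp(ar1_forecast(log_rv)))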

    Explaining Bagging

    Bagging is one of the most effective computationally intensive procedures to improve on unstable estimators or classifiers, useful especially for high-dimensional data problems. Here we formalize the notion of instability and derive theoretical results to explain a variance reduction effect of bagging (or its variant) in hard decision problems, which include estimation after testing in regression and decision trees for continuous regression functions and classifiers. Hard decisions create instability, and bagging is shown to smooth such hard decisions, yielding smaller variance and mean squared error. With theoretical explanations, we motivate subagging based on subsampling as an alternative aggregation scheme. It is computationally cheaper but still shows approximately the same accuracy as bagging. Moreover, our theory reveals improvements in first order and is in line with simulation studies; in contrast with the second-order explanation of Friedman and Hall (2000) for smooth functional...
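
    A small sketch of subagging with a regression tree, assuming scikit-learn is available; the subsample fraction, tree depth and number of aggregations are illustrative choices, and a bagging variant would instead resample n points with replacement.

        import numpy as np
        from sklearn.tree import DecisionTreeRegressor

        def subagging_predict(X, y, X_new, frac=0.5, n_agg=100, rng=None):
            """Average tree predictions over many subsamples drawn without replacement."""
            rng = np.random.default_rng(rng)
            n = len(y)
            m = int(frac * n)
            preds = np.zeros(len(X_new))
            for _ in range(n_agg):
                idx = rng.choice(n, size=m, replace=False)   # subsample, not bootstrap
                tree = DecisionTreeRegressor(max_depth=3).fit(X[idx], y[idx])
                preds += tree.predict(X_new)
            return preds / n_agg

        # usage on a noisy sine curve
        rng = np.random.default_rng(0)
        X = np.sort(rng.uniform(0, 6, size=200))[:, None]
        y = np.sin(X[:, 0]) + 0.3 * rng.standard_normal(200)
        print(subagging_predict(X, y, X_new=np.array([[1.0], [3.0]])))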