396 research outputs found
A unified approach to structural change tests based on F statistics, OLS residuals, and ML scores
Three classes of structural change tests (or tests for parameter instability) which have been receiving much attention in both the statistics and econometrics communities but have been developed in rather loosely connected lines of research are unified by embedding them into the framework of generalized M-fluctuation tests (Zeileis and Hornik, 2003). These classes are tests based on F statistics (supF, aveF, expF tests), on OLS residuals (OLS-based CUSUM and MOSUM tests) and on maximum likelihood scores (including the Nyblom-Hansen test). We show that (representatives from) these classes are special cases of the generalized M-fluctuation tests, based on the same functional central limit theorem, but employing different functionals for capturing excessive fluctuations. After embedding these tests into the same framework and thus understanding the relationship between these procedures for testing in historical samples, it is shown how the tests can also be extended to a monitoring situation. This is achieved by establishing a general M-fluctuation monitoring procedure and then applying the different functionals corresponding to monitoring with F statistics, OLS residuals and ML scores. In particular, an extension of the supF test to a monitoring scenario is suggested and illustrated on a real-world data set. Series: Research Report Series / Department of Statistics and Mathematic
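As a rough illustration of one of the functionals mentioned in this abstract, the following sketch computes an OLS-based CUSUM process and its supremum statistic. It assumes the simplest possible case, an intercept-only model, so the OLS residuals are just deviations from the sample mean. The function name `ols_cusum` and the toy series are hypothetical and not taken from the paper.

```python
import math

def ols_cusum(y):
    """OLS-based CUSUM path and sup statistic for an intercept-only model (illustrative)."""
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]                 # OLS residuals of y on a constant
    sigma = math.sqrt(sum(r * r for r in resid) / (n - 1))
    path, total = [], 0.0
    for r in resid:
        total += r                                # cumulative sum of residuals
        path.append(total / (sigma * math.sqrt(n)))
    return path, max(abs(p) for p in path)

# Hypothetical toy data: a stable series and one with a mean shift halfway through.
stable = [1.0, -1.0] * 20
shifted = [1.0, -1.0] * 10 + [4.0, 2.0] * 10

path_stable, stat_stable = ols_cusum(stable)
path_shifted, stat_shifted = ols_cusum(shifted)
print(stat_stable, stat_shifted)
```

Under the null hypothesis of stability the scaled path behaves approximately like a Brownian bridge, so the supremum statistic can be compared with the usual 5% critical value of about 1.358. The shifted series exceeds it, while the stable series stays well below.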
Econometric Computing with HC and HAC Covariance Matrix Estimators
Data described by econometric models typically contains autocorrelation and/or heteroskedasticity of unknown form and for inference in such models it is essential to use covariance matrix estimators that can consistently estimate the covariance of the model parameters. Hence, suitable heteroskedasticity consistent (HC) and heteroskedasticity and autocorrelation consistent (HAC) estimators have been receiving attention in the econometric literature over the last 20 years. To apply these estimators in practice, an implementation is needed that preferably translates the conceptual properties of the underlying theoretical frameworks into computational tools. In this paper, such an implementation in the package sandwich in the R system for statistical computing is described and it is shown how the suggested functions provide reusable components that build on readily existing functionality and how they can be integrated easily into new inferential procedures or applications. The toolbox contained in sandwich is extremely flexible and comprehensive, including specific functions for the most important HC and HAC estimators from the econometric literature. Several real-world data sets are used to illustrate how the functionality can be integrated into applications.
Object-oriented Computation of Sandwich Estimators
Sandwich covariance matrix estimators are a popular tool in applied regression modeling for performing inference that is robust to certain types of model misspecification. Suitable implementations are available in the R system for statistical computing for certain model fitting functions only (in particular lm()), but not for other standard regression functions, such as glm(), nls(), or survreg(). Therefore, conceptual tools and their translation to computational tools in the package sandwich are discussed, enabling the computation of sandwich estimators in general parametric models. Object orientation can be achieved by providing a few extractor functions (most importantly for the empirical estimating functions) from which various types of sandwich estimators can be computed.
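To make the "bread, meat, bread" construction concrete, here is a minimal sketch of the basic HC0 sandwich estimator for a simple linear regression, built from the empirical estimating functions (residual times regressor). It is written in plain Python and does not use the sandwich package's R interface. The function name `hc0_simple_regression` and the data are hypothetical.

```python
def hc0_simple_regression(x, y):
    """OLS fit of y = b0 + b1*x with an HC0 sandwich covariance (illustrative sketch)."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx
    # Bread: (X'X)^{-1} for the design matrix with columns (1, x).
    bread = [[sxx / det, -sx / det], [-sx / det, n / det]]
    b0 = bread[0][0] * sy + bread[0][1] * sxy
    b1 = bread[1][0] * sy + bread[1][1] * sxy
    # Empirical estimating functions: residual times each regressor.
    scores = [(yi - b0 - b1 * xi, (yi - b0 - b1 * xi) * xi) for xi, yi in zip(x, y)]
    # Meat: sum of outer products of the estimating functions.
    meat = [[sum(s[i] * s[j] for s in scores) for j in range(2)] for i in range(2)]

    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    return (b0, b1), scores, matmul(matmul(bread, meat), bread)

# Hypothetical toy data.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 0.9, 2.3, 2.8, 4.4]
coef, scores, vcov = hc0_simple_regression(x, y)
print(coef, vcov)
```

Other HC variants differ only in how the residuals entering the meat are rescaled, which is why exposing the estimating functions through a single extractor is enough to build several estimators.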
Reproducible Econometric Research. A Critical Review of the State of the Art.
Recent software developments are reviewed from the vantage point of reproducible econometric research. We argue that the emergence of new tools, particularly in the open-source community, has greatly eased the burden of documenting and archiving both empirical and simulation work in econometrics. Some of these tools are highlighted in the discussion of three small replication exercises. Series: Research Report Series / Department of Statistics and Mathematic
Danger: High Power! – Exploring the Statistical Properties of a Test for Random Forest Variable Importance
Random forests have become a widely-used predictive model in many scientific disciplines within the past few years. Additionally, they are increasingly popular for assessing variable importance, e.g., in genetics and bioinformatics. We highlight both advantages and limitations of different variable importance scores and associated testing procedures, especially in the context of correlated predictor variables. For the test of Breiman and Cutler (2008), we investigate the statistical properties and find that the power of the test depends both on the sample size and the number of trees, an arbitrarily chosen tuning parameter, leading to undesired results that nullify any significance judgments. Moreover, the specification of the null hypothesis of this test is discussed in the context of correlated predictor variables.
Extended Model Formulas in R. Multiple Parts and Multiple Responses.
Model formulas are the standard approach for specifying the variables in statistical models in the S language. Although being eminently useful in an extremely wide class of applications, they have certain limitations including being confined to single responses and not providing convenient support for processing formulas with multiple parts. The latter is relevant for models with two or more sets of variables, e.g., regressors/instruments in instrumental variable regressions, two-part models such as hurdle models, or alternative-specific and individual-specific variables in choice models among many others. The R package Formula addresses these two problems by providing a new class "Formula" (inheriting from "formula") that accepts an additional formula operator | separating multiple parts and by allowing all formula operators (including the new |) on the left-hand side to support multiple responses. Series: Research Report Series / Department of Statistics and Mathematic
Extended Model Formulas in R: Multiple Parts and Multiple Responses
Model formulas are the standard approach for specifying the variables in statistical models in the S language. Although being eminently useful in an extremely wide class of applications, they have certain limitations including being confined to single responses and not providing convenient support for processing formulas with multiple parts. The latter is relevant for models with two or more sets of variables, e.g., different equations for different model parameters (such as mean and dispersion), regressors and instruments in instrumental variable regressions, two-part models such as hurdle models, or alternative-specific and individual-specific variables in choice models among many others. The R package Formula addresses these two problems by providing a new class "Formula" (inheriting from "formula") that accepts an additional formula operator | separating multiple parts and by allowing all formula operators (including the new |) on the left-hand side to support multiple responses.
Structural Breaks in Inflation Dynamics within the European Monetary Union
To assess the effects of the EMU on inflation rate dynamics of its member states, the inflation rate series for 21 European countries are investigated for structural changes. To capture changes in mean, variance, and skewness of inflation rates, a generalized logistic model is adopted and complemented with structural break tests and breakpoint estimation techniques. These reveal considerable differences in the patterns of inflation dynamics and the structural changes therein. Overall, there is a convergence towards a lower mean inflation rate with reduced skewness, but it is accompanied by an increase in variance. Keywords: inflation rate, structural break, EMU, generalized logistic distribution
Validating multiple structural change models: A case study
In a recent article, Bai and Perron (2003, Journal of Applied Econometrics) present a comprehensive discussion of computational aspects of multiple structural change models along with several empirical examples. Here, we report on the results of a replication study using the R statistical software package. We are able to verify most of their findings; however, some confidence intervals associated with breakpoints cannot be reproduced. These confidence intervals require computation of the quantiles of a nonstandard distribution, the distribution of the argmax functional of a certain stochastic process. Interestingly, the difficulties appear to be due to numerical problems in GAUSS, the software package used by Bai and Perron. Keywords: structural change, breakpoints, econometric software, numerical accuracy, reproducibility, R, GAUSS