Combining information in statistical modelling
How to combine information from different sources is becoming an important area of statistical research under the name of meta-analysis. This paper shows that the estimation of a parameter or the forecast of a random variable can also be seen as a process of combining information. It is shown that this approach can provide some useful insights into the robustness properties of some statistical procedures, and it also allows the comparison of statistical models within a common framework. Some general combining rules are illustrated using examples from ANOVA, diagnostics in regression, time series forecasting, missing value estimation and recursive estimation using the Kalman filter.
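The basic fixed-effect combining rule from meta-analysis, precision weighting of independent unbiased estimates, can be sketched as follows (a minimal illustration of the general idea, not the paper's specific combining rules):

```python
import numpy as np

def combine(estimates, variances):
    """Precision-weighted combination of independent unbiased estimates:
    the basic fixed-effect combining rule used in meta-analysis."""
    w = 1.0 / np.asarray(variances, dtype=float)   # precision = inverse variance
    est = np.sum(w * np.asarray(estimates, dtype=float)) / np.sum(w)
    var = 1.0 / np.sum(w)                          # variance of the pooled estimate
    return est, var

# two hypothetical studies estimating the same effect
est, var = combine([1.2, 0.8], [0.04, 0.01])
print(est, var)   # the pooled estimate leans toward the more precise study
```

The pooled variance is never larger than the smallest input variance, which is the sense in which combining sources adds information.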
Measuring influence in dynamic regression models
This article presents a methodology for building measures of influence in regression models with time series data. We introduce statistics that measure the influence of each observation on the parameter estimates and on the forecasts. These statistics take into account the autocorrelation of the sample. The first statistic can be decomposed to measure the change in the univariate ARIMA parameters, the transfer function parameters and the interaction between both. For independent data they reduce to the D statistics considered by Cook in the standard regression model. These statistics can be easily computed using standard time series software. Their performance is analyzed in an example in which they seem useful for identifying important events, such as additive outliers and trend shifts, in time series data.
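For independent data the proposed statistics reduce to Cook's D. A minimal numpy sketch of Cook's distance in standard regression (the independent-data special case, not the article's time-series extension) is:

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's D for each observation of an OLS fit y = X @ beta + e:
    D_i = r_i^2 * h_ii / (p * s^2 * (1 - h_ii)^2),
    with leverages h_ii and raw residuals r_i."""
    n, p = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T     # hat matrix
    h = np.diag(H)                           # leverages
    resid = y - H @ y                        # OLS residuals
    s2 = resid @ resid / (n - p)             # residual variance estimate
    return resid**2 * h / (p * s2 * (1 - h)**2)

# toy example: a straight-line fit with one distorted high-leverage point
rng = np.random.default_rng(0)
x = np.arange(20, dtype=float)
y = 2.0 + 0.5 * x + rng.normal(scale=0.1, size=20)
y[19] += 5.0                                 # shift the last observation
X = np.column_stack([np.ones(20), x])
D = cooks_distance(X, y)
print(D.argmax())                            # the distorted observation dominates
```

Influence combines residual size and leverage, which is why the shifted end point, not merely a large residual in the middle, stands out.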
Forecasting growth with time series models
This paper compares the structure of three models for estimating future growth in a time series. It is shown that a regression model gives minimum weight to the last observed growth and maximum weight to the observed growth in the middle of the sample period. A first-order integrated ARIMA model, or I(1) model, gives uniform weights to all observed growths. Finally, a second-order integrated ARIMA model gives maximum weight to the last observed growth and minimum weight to the observed growths at the beginning of the sample period.
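The first two weighting patterns can be checked numerically. The sketch below (pure numpy, an assumed series length of 11) rewrites the OLS trend slope as a weighted sum of observed growths (first differences) and compares it with the uniform weights implied by the I(1) drift estimate, which is just the mean of the differences:

```python
import numpy as np

n = 11                                   # series length, so n - 1 = 10 observed growths
t = np.arange(1, n + 1, dtype=float)

# OLS trend slope: beta = sum_t w_t * y_t with w_t = (t - tbar) / Sxx
w = (t - t.mean()) / ((t - t.mean())**2).sum()

# Writing y_t = y_1 + d_2 + ... + d_t, the implied weight on growth d_s
# is the tail sum of the w_t for t >= s
growth_w_ols = np.array([w[s:].sum() for s in range(1, n)])

# I(1) with drift: the drift estimate is the mean of the differences
growth_w_i1 = np.full(n - 1, 1.0 / (n - 1))

print(np.round(growth_w_ols, 3))   # hump-shaped: largest for mid-sample growths
print(growth_w_i1)                 # flat across all growths
```

The tail sums of the trend weights peak in the middle of the sample and shrink toward both ends, matching the pattern the abstract describes for the regression model.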
Comparing probabilistic methods for outlier detection
This paper compares the use of two posterior probability methods to deal with outliers in linear models. We show that combining diagnostics that come from the mean-shift and variance-shift models yields a procedure that seems to be more effective than using probabilities computed from the posterior distributions of the actual realized residuals. The relation of the suggested procedure to the use of a certain predictive distribution for diagnostics is derived.
Multivariate analysis in vector time series
This paper reviews the applications of classical multivariate techniques for discrimination, clustering and dimension reduction for time series data. It is shown that the discrimination problem can be seen as a model selection problem. Some of the results obtained in the time domain are reviewed. Clustering time series requires the definition of an adequate metric between univariate time series, and several possible metrics are analyzed. Dimension reduction has been a very active line of research in the time series literature, and the dynamic principal components or canonical analysis of Box and Tiao (1977) and the factor model as developed by Peña and Box (1987) and Peña and Poncela (1998) are analyzed. The relation between the nonstationary factor model and the cointegration literature is also reviewed.
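One simple metric of the kind discussed for clustering univariate series is the Euclidean distance between sample autocorrelation vectors. The sketch below is an illustration under that assumption, not one of the specific metrics the paper analyzes:

```python
import numpy as np

def acf(x, nlags=10):
    """Sample autocorrelations r_1..r_nlags of a univariate series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = x @ x
    return np.array([x[:-k] @ x[k:] / denom for k in range(1, nlags + 1)])

def acf_distance(x, y, nlags=10):
    """Euclidean distance between autocorrelation vectors: one simple
    dissimilarity measure for clustering time series."""
    return np.linalg.norm(acf(x, nlags) - acf(y, nlags))

# three simulated AR(1) series: two with similar dynamics, one very different
rng = np.random.default_rng(1)
e = rng.normal(size=(3, 400))
phi = [0.9, 0.85, -0.8]
ar = np.zeros((3, 400))
for i in range(3):
    for t in range(1, 400):
        ar[i, t] = phi[i] * ar[i, t - 1] + e[i, t]

d_similar = acf_distance(ar[0], ar[1])
d_different = acf_distance(ar[0], ar[2])
print(d_similar < d_different)   # True: the metric groups the two phi ~ 0.9 series
```

A clustering algorithm fed this distance matrix would put the two persistent series together, which is the behaviour one wants from a dynamics-based metric.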
Gibbs sampling will fail in outlier problems with strong masking
This paper discusses the convergence of the Gibbs sampling algorithm when it is applied to the problem of outlier detection in regression models. Given any vector of initial conditions, the algorithm theoretically converges to the true posterior distribution. However, convergence may slow down in a high-dimensional parameter space where the parameters are highly correlated. We show that the effect of leverage in regression models makes convergence of the Gibbs sampling algorithm very difficult in data sets with strong masking. The problem is illustrated in several examples.
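The mechanism cited above, slow mixing when parameters are highly correlated, can be seen in a toy Gibbs sampler for a bivariate normal (an illustration of the general phenomenon only, not the paper's regression outlier model):

```python
import numpy as np

def gibbs_bivariate_normal(rho, n_iter=5000, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho,
    using the exact conditionals x | y ~ N(rho*y, 1 - rho^2) and vice versa."""
    rng = np.random.default_rng(seed)
    x = y = 0.0
    xs = np.empty(n_iter)
    for i in range(n_iter):
        x = rng.normal(rho * y, np.sqrt(1 - rho**2))
        y = rng.normal(rho * x, np.sqrt(1 - rho**2))
        xs[i] = x
    return xs

def lag1_autocorr(z):
    z = z - z.mean()
    return (z[:-1] @ z[1:]) / (z @ z)

slow = lag1_autocorr(gibbs_bivariate_normal(rho=0.99))
fast = lag1_autocorr(gibbs_bivariate_normal(rho=0.1))
print(slow, fast)   # near-perfect chain autocorrelation when rho is high
```

With correlation 0.99 the chain's lag-1 autocorrelation is close to 1, so successive draws carry almost no new information; the same pathology, induced by leverage and masking, is what defeats the sampler in the outlier problem.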
On Bayesian robustness: an asymptotic approach
This paper presents a new asymptotic approach to studying the robustness of Bayesian inference to changes in the prior distribution. We study the robustness of the posterior density score function when the uncertainty about the prior distribution has been restated as a problem of uncertainty about the model parametrization. Classical robustness tools, such as the influence function and the maximum bias function, are defined for uniparametric models and calculated for the location case. Possible extensions to other models are also briefly discussed.
Interpolation, outliers and inverse autocorrelations
The paper addresses the problem of estimating missing observations in linear, possibly nonstationary, stochastic processes when the model is known. The general case of any possible distribution of missing observations in the time series is considered, and analytical expressions for the optimal estimators and their associated mean squared errors are obtained. These expressions involve solely the elements of the inverse or dual autocorrelation function of the series.
This optimal estimator, the conditional expectation of the missing observations given the available ones, is equal to the estimator that results from filling the missing values in the series with arbitrary numbers, treating these numbers as additive outliers, and removing the outlier effects from the invented numbers using intervention analysis.
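For a zero-mean AR(1) the inverse autocorrelation function vanishes beyond lag 1, so the optimal single-point interpolator reduces to a two-sided weighted average of the neighbours. A minimal sketch under that assumption:

```python
import numpy as np

def interpolate_ar1(y, t, phi):
    """Optimal single-point interpolator for a zero-mean AR(1):
    the inverse autocorrelation is -phi/(1+phi^2) at lag 1 and zero beyond,
    so yhat_t = phi/(1+phi^2) * (y[t-1] + y[t+1])."""
    return phi / (1 + phi**2) * (y[t - 1] + y[t + 1])

# simulate an AR(1) with known parameter
rng = np.random.default_rng(2)
phi, n = 0.8, 2000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

# treat well-separated interior points as missing, one at a time, and compare
# the two-sided interpolator with the one-sided forecast phi * y[t-1]
idx = np.arange(10, n - 10, 20)
err_interp = np.array([y[t] - interpolate_ar1(y, t, phi) for t in idx])
err_fcst = np.array([y[t] - phi * y[t - 1] for t in idx])
print(np.mean(err_interp**2), np.mean(err_fcst**2))
```

Using observations on both sides of the gap reduces the mean squared error below the one-step forecast's innovation variance, consistent with the role the inverse autocorrelations play in the general expressions.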
Model selection criteria and quadratic discrimination in ARMA and SETAR time series models
We show that analyzing model selection in ARMA time series models as a quadratic discrimination problem provides a unifying approach for deriving model selection criteria. This approach also suggests a definition of expected likelihood different from the one proposed by Akaike, and it leads to including a correction term in the criteria which does not modify their large-sample performance but can produce significant improvements in small samples. We thus propose a family of criteria which generalizes the commonly used model selection criteria. These ideas can be extended to self-exciting threshold autoregressive (SETAR) models, and we generalize the proposed approach to these nonlinear time series models. A Monte Carlo study shows that this family improves the finite-sample performance of criteria such as AIC, corrected AIC and BIC for ARMA models, and AIC, corrected AIC, BIC and some cross-validation criteria for SETAR models. In particular, for small and medium sample sizes the frequency of selecting the true model improves for the consistent criteria and the root mean square error of prediction improves for the efficient criteria. These results are obtained for both linear ARMA models and SETAR models in which we assume that the threshold and the parameters are unknown.
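The standard criteria mentioned above are easy to compare on a concrete autoregressive example. The sketch below (conditional least squares on a common sample, an assumed AR(2) data-generating process, not the paper's generalized family) computes AIC, the small-sample corrected AICc, and BIC for a range of orders:

```python
import numpy as np

def ar_criteria(y, pmax=6):
    """Fit AR(0..pmax) by conditional least squares on a common sample and
    return (order, AIC, AICc, BIC) for each order."""
    n = len(y)
    t0 = pmax                                # condition on the first pmax values
    m = n - t0
    target = y[t0:]
    out = []
    for p in range(pmax + 1):
        X = np.column_stack([np.ones(m)] + [y[t0 - j : n - j] for j in range(1, p + 1)])
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        sigma2 = resid @ resid / m           # Gaussian MLE of the innovation variance
        ll = -0.5 * m * (np.log(2 * np.pi * sigma2) + 1)
        k = p + 2                            # intercept, AR coefficients, variance
        aic = -2 * ll + 2 * k
        aicc = aic + 2 * k * (k + 1) / (m - k - 1)   # small-sample correction
        bic = -2 * ll + k * np.log(m)
        out.append((p, aic, aicc, bic))
    return out

# simulate a true AR(2)
rng = np.random.default_rng(3)
n = 400
e = rng.normal(size=n)
y = np.zeros(n)
for t in range(2, n):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + e[t]

crit = ar_criteria(y)
aic_order = min(crit, key=lambda r: r[1])[0]
bic_order = min(crit, key=lambda r: r[3])[0]
print(aic_order, bic_order)   # BIC's heavier penalty never selects a larger order
```

Because all orders are scored on the same conditional sample, the penalties are directly comparable: BIC's log(m) penalty exceeds AIC's 2, so the BIC choice is always at most the AIC choice, and the AICc correction only matters when the sample is small relative to the parameter count.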