
    A procedure for robust estimation and diagnostics in regression

    We propose a new procedure for computing an approximation to regression estimates based on the minimization of a robust scale. The procedure can be applied with a large number of independent variables, where the usual methods based on resampling would require infeasible or extremely costly computing time. An important advantage of the procedure is that it can be incorporated into any high-breakdown procedure and improve it with just a few seconds of computing time. The procedure minimizes the robust scale over a set of tentative parameter vectors. Each of these parameter vectors is obtained as follows. We represent each data point by the vector of changes in the least squares forecasts of that observation when each of the observations is deleted. Then the sets of possible outliers are obtained as the extreme points of the principal components of these vectors, or as the set of points with large residuals. The good performance of the procedure allows the identification of multiple outliers while avoiding masking effects. The efficiency of the procedure for robust estimation and its power as an outlier detection tool are investigated in a simulation study and some examples.
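
    The candidate-generation step described above lends itself to a short numerical sketch. The code below is only an illustrative reading of that description, assuming plain least-squares refits, MAD as the robust scale, and made-up names and set sizes; it is not the authors' implementation. It builds, for each observation, the vector of changes in the least-squares fitted values caused by deleting that observation, takes principal components of those vectors, treats the extreme points on each component as a tentative outlier set, and keeps the refit whose residuals have the smallest robust scale.

```python
import numpy as np

def mad_scale(residuals):
    """Robust (MAD-based) scale of a residual vector."""
    return 1.4826 * np.median(np.abs(residuals - np.median(residuals)))

def candidate_outlier_fit(X, y, n_extreme=3):
    """Sketch: leave-one-out changes in LS forecasts -> principal components
    -> extreme points as tentative outlier sets -> keep the refit whose
    residuals have the smallest robust scale."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])                  # add intercept
    beta_full = np.linalg.lstsq(Xd, y, rcond=None)[0]
    fit_full = Xd @ beta_full

    # Row i holds the change in all n fitted values when observation i is deleted.
    D = np.empty((n, n))
    for i in range(n):
        keep = np.arange(n) != i
        b_i = np.linalg.lstsq(Xd[keep], y[keep], rcond=None)[0]
        D[i] = Xd @ b_i - fit_full

    # Principal components of the delete-one effect vectors.
    Dc = D - D.mean(axis=0)
    _, _, Vt = np.linalg.svd(Dc, full_matrices=False)
    scores = Dc @ Vt[: min(p + 1, n)].T

    best_beta, best_scale = beta_full, mad_scale(y - fit_full)
    for comp in scores.T:
        # Extreme points on this component form one tentative outlier set.
        suspects = np.argsort(np.abs(comp))[-n_extreme:]
        keep = np.setdiff1d(np.arange(n), suspects)
        b = np.linalg.lstsq(Xd[keep], y[keep], rcond=None)[0]
        scale = mad_scale(y - Xd @ b)                      # scale over all points
        if scale < best_scale:
            best_beta, best_scale = b, scale
    return best_beta, best_scale
```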

    Common large innovations across nonlinear time series

    We propose a multivariate nonlinear econometric time series model, which can be used to examine whether there is common nonlinearity across economic variables. The model is a multivariate censored latent effects autoregression. The key feature of this model is that nonlinearity appears as separate innovation-like variables. Common nonlinearity can then be easily defined as the presence of common innovations. We discuss representation, inference, estimation and diagnostics. We illustrate the model for US and Canadian unemployment and find that US innovation variables have an effect on Canadian unemployment, but not the other way around, and that there is no common nonlinearity across the unemployment variables.
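
    A minimal simulation sketch, assuming a censored latent innovation shared by two autoregressive series, may help fix ideas; the parameter values, the bivariate setup, and the variable names below are illustrative assumptions rather than the specification estimated in the paper. Nonlinearity enters as an innovation-like variable that is zero most of the time and occasionally positive, and common nonlinearity corresponds to both series loading on the same such variable.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 500

# Censored latent effect: usually zero, occasionally positive (illustrative values).
w_star = rng.normal(loc=-1.0, scale=1.0, size=T)
w = np.maximum(w_star, 0.0)

# Two AR(1) series sharing the SAME censored innovation -> common nonlinearity.
y_us = np.zeros(T)
y_ca = np.zeros(T)
for t in range(1, T):
    y_us[t] = 0.7 * y_us[t - 1] + 0.8 * w[t] + rng.normal(scale=0.5)
    y_ca[t] = 0.5 * y_ca[t - 1] + 0.8 * w[t] + rng.normal(scale=0.5)

print("share of periods with an active latent effect:", (w > 0).mean())
```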

    Energy Consumption, Survey Data and the Prediction of Industrial Production in Italy

    We investigate the prediction of Italian industrial production. We first specify a model based on electricity consumption; we show that the cubic trend in such a model mostly captures the evolution over time of the electricity coefficient, which can be well approximated by a smooth transition model à la Teräsvirta, though with no gains in predictive power. We also analyze the performance of models based on data from different business surveys. According to basic statistics of forecasting accuracy, the linear energy-based model is not outperformed by any other single model, nor by a combination of forecasts. However, a more comprehensive set of evaluation criteria sheds light on the advantages of using all the available information. Overall, the best forecasting performance is achieved by estimating a combined model which includes both energy consumption and survey data among its regressors. Keywords: Italy, industrial production, energy.
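
    The remark about the cubic trend proxying a slowly changing electricity coefficient can be sketched with a logistic smooth-transition coefficient in the spirit of Teräsvirta. The functional form below is the standard logistic transition; the variable names, sample size, and parameter values are purely illustrative assumptions.

```python
import numpy as np

def logistic_transition(t, gamma, c):
    """Standard logistic transition function G(t; gamma, c), between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-gamma * (t - c)))

rng = np.random.default_rng(1)
T = 240                                         # e.g. 20 years of monthly data (assumed)
time = np.arange(T)
electricity = rng.normal(100.0, 5.0, size=T)    # stand-in for electricity consumption

# The electricity coefficient drifts smoothly over time; a cubic trend in a
# linear model would mostly mimic this drift.
beta_t = 0.8 + 0.4 * logistic_transition(time, gamma=0.05, c=T / 2)
industrial_production = 10.0 + beta_t * electricity + rng.normal(scale=2.0, size=T)
```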

    Forecasting Industrial Production in the Euro Area

    The creation of the euro area has increased the importance of obtaining timely information about short-term changes in the area's real activity. In this paper we propose a number of alternative short-term forecasting models, ranging from simple ARIMA models to more complex cointegrated VAR and conditional models, to forecast the index of industrial production in the euro area. A conditional error-correction model, in which the aggregate index of industrial production for the area is explained by the US industrial production index and the business confidence index from the European Commission harmonised survey on manufacturing firms, achieves the best forecasting performance.
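
    A hedged sketch of such a conditional error-correction regression, estimated in two steps by ordinary least squares, is given below; the variable names and the exact set of regressors are assumptions made for illustration, not the paper's preferred specification.

```python
import numpy as np

def fit_conditional_ecm(ip_ea, ip_us, confidence):
    """Two-step (Engle-Granger style) sketch of a conditional error-correction model:
    d(ip_ea)_t = const + alpha * (ip_ea - beta * ip_us)_{t-1}
                 + gamma * d(ip_us)_t + delta * confidence_{t-1} + error.
    Inputs are 1-D numpy arrays (e.g. log indices and a survey balance)."""
    # Step 1: long-run relation between the euro-area and US indices.
    X_long = np.column_stack([np.ones(len(ip_us)), ip_us])
    beta = np.linalg.lstsq(X_long, ip_ea, rcond=None)[0]
    ecm_term = ip_ea - X_long @ beta                       # equilibrium error

    # Step 2: short-run dynamics conditional on US IP and the confidence indicator.
    d_ea, d_us = np.diff(ip_ea), np.diff(ip_us)
    X_short = np.column_stack([np.ones(len(d_ea)), ecm_term[:-1], d_us, confidence[:-1]])
    coefs = np.linalg.lstsq(X_short, d_ea, rcond=None)[0]
    return beta, coefs
```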

    Coupled continuous time random walks in finance

    Continuous time random walks (CTRWs) are used in physics to model anomalous diffusion, by incorporating a random waiting time between particle jumps. In finance, the particle jumps are log-returns and the waiting times measure the delay between transactions. These two random variables (log-return and waiting time) are typically not independent. For these coupled CTRW models, we can now compute the limiting stochastic process (just as Brownian motion is the limit of a simple random walk), even in the case of heavy-tailed (power-law) price jumps and/or waiting times. The probability density functions for this limit process solve fractional partial differential equations. In some cases, these equations can be explicitly solved to yield descriptions of long-term price changes, based on a high-resolution model of individual trades that includes the statistical dependence between waiting times and the subsequent log-returns. In the heavy-tailed case, this involves operator stable space-time random vectors that generalize the familiar stable models. In this paper, we will review the fundamental theory and present two applications with tick-by-tick stock and futures data.
    Comment: 7 pages, 2 figures. Paper presented at the Econophysics Colloquium, Canberra, Australia, November 200
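
    A coupled CTRW is straightforward to simulate. The sketch below draws Pareto (power-law) waiting times and lets each log-return depend on the preceding waiting time, which is one simple way to induce the dependence discussed above; the coupling rule and all parameter values are our illustrative assumptions, not the model fitted to the tick data.

```python
import numpy as np

rng = np.random.default_rng(2)
n_trades = 10_000

# Heavy-tailed (Pareto) waiting times between trades; tail index 0.9 (assumed).
waits = rng.pareto(0.9, size=n_trades) + 1.0

# Couple each log-return to its waiting time: longer waits allow larger moves.
log_returns = 1e-3 * np.sqrt(waits) * rng.standard_normal(n_trades)

trade_times = np.cumsum(waits)        # physical time of each trade
log_price = np.cumsum(log_returns)    # CTRW sample path of the log-price

# The long-horizon behaviour of (trade_times, log_price) is what the limit
# theorems and fractional PDEs reviewed in the paper describe.
```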

    "Do the Innovations in a Monetary VAR Have Finite Variances?"

    Since Christopher Sims's "Macroeconomics and Reality" (1980), macroeconomists have used structural VARs, or vector autoregressions, for policy analysis. Constructing the impulse-response functions and variance decompositions that are central to this literature requires factoring the variance-covariance matrix of innovations from the VAR. This paper presents evidence consistent with the hypothesis that at least some elements of this matrix are infinite for one monetary VAR, as the innovations have stable, non-Gaussian distributions, with characteristic exponents ranging from 1.5504 to 1.7734 according to ML estimates. Hence, Cholesky and other factorizations that would normally be used to identify structural residuals from the VAR are impossible.
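
    Whether the innovations have finite variance comes down to the tail index of their distribution: for stable laws, a characteristic exponent below 2 implies infinite variance. One simple, generic way to probe the tails of VAR residuals is a Hill-type estimator, sketched below; this is an illustration under our own assumptions, not the maximum-likelihood stable fit used in the paper.

```python
import numpy as np

def hill_tail_index(residuals, k=None):
    """Hill estimator of the tail index from the k largest absolute residuals.
    Values well below 2 are consistent with infinite-variance (stable-like) tails."""
    x = np.sort(np.abs(np.asarray(residuals, dtype=float)))
    n = x.size
    if k is None:
        k = max(10, n // 10)                       # crude default choice of k
    return 1.0 / np.mean(np.log(x[-k:] / x[-k - 1]))

# Example with simulated heavy-tailed innovations (Student-t, 1.7 degrees of freedom):
rng = np.random.default_rng(3)
print(hill_tail_index(rng.standard_t(1.7, size=5000)))    # should land near 1.7
```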

    Knowing when you're wrong: Building fast and reliable approximate query processing systems

    Modern data analytics applications typically process massive amounts of data on clusters of tens, hundreds, or thousands of machines to support near-real-time decisions. The quantity of data and the limitations of disk and memory bandwidth often make it infeasible to deliver answers at interactive speeds. However, it has been widely observed that many applications can tolerate some degree of inaccuracy. This is especially true for exploratory queries on data, where users are satisfied with "close-enough" answers if they arrive quickly. A popular technique for speeding up queries at the cost of accuracy is to execute each query on a sample of the data, rather than on the whole dataset. To ensure that the returned result is not too inaccurate, past work on approximate query processing has used statistical techniques to estimate "error bars" on returned results. However, existing work in the sampling-based approximate query processing (S-AQP) community has not validated whether these techniques actually generate accurate error bars for real query workloads. In fact, we find that error bar estimation often fails on real-world production workloads. Fortunately, it is possible to quickly and accurately diagnose the failure of error estimation for a query. In this paper, we show that it is possible to implement a query approximation pipeline that produces approximate answers and reliable error bars at interactive speeds.
    Funders: National Science Foundation (U.S.) (CISE Expeditions Award CCF-1139158); Lawrence Berkeley National Laboratory (Award 7076018); United States Defense Advanced Research Projects Agency (XData Award FA8750-12-2-0331); Amazon.com; Google; SAP Corporation; Thomas and Stacey Siebel Foundation; Apple Computer, Inc.; Cisco Systems, Inc.; Cloudera, Inc.; EMC Corporation; Ericsson, Inc.; Facebook
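
    The core mechanics of returning a sample-based answer with an error bar can be sketched in a few lines: estimate an aggregate on a uniform sample, attach a closed-form (CLT) confidence interval, and cross-check it with a bootstrap as a rough diagnostic of whether the error bar itself can be trusted. The data, sample fraction, and threshold below are illustrative assumptions, not the paper's pipeline.

```python
import numpy as np

def approximate_mean(data, sample_frac=0.01, n_boot=200, seed=0):
    """Approximate AVG over a uniform sample with a CLT error bar, plus a
    bootstrap cross-check used as a rough reliability diagnostic."""
    rng = np.random.default_rng(seed)
    sample = rng.choice(data, size=max(2, int(len(data) * sample_frac)), replace=False)
    estimate = sample.mean()
    clt_err = 1.96 * sample.std(ddof=1) / np.sqrt(len(sample))

    # Bootstrap the sample mean and compare the two error estimates.
    boot_means = np.array([rng.choice(sample, size=len(sample), replace=True).mean()
                           for _ in range(n_boot)])
    boot_err = 1.96 * boot_means.std(ddof=1)
    suspect = abs(boot_err - clt_err) > 0.5 * clt_err   # crude "error bar may be off" flag
    return estimate, clt_err, boot_err, suspect

# Skewed synthetic data -- heavy skew is where naive error bars tend to fail.
data = np.random.default_rng(4).lognormal(mean=0.0, sigma=2.0, size=1_000_000)
print(approximate_mean(data))
```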