    Measuring influence in dynamic regression models

    This article presents a methodology for building measures of influence in regression models with time series data. We introduce statistics that measure the influence of each observation on the parameter estimates and on the forecasts. These statistics take into account the autocorrelation of the sample. The first statistic can be decomposed to measure the change in the univariate ARIMA parameters, the transfer function parameters, and the interaction between both. For independent data they reduce to the D statistics considered by Cook in the standard regression model. These statistics can be easily computed using standard time series software. Their performance is analyzed in an example in which they appear useful for identifying important events, such as additive outliers and trend shifts, in time series data.
    Keywords: Missing observations; Outliers; Intervention analysis; ARIMA models; Inverse autocorrelation function.
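
    For the independent-data case mentioned in the abstract, the proposed statistics reduce to Cook's D. The minimal sketch below computes the classical Cook's distance with statsmodels, purely to illustrate the baseline that the paper's time-series statistics generalize; the simulated data and the rule-of-thumb cutoff are illustrative assumptions, not taken from the paper.

```python
# Classical Cook's D for an ordinary regression (the independent-data
# case to which the paper's influence statistics reduce).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=0.5, size=n)
y[10] += 5.0                                # plant one additive outlier

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

# Cook's D_i measures the shift in the fitted values when
# observation i is deleted, scaled by p * s^2.
cooks_d, _ = fit.get_influence().cooks_distance

suspects = np.where(cooks_d > 4.0 / n)[0]   # common rule-of-thumb cutoff
print("high-influence observations:", suspects)
```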

    Outliers in dynamic factor models

    Dynamic factor models have a wide range of applications in econometrics and applied economics. The basic motivation resides in their capability of reducing a large set of time series to only a few indicators (factors). If the number of time series is large compared to the available number of observations, then most of the information may be conveyed to the factors. This way, low-dimension models may be estimated for explaining and forecasting one or more time series of interest. It is desirable that outlier-free time series be available for estimation. In practice, outlying observations are likely to arise at unknown dates due, for instance, to external unusual events or gross data-entry errors. Several methods for outlier detection in time series are available. Most methods, however, apply to univariate time series, and even methods designed for the multivariate framework do not include dynamic factor models explicitly. A method for discovering outlier occurrences in a dynamic factor model is introduced that is based on linear transforms of the observed data. Some strategies to separate outliers that add to the model from outliers within the common component are discussed. Applications to simulated and real data sets are presented to check the effectiveness of the proposed method. Published in the Electronic Journal of Statistics (http://dx.doi.org/10.1214/07-EJS082, http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org).
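
    The abstract does not spell out the linear transforms the method uses, so the sketch below is only a loose analogue under simplifying assumptions: it extracts factors by principal components and screens the idiosyncratic residuals for additive outliers with robust z-scores. It is not the authors' procedure, just a toy version of outlier screening in a factor structure.

```python
# Simplified illustration: extract factors by PCA and screen the
# idiosyncratic residuals for additive outliers with robust z-scores.
import numpy as np

rng = np.random.default_rng(1)
T, N, r = 200, 20, 2                       # T periods, N series, r factors

# Simulate a factor structure X = F L' + e, plus one additive outlier.
F = rng.normal(size=(T, r))
L = rng.normal(size=(N, r))
X = F @ L.T + 0.3 * rng.normal(size=(T, N))
X[50, 3] += 6.0                            # outlier in series 3 at t = 50

Xc = X - X.mean(axis=0)
# Principal-components estimate of the factor space.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
F_hat = U[:, :r] * s[:r]
L_hat = Vt[:r].T
resid = Xc - F_hat @ L_hat.T               # idiosyncratic component

# Robust z-score per series: (resid - median) / (1.4826 * MAD).
med = np.median(resid, axis=0)
mad = 1.4826 * np.median(np.abs(resid - med), axis=0)
z = (resid - med) / mad
print("flagged (t, series):", np.argwhere(np.abs(z) > 4))
```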

    SubCMap: subject and condition specific effect maps

    Current methods for statistical analysis of neuroimaging data identify condition-related structural alterations in the human brain by detecting group differences. They construct detailed maps showing population-wide changes due to a condition of interest. Although extremely useful, these methods do not provide information on subject-specific structural alterations, and they have limited diagnostic value because group assignments for each subject are required for the analysis. In this article, we propose SubCMap, a novel method to detect subject- and condition-specific structural alterations. SubCMap is designed to work without group assignment information in order to provide diagnostic value. Unlike outlier detection methods, SubCMap detections are condition-specific and can be used to study the effects of various conditions or for diagnosing diseases. The method combines techniques from classification, generalization error estimation and image restoration to identify the condition-related alterations. Experimental evaluation is performed on synthetically generated data as well as data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Results on synthetic data demonstrate the advantages of SubCMap compared to population-wide techniques and show higher detection accuracy compared to outlier detection. Analysis with the ADNI dataset shows that SubCMap detections on cortical thickness data correlate well with non-imaging markers of Alzheimer's Disease (AD), the Mini Mental State Examination score and cerebrospinal fluid amyloid-β levels, suggesting the proposed method captures the inter-subject variation of AD effects well.
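
    For reference, the population-wide baseline the abstract contrasts SubCMap with can be sketched as a mass-univariate group-difference map: a two-sample t-test per vertex with multiple-comparison correction. The data below are simulated placeholders; SubCMap itself is not reproduced here.

```python
# The population-wide baseline: a mass-univariate two-sample t-test
# per vertex, yielding one group-difference map rather than the
# subject-specific maps SubCMap aims for.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_per_group, n_vertices = 40, 1000
patients = rng.normal(2.5, 0.3, size=(n_per_group, n_vertices))
controls = rng.normal(2.5, 0.3, size=(n_per_group, n_vertices))
patients[:, :50] -= 0.4           # simulated cortical thinning in 50 vertices

t, p = stats.ttest_ind(patients, controls, axis=0)
effect_map = t * (p < 0.05 / n_vertices)   # Bonferroni-thresholded t-map
print("vertices surviving correction:", np.count_nonzero(effect_map))
```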

    A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets

    The term "outlier" can generally be defined as an observation that is significantly different from the other values in a data set. The outliers may be instances of error or indicate events. The task of outlier detection aims at identifying such outliers in order to improve the analysis of data and further discover interesting and useful knowledge about unusual events within numerous applications domains. In this paper, we report on contemporary unsupervised outlier detection techniques for multiple types of data sets and provide a comprehensive taxonomy framework and two decision trees to select the most suitable technique based on data set. Furthermore, we highlight the advantages, disadvantages and performance issues of each class of outlier detection techniques under this taxonomy framework

    A robust partial least squares method with applications

    Partial least squares (PLS) regression is a linear regression technique developed to relate many regressors to one or several response variables. Robust methods are introduced to reduce or remove the effect of outlying data points. In this paper we show that if the sample covariance matrix is properly robustified, further robustification of the linear regression steps of the PLS algorithm becomes unnecessary. The robust estimate of the covariance matrix is computed by searching for outliers in univariate projections of the data on a combination of random directions (Stahel-Donoho) and specific directions obtained by maximizing and minimizing the kurtosis coefficient of the projected data, as proposed by Peña and Prieto (2006). It is shown that this procedure is fast to apply and provides better results than other procedures proposed in the literature. Its performance is illustrated by Monte Carlo simulation and by an example, where the algorithm is able to reveal features of the data that were undetected by previous methods.
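
    A rough sketch of the projection-based screening step follows, under the simplifying assumption of random directions only; the kurtosis-maximizing and kurtosis-minimizing directions of Peña and Prieto (2006) are omitted, and the trimming cutoff and data are illustrative. The robustified covariance it produces is the ingredient that, per the paper, makes further robustification of the PLS regression steps unnecessary.

```python
# Stahel-Donoho-style outlyingness over random projections, followed
# by a covariance estimate on the clean subset. The specific
# kurtosis-based directions of Peña and Prieto (2006) are omitted.
import numpy as np

def outlyingness(X: np.ndarray, n_dirs: int = 500, seed: int = 0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    dirs = rng.normal(size=(n_dirs, p))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    proj = X @ dirs.T                                  # (n, n_dirs)
    med = np.median(proj, axis=0)
    mad = 1.4826 * np.median(np.abs(proj - med), axis=0)
    return np.max(np.abs(proj - med) / mad, axis=1)    # worst direction

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 10))
X[:5] += 8.0                                           # five gross outliers

d = outlyingness(X)
clean = X[d < np.quantile(d, 0.9)]                     # trim worst 10%
S_robust = np.cov(clean, rowvar=False)                 # robustified covariance
# S_robust would then replace the sample covariance inside the PLS steps.
print("flagged rows:", np.where(d >= np.quantile(d, 0.9))[0][:10])
```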