Measuring influence in dynamic regression models
This article presents a methodology for building measures of influence in regression models with time series data. We introduce statistics that measure the influence of each observation on the parameter estimates and on the forecasts. These statistics take into account the autocorrelation of the sample. The first statistic can be decomposed to measure the change in the univariate ARIMA parameters, the transfer function parameters, and the interaction between the two. For independent data they reduce to the D statistics considered by Cook in the standard regression model. These statistics can be easily computed using standard time series software. Their performance is analyzed in an example in which they seem useful for identifying important events, such as additive outliers and trend shifts, in time series data.
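For the independent-data special case mentioned in the abstract, Cook's D statistic can be sketched directly. This is a minimal plain-OLS illustration, not the article's dynamic-regression statistics; the helper name `cooks_distance` and the toy data are assumptions for the example:

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's D for each observation in an ordinary least squares fit.

    D_i = e_i^2 / (p * s^2) * h_ii / (1 - h_ii)^2, where h_ii is the
    leverage of observation i and s^2 is the residual variance.
    (Illustrative helper, not the article's dynamic statistics.)
    """
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)      # hat matrix X (X'X)^{-1} X'
    h = np.diag(H)                             # leverages h_ii
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                           # OLS residuals
    s2 = e @ e / (n - p)                       # residual variance estimate
    return e**2 / (p * s2) * h / (1 - h)**2

# Toy regression with one additive outlier injected in the response.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.normal(size=30)
y[5] += 5.0                                    # additive outlier at observation 5
D = cooks_distance(X, y)
worst = int(np.argmax(D))                      # most influential observation
```

The injected outlier attains the largest D, which is the behavior the article's time-series statistics generalize to autocorrelated data.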
Outliers in dynamic factor models
Dynamic factor models have a wide range of applications in econometrics and
applied economics. The basic motivation resides in their capability of reducing
a large set of time series to only a few indicators (factors). If the number of
time series is large compared to the available number of observations, then most
of the information may be conveyed to the factors. In this way, low-dimensional
models may be estimated for explaining and forecasting one or more time series of
interest. It is desirable that outlier-free time series be available for
estimation. In practice, outlying observations are likely to arise at unknown
dates due, for instance, to external unusual events or gross data entry errors.
Several methods for outlier detection in time series are available. Most
methods, however, apply to univariate time series while even methods designed
for handling the multivariate framework do not include dynamic factor models
explicitly. A method for discovering outlier occurrences in a dynamic factor
model is introduced that is based on linear transforms of the observed data.
Some strategies to separate outliers that add to the model and outliers within
the common component are discussed. Applications to simulated and real data
sets are presented to check the effectiveness of the proposed method. Comment: Published at http://dx.doi.org/10.1214/07-EJS082 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org).
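The general idea of extracting factors and then screening the idiosyncratic component for anomalies can be sketched as follows. This is a simplified principal-components stand-in, not the paper's linear-transform method; the function `flag_outliers`, the single-factor setup, and the cut-off `c` are assumptions for the example:

```python
import numpy as np

def flag_outliers(Y, n_factors=1, c=4.0):
    """Fit principal-component factors to a T x N panel Y and flag entries
    whose idiosyncratic residual exceeds c robust standard deviations.
    Simplified stand-in for the paper's linear-transform approach."""
    Yc = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    L = Vt[:n_factors].T                 # N x r loadings (leading eigenvectors)
    F = Yc @ L                           # T x r estimated common factors
    E = Yc - F @ L.T                     # idiosyncratic residuals
    scale = np.median(np.abs(E), axis=0) / 0.6745 + 1e-12  # per-series robust scale
    return np.abs(E) > c * scale         # boolean T x N matrix of outlier flags

# One-factor panel of 8 series; inject an additive outlier in series 2 at t = 50.
rng = np.random.default_rng(1)
f = rng.normal(size=(200, 1))                        # common factor
lam = rng.normal(size=(8, 1))                        # factor loadings
Y = f @ lam.T + 0.5 * rng.normal(size=(200, 8))      # observed panel
Y[50, 2] += 10.0                                     # additive outlier
flags = flag_outliers(Y)
hit = bool(flags[50, 2])
rate = float(flags.mean())
```

A robust (median/MAD) residual scale is used so that the outlier itself does not inflate the detection threshold; separating outliers that add to the model from outliers within the common component requires the additional strategies the paper discusses.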
SubCMap: subject and condition specific effect maps
Current methods for statistical analysis of neuroimaging data identify condition-related structural alterations in the human brain by detecting group differences. They construct detailed maps showing population-wide changes due to a condition of interest. Although extremely useful, these methods do not provide information on subject-specific structural alterations, and they have limited diagnostic value because group assignments for each subject are required for the analysis. In this article, we propose SubCMap, a novel method to detect subject- and condition-specific structural alterations. SubCMap is designed to work without the group assignment information in order to provide diagnostic value. Unlike outlier detection methods, SubCMap detections are condition-specific and can be used to study the effects of various conditions or for diagnosing diseases. The method combines techniques from classification, generalization error estimation and image restoration to identify the condition-related alterations. Experimental evaluation is performed on synthetically generated data as well as data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. Results on synthetic data demonstrate the advantages of SubCMap compared to population-wide techniques and higher detection accuracy compared to outlier detection. Analysis with the ADNI dataset shows that SubCMap detections on cortical thickness data correlate well with non-imaging markers of Alzheimer's Disease (AD), the Mini Mental State Examination score and cerebrospinal fluid amyloid-β levels, suggesting the proposed method captures the inter-subject variation of AD effects well.
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
application domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on the data set type. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework.
A robust partial least squares method with applications
Partial least squares regression (PLS) is a linear regression technique developed to relate many
regressors to one or several response variables. Robust methods are introduced to reduce or
remove the effect of outlying data points. In this paper we show that if the sample covariance
matrix is properly robustified, further robustification of the linear regression steps of the PLS
algorithm becomes unnecessary. The robust estimate of the covariance matrix is computed by
searching for outliers in univariate projections of the data on a combination of random directions
(Stahel-Donoho) and specific directions obtained by maximizing and minimizing the kurtosis
coefficient of the projected data, as proposed by Peña and Prieto (2006). It is shown that this
procedure is fast to apply and provides better results than other procedures proposed in the
literature. Its performance is illustrated by Monte Carlo and by an example, where the algorithm is
able to show features of the data which were undetected by previous methods
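The projection-based outlier search behind the robustified covariance can be sketched with random directions alone. This is a Stahel-Donoho-style sketch only: the kurtosis-extremizing directions of Peña and Prieto are omitted, and the function name and the toy contamination are assumptions for the example:

```python
import numpy as np

def projection_outlyingness(X, n_dirs=500, seed=0):
    """Stahel-Donoho-style outlyingness: the robustly standardized projection
    |x'd - median| / MAD, maximized over random unit directions d.
    (The specific kurtosis-based directions of the actual algorithm
    are omitted in this sketch.)"""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    D = rng.normal(size=(n_dirs, p))
    D /= np.linalg.norm(D, axis=1, keepdims=True)    # random unit directions
    P = X @ D.T                                      # n x n_dirs projections
    med = np.median(P, axis=0)
    mad = np.median(np.abs(P - med), axis=0) / 0.6745 + 1e-12
    return np.max(np.abs(P - med) / mad, axis=1)     # outlyingness per point

# Gaussian cloud with three shifted contaminants placed in the first rows.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 5))
X[:3] += 6.0                       # shift every coordinate of three points
out = projection_outlyingness(X)
top3 = sorted(int(i) for i in np.argsort(out)[-3:])
```

Using the median and MAD of each projection keeps the standardization itself unaffected by the outliers, which is why a covariance matrix reweighted by such scores can replace per-step robustification of PLS.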
Identification of unusual events in multi-channel bridge monitoring data
Peer reviewed. Postprint.