
    Robust principal component analysis for functional data.

    dispersion matrices;

    Sparse canonical correlation analysis from a predictive point of view

    Canonical correlation analysis (CCA) describes the associations between two sets of variables by maximizing the correlation between linear combinations of the variables in each data set. However, in high-dimensional settings where the number of variables exceeds the sample size, or when the variables are highly correlated, traditional CCA is no longer appropriate. This paper proposes a method for sparse CCA. Sparse estimation produces linear combinations of only a subset of variables from each data set, thereby increasing the interpretability of the canonical variates. We consider the CCA problem from a predictive point of view and recast it into a regression framework. By combining an alternating regression approach with a lasso penalty, we induce sparsity in the canonical vectors. We compare the performance with other sparse CCA techniques in different simulation settings and illustrate its usefulness on a genomic data set.
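
The abstract above gives no algorithmic detail, so the following is only a minimal sketch of the general idea of alternating lasso regressions, not the authors' implementation. It assumes centered data columns, estimates a single pair of canonical vectors, and uses a plain coordinate-descent lasso. The function names and the penalty values `lam_u` and `lam_v` are hypothetical.

```python
import math

def soft_threshold(z, lam):
    """Soft-thresholding operator used by the lasso update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def matvec(X, w):
    """Product of a row-major matrix (list of rows) with a vector."""
    return [sum(x * wi for x, wi in zip(row, w)) for row in X]

def lasso_cd(X, t, lam, n_sweeps=100):
    """Coordinate descent for (1/2n)||t - X b||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(t)
    for _ in range(n_sweeps):
        for j in range(p):
            col = [row[j] for row in X]
            norm_sq = sum(c * c for c in col) / n
            if norm_sq == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = sum(c * (r + c * beta[j]) for c, r in zip(col, resid)) / n
            new = soft_threshold(rho, lam) / norm_sq
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for c, r in zip(col, resid)]
                beta[j] = new
    return beta

def scale_to_unit_variance(X, w):
    """Rescale w so that X w has unit mean square (columns assumed centered)."""
    z = matvec(X, w)
    s = math.sqrt(sum(v * v for v in z) / len(z))
    return [wi / s for wi in w] if s > 0 else list(w)

def sparse_cca_first_pair(X, Y, lam_u, lam_v, n_iter=20):
    """Alternate lasso regressions of Y v on X and of X u on Y."""
    v = scale_to_unit_variance(Y, [1.0] * len(Y[0]))
    u = [0.0] * len(X[0])
    for _ in range(n_iter):
        u = scale_to_unit_variance(X, lasso_cd(X, matvec(Y, v), lam_u))
        v = scale_to_unit_variance(Y, lasso_cd(Y, matvec(X, u), lam_v))
    return u, v
```

In this sketch each half-step treats the current canonical variate of one data set as the response and regresses it on the other data set, so the lasso penalty sets the weights of uninformative variables exactly to zero. If the penalty is too large, all weights become zero and the procedure returns zero vectors.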

    Maximum deviation curves for location estimators.

    The maximum deviation curve of an estimator describes how much an estimate can change, in the worst case, when m out of n "good" observations are moved to arbitrary positions. This function is computed for some robust univariate location estimators. A lower bound for this curve is derived, and it is shown that this bound can be attained. Trimmed means will always be close to this lower bound. When more than one third of the observations are contaminated, the median also attains the lower bound. Finally, it is shown that a high breakdown point leads to a relatively large maximum deviation in the presence of small amounts of contaminants. The maximum deviation curve approach is based on the finite sample behavior of the estimators and makes no distributional assumptions.
    bias curve; breakdown point; location estimator; maximum deviation; robustness; sensitivity; regression;
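
As a rough illustration of the definition in the abstract above, the sketch below evaluates the maximum deviation for one sample. It assumes the estimator is nondecreasing in every observation (true for the median and for trimmed means), so the worst case is reached by sending the m smallest observations to plus infinity or the m largest to minus infinity. The helper names `max_deviation` and `trimmed_mean` are hypothetical, and this is not the paper's derivation.

```python
import math
from statistics import median

def trimmed_mean(xs, k=1):
    """Mean after discarding the k smallest and k largest values."""
    core = sorted(xs)[k:len(xs) - k]
    return sum(core) / len(core)

def max_deviation(estimator, sample, m):
    """Worst-case change of a monotone location estimator when m of the
    observations are replaced by arbitrary values."""
    xs = sorted(sample)
    n = len(xs)
    base = estimator(xs)
    # Push the estimate up: replace the m smallest values by +infinity.
    up = estimator(xs[m:] + [math.inf] * m)
    # Push the estimate down: replace the m largest values by -infinity.
    down = estimator([-math.inf] * m + xs[:n - m])
    return max(up - base, base - down)

if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    for m in range(4):
        print(m, max_deviation(median, data, m),
              max_deviation(trimmed_mean, data, m))
```

An infinite value in the output means the estimator has broken down for that number of replaced observations.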

    Robust Sparse Canonical Correlation Analysis

    Canonical correlation analysis (CCA) is a multivariate statistical method which describes the associations between two sets of variables. The objective is to find linear combinations of the variables in each data set having maximal correlation. This paper discusses a method for Robust Sparse CCA. Sparse estimation produces canonical vectors with some of their elements estimated as exactly zero, which improves their interpretability. We also robustify the method so that it can cope with outliers in the data. To estimate the canonical vectors, we convert the CCA problem into an alternating regression framework and use the sparse Least Trimmed Squares estimator. We illustrate the good performance of the Robust Sparse CCA method in several simulation studies and two real data examples.

    Sparse cointegration

    Cointegration analysis is used to estimate the long-run equilibrium relations between several time series. The coefficients of these long-run equilibrium relations are the cointegrating vectors. In this paper, we provide a sparse estimator of the cointegrating vectors. The estimation technique is sparse in the sense that some elements of the cointegrating vectors will be estimated as zero. For this purpose, we combine a penalized estimation procedure for vector autoregressive models with sparse reduced rank regression. The sparse cointegration procedure achieves a higher estimation accuracy than the traditional Johansen cointegration approach in settings where the true cointegrating vectors have a sparse structure, and/or when the sample size is low compared to the number of time series. We also discuss a criterion to determine the cointegration rank and we illustrate its good performance in several simulation settings. In a first empirical application, we investigate whether the expectations hypothesis of the term structure of interest rates, implying sparse cointegrating vectors, holds in practice. In a second empirical application, we show that forecast performance in high-dimensional systems can be improved by sparsely estimating the cointegration relations.

    Robust regression in Stata.

    In regression analysis, the presence of outliers in the data set can strongly distort the classical least squares estimator and lead to unreliable results. To deal with this, several robust-to-outliers methods have been proposed in the statistical literature. In Stata, some of these methods are available through the commands rreg and qreg. Unfortunately, these methods resist only some specific types of outliers and turn out to be ineffective under alternative scenarios. In this paper we present more effective robust estimators that we implemented in Stata. We also present a graphical tool that helps identify the type of outliers present.
    S-estimators; MM-estimators; Outliers; Robustness;

    Estimators of the multiple correlation coefficient: local robustness and confidence intervals.

    Many robust regression estimators are defined by minimizing a measure of spread of the residuals. An accompanying R² measure, or multiple correlation coefficient, is then easily obtained. In this paper, local robustness properties of these robust R² coefficients are investigated. It is also shown how confidence intervals for the population multiple correlation coefficient can be constructed in the case of multivariate normality.
    Cautionary note; High breakdown-point; Influence function; Intervals; Model; Multiple correlation coefficient; R² measure; Regression analysis; Residuals; Robustness; Squares regression;

    Do stock prices contain predictive power for the future economic activity? A Granger causality analysis in the frequency domain.

    This paper investigates the predictive power for future domestic economic activity contained in domestic stock prices, using a Granger causality analysis in the frequency domain. We are able to evaluate whether the predictive power is concentrated at the slowly fluctuating components or at the quickly fluctuating components. Using quarterly data for the G-7 countries, we find that the slowly fluctuating components of stock prices have large predictive power for future GDP, while this is not the case for the quickly fluctuating components. This finding holds both in a single-country setting and in a multi-country setting. Therefore, macroeconomic policy makers could use the slowly fluctuating components of stock prices to improve their predictions of future GDP.
    Frequency domain; Granger causality; Gross domestic product; Predictive power; Stock prices;

    Robust M-estimation of multivariate conditionally heteroscedastic time series models with elliptical innovations.

    This paper proposes new methods for the econometric analysis of outlier-contaminated multivariate conditionally heteroscedastic time series. Robust alternatives to the Gaussian quasi-maximum likelihood estimator are presented. Under elliptical symmetry of the innovation vector, consistency results for M-estimation of the general conditional heteroscedasticity model are obtained. We also propose a robust estimator for the cross-correlation matrix and a diagnostic check for correct specification of the innovation density function. In a Monte Carlo experiment, the effect of outliers on different types of M-estimators is studied. We conclude with a financial application in which these new tools are used to analyse and estimate the symmetric BEKK model for the 1980-2006 series of weekly returns on the Nasdaq and NYSE composite indices. For this dataset, robust estimators are needed to cope with the outlying returns corresponding to the stock market crash in 1987 and the burst of the dot-com bubble in 2000.
    Conditional heteroscedasticity; M-estimators; Multivariate time series; Outliers; Quasi-maximum likelihood; Robust methods;