807 research outputs found

    Fitting multiplicative models by robust alternating regressions.

    In this paper a robust approach for fitting multiplicative models is presented. Focus is on the factor analysis model, where we will estimate factor loadings and scores by a robust alternating regression algorithm. The approach is highly robust, and also works well when there are more variables than observations. The technique yields a robust biplot, depicting the interaction structure between individuals and variables. This biplot is not predetermined by outliers, which can be retrieved from the residual plot. Also provided is an accompanying robust $R^2$-plot to determine the appropriate number of factors. The approach is illustrated by real and artificial examples and compared with factor analysis based on robust covariance matrix estimators. The same estimation technique can fit models with both additive and multiplicative effects (FANOVA models) to two-way tables, thereby extending the median polish technique.
    Keywords: Alternating regression; Approximation; Biplot; Covariance; Dispersion matrices; Effects; Estimator; Exploratory data analysis; Factor analysis; Factors; FANOVA; Least-squares; Matrix; Median polish; Model; Models; Outliers; Principal components; Robustness; Structure; Two-way table; Variables; Yield
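
The alternating regression idea in the abstract above can be illustrated with a minimal sketch. It assumes a one-factor model `X[i, j] ≈ scores[i] * loadings[j]` and uses plain L1 regression through the origin (a weighted median) as the robust regression step. The abstract does not specify the exact estimator, so the function names, the L1 choice, and the starting values are all illustrative assumptions rather than the authors' algorithm.

```python
import numpy as np

def weighted_median(values, weights):
    """Return a value m minimising sum_i weights[i] * |values[i] - m|."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum = np.cumsum(w)
    return v[np.searchsorted(cum, 0.5 * cum[-1])]

def l1_slope(x, z):
    """L1 regression of x on z through the origin: argmin_b sum |x - b * z|."""
    keep = np.abs(z) > 1e-12          # cells with z == 0 carry no information
    return weighted_median(x[keep] / z[keep], np.abs(z[keep]))

def alternating_l1_rank1(X, n_iter=50):
    """One-factor fit X ~ outer(scores, loadings) by alternating L1 regressions.

    Hypothetical helper for illustration only; starts from equal loadings.
    """
    n, p = X.shape
    loadings = np.ones(p)
    scores = np.zeros(n)
    for _ in range(n_iter):
        # Regress each row on the current loadings to update the scores.
        scores = np.array([l1_slope(X[i, :], loadings) for i in range(n)])
        # Regress each column on the current scores to update the loadings.
        loadings = np.array([l1_slope(X[:, j], scores) for j in range(p)])
        # Fix the scale: unit-norm loadings, scores absorb the factor.
        norm = np.linalg.norm(loadings)
        loadings = loadings / norm
        scores = scores * norm
    return scores, loadings
```

Large entries of `X - np.outer(scores, loadings)` then point to outlying cells, which is the role the abstract assigns to the residual plot.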

    Handling Outlier in Two-Ways Table by Robust Alternating Regression of FANOVA Models: Towards Robust AMMI Models

    The AMMI (Additive Main Effect Multiplicative Interaction) model for interactions in two-way tables provides the major means for studying stability and adaptability through genotype × environment interaction (GEI), which is modeled by a full interaction model. The eligibility of the AMMI model depends on the assumption of normally and independently distributed errors with constant variance. Nowadays, AMMI models have been developed for MET data that violate the normality and homogeneity assumptions. In this class of models we can mention M-AMMI for mixed AMMI models and G-AMMI for generalized AMMI models. G-AMMI handles non-normality, i.e. categorical response variables, using an alternating regression algorithm. To handle non-homogeneity in the mixed-model sense, one may use a model called factor analytic multiplicative. AMMI models have also been developed to handle outliers that may coincide with non-homogeneity in the data. In this paper, we present the handling of outliers in the multiplicative model by a robust alternating regression algorithm.

    Robust canonical correlations: a comparative study.

    Several approaches for robust canonical correlation analysis will be presented and discussed. A first method is based on the definition of canonical correlation analysis as looking for linear combinations of two sets of variables having maximal (robust) correlation. A second method is based on alternating robust regressions. These methods are discussed in detail and compared with the more traditional approach to robust canonical correlation via covariance matrix estimates. A simulation study compares the performance of the different estimators under several kinds of sampling schemes. Robustness is studied as well by breakdown plots.
    Keywords: Alternating regression; Canonical correlations; Correlation measures; Projection-pursuit; Robust covariance estimation; Robust regression; Robustness

    Robust and sparse factor modelling.

    Factor construction methods are widely used to summarize a large panel of variables by means of a relatively small number of representative factors. We propose a novel factor construction procedure that enjoys the properties of robustness to outliers and of sparsity; that is, having relatively few nonzero factor loadings. Compared to the traditional factor construction method, we find that this procedure leads to a favorable forecasting performance in the presence of outliers and to better interpretable factors. We investigate the performance of the method in a Monte Carlo experiment and in an empirical application to a large data set from macroeconomics.
    Keywords: Dimension reduction; Forecasting; Outliers; Regularization; Sparsity

    A case study of speculative financial bubbles in the South African stock market 2003-2006

    We tested 45 indices and common stocks traded in the South African stock market for the possible existence of a bubble over the period from Jan. 2003 to May 2006. A bubble is defined by a faster-than-exponential acceleration with significant log-periodic oscillations. The faster-than-exponential acceleration characteristics are tested with several different metrics, including nonlinearity on the logarithm of the price and power law fits. The log-periodic properties are investigated in detail using the first-order log-periodic power-law (LPPL) formula, the parametric detrending method, the $(H,q)$-analysis, and the second-order Weierstrass-type model, resulting in a consistent and robust estimation of the fundamental angular log-frequency $\omega_1 = 7 \pm 2$, in reasonable agreement with previous estimations on many other bubbles in developed and developing markets. Sensitivity tests of the estimated critical times and of the angular log-frequency are performed by varying the first date and the last date of the stock price time series. These tests show that the estimated parameters are robust. With the insight of 6 additional months of data since the analysis was performed, we observe that many of the stocks on the South African market experienced an abrupt drop in mid-June 2006, which is compatible with the predicted $t_c$ for several of the stocks, but not all. This suggests that the mini-crash that occurred around mid-June of 2006 was only a partial correction, which has resumed into a renewed bubbly acceleration bound to end sometime in 2007, similarly to what happened on the S&P500 US market from Oct. 1997 to Aug. 1998.
    Comment: 20 LaTeX pages including 10 figures + an appendix (1 table, 10 figures)
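
The abstract above names the first-order LPPL formula without writing it out. As a hedged illustration, the sketch below evaluates the form commonly used in the log-periodic bubble literature, ln p(t) = A + B (t_c - t)^m + C (t_c - t)^m cos(ω ln(t_c - t) - φ). The function name and the parameter names `A`, `B`, `C`, `m`, and `phi` are assumptions for illustration; only `t_c` and the angular log-frequency appear in the abstract.

```python
import math

def lppl_log_price(t, tc, m, omega, A, B, C, phi):
    """Evaluate the commonly cited first-order LPPL form of ln p(t) for t < tc.

    tc is the critical time and omega the angular log-frequency; the other
    parameter names are illustrative assumptions, not taken from the abstract.
    """
    dt = tc - t
    if dt <= 0:
        raise ValueError("t must be earlier than the critical time tc")
    power = dt ** m                      # power-law term (tc - t)^m
    return A + B * power + C * power * math.cos(omega * math.log(dt) - phi)
```

Fitting such a curve to a price series, as the study does, would additionally require a nonlinear optimiser over `tc`, `m`, and `omega`, which this sketch does not attempt.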

    Robust Orthogonal Complement Principal Component Analysis

    Recently, the robustification of principal component analysis has attracted lots of attention from statisticians, engineers and computer scientists. In this work we study the type of outliers that are not necessarily apparent in the original observation space but can seriously affect the principal subspace estimation. Based on a mathematical formulation of such transformed outliers, a novel robust orthogonal complement principal component analysis (ROC-PCA) is proposed. The framework combines the popular sparsity-enforcing and low rank regularization techniques to deal with row-wise outliers as well as element-wise outliers. A non-asymptotic oracle inequality guarantees the accuracy and high breakdown performance of ROC-PCA in finite samples. To tackle the computational challenges, an efficient algorithm is developed on the basis of Stiefel manifold optimization and iterative thresholding. Furthermore, a batch variant is proposed to significantly reduce the cost in ultra high dimensions. The paper also points out a pitfall of a common practice of SVD reduction in robust PCA. Experiments show the effectiveness and efficiency of ROC-PCA in both synthetic and real data.

    Sparse and Robust Factor Modelling

    Factor construction methods are widely used to summarize a large panel of variables by means of a relatively small number of representative factors. We propose a novel factor construction procedure that enjoys the properties of robustness to outliers and of sparsity; that is, having relatively few nonzero factor loadings. Compared to more traditional factor construction methods, we find that this procedure leads to better interpretable factors and to a favorable forecasting performance, both in a Monte Carlo experiment and in two empirical applications to large data sets, one from macroeconomics and one from microeconomics.

    Outlier Detection Using Nonconvex Penalized Regression

    This paper studies the outlier detection problem from the point of view of penalized regressions. Our regression model adds one mean shift parameter for each of the $n$ data points. We then apply a regularization favoring a sparse vector of mean shift parameters. The usual $L_1$ penalty yields a convex criterion, but we find that it fails to deliver a robust estimator. The $L_1$ penalty corresponds to soft thresholding. We introduce a thresholding (denoted by $\Theta$) based iterative procedure for outlier detection ($\Theta$-IPOD). A version based on hard thresholding correctly identifies outliers on some hard test problems. We find that $\Theta$-IPOD is much faster than iteratively reweighted least squares for large data because each iteration costs at most $O(np)$ (and sometimes much less) avoiding an $O(np^2)$ least squares estimate. We describe the connection between $\Theta$-IPOD and M-estimators. Our proposed method has one tuning parameter with which to both identify outliers and estimate regression coefficients. A data-dependent choice can be made based on BIC. The tuned $\Theta$-IPOD shows outstanding performance in identifying outliers in various situations in comparison to other existing approaches. This methodology extends to high-dimensional modeling with $p \gg n$, if both the coefficient vector and the outlier pattern are sparse.
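
The mean-shift formulation in the abstract above can be illustrated with a short sketch. It assumes the model `y = X @ beta + gamma + noise`, with one shift `gamma[i]` per observation, and alternates a least-squares fit on `y - gamma` with hard thresholding of `y - X @ beta`. The function names, the stopping rule, and the fixed threshold `lam` are illustrative assumptions; the abstract does not give the update rule, and it selects the tuning parameter by BIC, which is not shown here.

```python
import numpy as np

def hard_threshold(r, lam):
    """Hard thresholding: keep entries with |r| > lam, set the rest to zero."""
    return np.where(np.abs(r) > lam, r, 0.0)

def ipod(X, y, lam, n_iter=200, tol=1e-8):
    """Sketch of a hard-thresholding IPOD iteration (hypothetical helper).

    Returns (beta, gamma); nonzero entries of gamma flag suspected outliers.
    """
    # One-off QR factorisation; afterwards each iteration only needs
    # matrix-vector products with Q, which cost O(np).
    Q, R = np.linalg.qr(X)
    gamma = np.zeros_like(y, dtype=float)
    for _ in range(n_iter):
        fitted = Q @ (Q.T @ (y - gamma))           # projection of y - gamma
        gamma_new = hard_threshold(y - fitted, lam)
        converged = np.max(np.abs(gamma_new - gamma)) < tol
        gamma = gamma_new
        if converged:
            break
    # Least-squares coefficients for the shift-corrected response.
    beta = np.linalg.solve(R, Q.T @ (y - gamma))
    return beta, gamma
```

Replacing `hard_threshold` with soft thresholding would give the $L_1$ variant that the abstract reports as not robust.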