
    Distortion in Correspondence Analysis and in Taxicab Correspondence Analysis: A Comparison

    Distortion is a fundamental, well-studied topic in dimension reduction, and it is closely related to the intrinsic dimension underlying a mapping of a high-dimensional data set onto a lower-dimensional space. In this paper, we study the embedding distortions produced by Correspondence Analysis and its robust l1 variant, Taxicab Correspondence Analysis, which are visualization methods for contingency tables. For high-dimensional data, distortions in Correspondence Analysis are contractions, while distortions in Taxicab Correspondence Analysis can be contractions or stretchings. This shows that Euclidean geometry is quite rigid because of the orthogonality property, while Taxicab geometry is quite flexible because the orthogonality property is replaced by the conjugacy property.
    Comment: 18 pages, 4 figures, 4 tables

    Scale Invariant Correspondence Analysis

    Correspondence analysis is a dimension reduction method for visualizing nonnegative data sets, in particular contingency tables, but it depends on the marginals of the data set. Two transformations of the data have been proposed to render correspondence analysis row- and column-scale invariant; both change the initial form of the data set into a bistochastic form. The power transformation applied by Greenacre (2010) has one positive parameter, while the transformation applied by Mosteller (1968) and Goodman (1996) has (I+J) positive parameters: the raw data is row and column scaled by the Sinkhorn (RAS or IPF) algorithm to render it bistochastic. Goodman (1996) named the correspondence analysis of a bistochastic matrix marginal-free correspondence analysis. We discuss these two transformations and further generalize the Mosteller-Goodman approach.
    Comment: 22 pages, 3 figures, 3 tables
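The row and column scaling mentioned in this abstract can be illustrated with a minimal sketch of Sinkhorn (RAS, IPF) iteration. This is not the authors' code: the function name `sinkhorn_scale`, the tolerance, and the target marginals (every row summing to 1/I and every column to 1/J, so the table sums to 1) are assumptions chosen for illustration, and the sketch assumes strictly positive entries.

```python
import numpy as np

def sinkhorn_scale(table, tol=1e-10, max_iter=1000):
    """Alternately rescale rows and columns until the marginals are uniform.

    Hypothetical helper: assumes a 2-D table with strictly positive entries
    and targets row sums of 1/I and column sums of 1/J.
    """
    P = np.asarray(table, dtype=float)
    if P.ndim != 2 or np.any(P <= 0):
        raise ValueError("expects a 2-D table with strictly positive entries")
    n_rows, n_cols = P.shape
    P = P / P.sum()
    for _ in range(max_iter):
        # Row step: make every row sum equal to 1/I.
        P = P * ((1.0 / n_rows) / P.sum(axis=1, keepdims=True))
        # Column step: make every column sum equal to 1/J.
        P = P * ((1.0 / n_cols) / P.sum(axis=0, keepdims=True))
        # Column sums are exact after the column step, so only rows are checked.
        if np.max(np.abs(P.sum(axis=1) - 1.0 / n_rows)) < tol:
            break
    return P

counts = np.array([[10.0, 20.0, 30.0],
                   [5.0, 40.0, 15.0]])
scaled = sinkhorn_scale(counts)
print(scaled.sum(axis=1), scaled.sum(axis=0))
```

Multiplying any row or column of `counts` by a positive constant before calling the function should leave the scaled table unchanged up to the tolerance, which is the scale invariance the abstract refers to.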

    On the choice of weights in aggregate compositional data analysis

    In this paper, we distinguish between two kinds of compositional data sets: elementary and aggregate. This distinction helps us choose the weights to use in log interaction analysis of aggregate compositional vectors. We show that in the aggregate case, the underlying data form a paired data set composed of responses and qualitative covariates; this fact helps us to propose two approaches for the analysis and visualization of the data, named log interaction of aggregates and aggregate of log interactions. Furthermore, we show the first-order approximation of the log interaction of a cell for different choices of the row and column weights.
    Comment: 3 figures, 1 table, 17 pages
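As a rough illustration of how row and column weights enter a log interaction, the sketch below uses one common definition: the weighted double centering of the logged table. Whether this matches the paper's exact formulation is an assumption, and the function name `log_interactions` and the uniform default weights are hypothetical.

```python
import numpy as np

def log_interactions(table, row_weights=None, col_weights=None):
    """Weighted double centering of log(table).

    Hypothetical helper: assumes strictly positive entries. Weights default
    to uniform and are normalized to sum to 1.
    """
    P = np.asarray(table, dtype=float)
    if P.ndim != 2 or np.any(P <= 0):
        raise ValueError("expects a 2-D table with strictly positive entries")
    n_rows, n_cols = P.shape
    r = np.full(n_rows, 1.0 / n_rows) if row_weights is None else np.asarray(row_weights, dtype=float)
    c = np.full(n_cols, 1.0 / n_cols) if col_weights is None else np.asarray(col_weights, dtype=float)
    r = r / r.sum()
    c = c / c.sum()
    G = np.log(P)
    row_means = G @ c      # weighted mean of each row over the columns
    col_means = r @ G      # weighted mean of each column over the rows
    grand_mean = r @ G @ c
    return G - row_means[:, None] - col_means[None, :] + grand_mean

counts = np.array([[10.0, 20.0, 30.0],
                   [5.0, 40.0, 15.0]])
# Marginal weights are one possible choice; uniform weights are another.
lam = log_interactions(counts,
                       row_weights=counts.sum(axis=1),
                       col_weights=counts.sum(axis=0))
print(lam)
```

Under this definition the weighted row and column means of the result are zero for any choice of weights, so changing the weights changes which centering is removed from each cell.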

    Direct transformations yielding the knight's move pattern in 3x3x3 arrays

    Three-way arrays (or tensors) can be regarded as extensions of the traditional two-way data matrices that have a third dimension. Studying algebraic properties of arrays is relevant, for example, for the Tucker three-way PCA method, which generalizes principal component analysis to three-way data. One important algebraic property of arrays is concerned with the possibility of transformations to simplicity. An array is said to be transformed to a simple form when it can be manipulated by a sequence of invertible operations such that a vast majority of its entries become zero. This paper shows how 3 × 3 × 3 arrays, whether symmetric or nonsymmetric, can be transformed to a simple form with 18 out of its 27 entries equal to zero. We call this simple form the “knight's move pattern” due to a loose resemblance to the moves of a knight in a game of chess. The pattern was examined by Kiers, Ten Berge, and Rocci. It will be shown how the knight's move pattern can be found by means of a numeric–algebraic procedure based on the Gröbner basis. This approach seems to work almost surely for randomly generated arrays, whether symmetric or nonsymmetric.