    The contributions of rare objects in correspondence analysis

    Correspondence analysis, when used to visualize relationships in a table of counts (for example, abundance data in ecology), has been frequently criticized as being too sensitive to objects (for example, species) that occur with very low frequency or in very few samples. In this statistical report we show that this criticism is generally unfounded. We demonstrate this in several data sets by calculating the actual contributions of rare objects to the results of correspondence analysis and canonical correspondence analysis, both to the determination of the principal axes and to the chi-square distance. It is a fact that rare objects are often positioned as outliers in correspondence analysis maps, which gives the impression that they are highly influential, but their low weight offsets their distant positions and reduces their effect on the results. An alternative scaling of the correspondence analysis solution, the contribution biplot, is proposed as a way of mapping the results in order to avoid the problem of outlying and low-contributing rare objects.
    Keywords: Biplot, canonical correspondence analysis, contribution, correspondence analysis, influence, outlier, scaling
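
The weighting argument in this abstract can be illustrated with a minimal sketch. The function name `row_inertia_contributions` and the species-by-sites counts are invented for illustration and are not taken from the paper; the code assumes only the standard definition of a row's inertia as its mass times its squared chi-square distance to the average profile.

```python
def row_inertia_contributions(table):
    """Return mass, squared chi-square distance and inertia share for each row.

    `table` is a list of rows of non-negative counts; every row and column
    is assumed to have a positive total.
    """
    total = float(sum(sum(row) for row in table))
    row_mass = [sum(row) / total for row in table]
    col_mass = [sum(row[j] for row in table) / total for j in range(len(table[0]))]

    results = []
    for row, mass in zip(table, row_mass):
        # Squared chi-square distance between the row profile and the
        # average (centroid) profile given by the column masses.
        dist2 = sum((x / total / mass - c) ** 2 / c for x, c in zip(row, col_mass))
        results.append({"mass": mass, "dist2": dist2, "inertia": mass * dist2})

    total_inertia = sum(res["inertia"] for res in results)
    for res in results:
        res["share"] = res["inertia"] / total_inertia
    return results


# Hypothetical counts: four species (rows) at three sites (columns).
# The last species is rare: two individuals at a single site.
counts = [
    [40, 35, 30],
    [25, 30, 45],
    [50, 20, 10],
    [0, 0, 2],
]
for species, res in enumerate(row_inertia_contributions(counts), start=1):
    print(species, round(res["mass"], 4), round(res["dist2"], 4), round(res["share"], 4))
```

In this invented table the rare fourth species lies farthest from the centroid, yet its small mass keeps its share of the total inertia below that of the more abundant third species.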

    Canonical correspondence analysis in social science research

    The use of simple and multiple correspondence analysis is well-established in social science research for understanding relationships between two or more categorical variables. By contrast, canonical correspondence analysis, which is a correspondence analysis with linear restrictions on the solution, has become one of the most popular multivariate techniques in ecological research. Multivariate ecological data typically consist of frequencies of observed species across a set of sampling locations, as well as a set of observed environmental variables at the same locations. In this context the principal dimensions of the biological variables are sought in a space that is constrained to be related to the environmental variables. This restricted form of correspondence analysis has many uses in social science research as well, as is demonstrated in this paper. We first illustrate the result that canonical correspondence analysis of an indicator matrix, restricted to be related to an external categorical variable, reduces to a simple correspondence analysis of a set of concatenated (or “stacked”) tables. Then we show how canonical correspondence analysis can be used to focus on, or partial out, a particular set of response categories in sample survey data. For example, the method can be used to partial out the influence of missing responses, which usually dominate the results of a multiple correspondence analysis.
    Keywords: Constraints, correspondence analysis, missing data, multiple correspondence

    Analysis of matched matrices

    We consider the joint visualization of two matrices which have common rows and columns, for example multivariate data observed at two time points or split according to a dichotomous variable. Methods of interest include principal components analysis for interval-scaled data, or correspondence analysis for frequency data or ratio-scaled variables on commensurate scales. A simple result in matrix algebra shows that by setting up the matrices in a particular block format, matrix sum and difference components can be visualized. The case when we have more than two matrices is also discussed and the methodology is applied to data from the International Social Survey Program.
    Keywords: Correspondence analysis, International Social Survey Program (ISSP), matched matrices, principal component analysis, singular-value decomposition
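
The abstract does not spell out the block format, so the following is only a sketch of one arrangement with the stated property, which may or may not be the one used in the paper. It assumes two same-shaped matrices A and B stacked as [[A, B], [B, A]]; an orthogonal rotation then separates the stacked matrix into A + B and A - B. All function names and the small matrices are hypothetical.

```python
from math import sqrt


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def block(A, B):
    """Stack two same-shaped matrices as [[A, B], [B, A]]."""
    top = [ra + rb for ra, rb in zip(A, B)]
    bottom = [rb + ra for ra, rb in zip(A, B)]
    return top + bottom


def rotation(n):
    """Orthogonal matrix (1/sqrt(2)) * [[I, I], [I, -I]] with n-by-n identity blocks."""
    s = 1 / sqrt(2)
    H = [[0.0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        H[i][i] = s
        H[i][n + i] = s
        H[n + i][i] = s
        H[n + i][n + i] = -s
    return H


def sum_difference_blocks(A, B):
    """Rotate [[A, B], [B, A]] into the block-diagonal form diag(A + B, A - B)."""
    n_rows, n_cols = len(A), len(A[0])
    return matmul(matmul(rotation(n_rows), block(A, B)), rotation(n_cols))


# Hypothetical matched matrices with common rows and columns.
A = [[1, 2, 3], [4, 5, 6]]
B = [[6, 5, 4], [3, 2, 1]]
for row in sum_difference_blocks(A, B):
    print([round(value, 6) for value in row])
```

Because the rotation is orthogonal, the singular values of the stacked matrix are those of A + B together with those of A - B, which is why a single decomposition can display both the sum and the difference structure.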

    Dynamic perceptual mapping

    Perceptual maps have been used for decades by market researchers to illuminate them about the similarity between brands in terms of a set of attributes, to position consumers relative to brands in terms of their preferences, or to study how demographic and psychometric variables relate to consumer choice. Invariably these maps are two-dimensional and static. As we enter the era of electronic publishing, the possibilities for dynamic graphics are opening up. We demonstrate the usefulness of introducing motion into perceptual maps through four examples. The first example shows how a perceptual map can be viewed in three dimensions, and the second one moves between two analyses of the data that were collected according to different protocols. In a third example we move from the best view of the data at the individual level to one which focuses on between-group differences in aggregated data. A final example considers the case when several demographic variables or market segments are available for each respondent, showing an animation with increasingly detailed demographic comparisons. These examples of dynamic maps use several data sets from marketing and social science research.
    Keywords: Animation, brand-attribute maps, correspondence analysis, multidimensional scaling, perceptual map, visualization

    Biplots of fuzzy coded data

    A biplot, which is the multivariate generalization of the two-variable scatterplot, can be used to visualize the results of many multivariate techniques, especially those that are based on the singular value decomposition. We consider data sets consisting of continuous-scale measurements, their fuzzy coding and the biplots that visualize them, using a fuzzy version of multiple correspondence analysis. Of special interest is the way quality of fit of the biplot is measured, since it is well-known that regular (i.e., crisp) multiple correspondence analysis seriously under-estimates this measure. We show how the results of fuzzy multiple correspondence analysis can be defuzzified to obtain estimated values of the original data, and prove that this implies an orthogonal decomposition of variance. This permits a measure of fit to be calculated in the familiar form of a percentage of explained variance, which is directly comparable to the corresponding fit measure used in principal component analysis of the original data. The approach is motivated initially by its application to a simulated data set, showing how the fuzzy approach can lead to diagnosing nonlinear relationships, and finally it is applied to a real set of meteorological data.
    Keywords: Defuzzification, fuzzy coding, indicator matrix, measure of fit, multivariate data, multiple correspondence analysis, principal component analysis
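
To make the coding step concrete, here is a minimal sketch of fuzzy coding with triangular membership functions, one common choice that the abstract itself does not specify. The functions `fuzzy_code` and `defuzzify` and the hinge points are hypothetical. The sketch defuzzifies the coded values directly, whereas the paper defuzzifies the fitted values from the fuzzy multiple correspondence analysis.

```python
def fuzzy_code(x, hinges):
    """Code a continuous value into fuzzy categories with triangular memberships.

    `hinges` is a strictly increasing list of hinge points (for example
    minimum, median, maximum). The memberships are non-negative and sum to 1.
    """
    if not hinges[0] <= x <= hinges[-1]:
        raise ValueError("x must lie within the range of the hinge points")
    memberships = [0.0] * len(hinges)
    for k in range(len(hinges) - 1):
        lo, hi = hinges[k], hinges[k + 1]
        if lo <= x <= hi:
            weight = (x - lo) / (hi - lo)
            memberships[k] = 1.0 - weight
            memberships[k + 1] = weight
            break
    return memberships


def defuzzify(memberships, hinges):
    """Map fuzzy memberships back to the original scale as a weighted average of hinges."""
    return sum(m * h for m, h in zip(memberships, hinges))


hinges = [0.0, 10.0, 20.0]          # hypothetical hinge points
coded = fuzzy_code(12.5, hinges)    # memberships in "low", "medium", "high"
print(coded, defuzzify(coded, hinges))
```

With triangular memberships the coding is lossless: defuzzifying the coded value returns the original measurement, which is what makes a fit measure on the original scale possible.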

    Multiple correspondence analysis of a subset of response categories

    In the analysis of multivariate categorical data, typically the analysis of questionnaire data, it is often advantageous, for substantive and technical reasons, to analyse a subset of response categories. In multiple correspondence analysis, where each category is coded as a column of an indicator matrix or as a row and column of the Burt matrix, it is not correct to simply analyse the corresponding submatrix of data, since the whole geometric structure is different for the submatrix. A simple modification of the correspondence analysis algorithm allows the overall geometric structure of the complete data set to be retained while calculating the solution for the selected subset of points. This strategy is useful for analysing patterns of response amongst any subset of categories and relating these patterns to demographic factors, especially for studying patterns of particular responses such as missing and neutral responses. The methodology is illustrated using data from the International Social Survey Program on Family and Changing Gender Roles in 1994.
    Keywords: Categorical data, correspondence analysis, questionnaire survey

    Computation of multiple correspondence analysis, with code in R

    The generalization of simple correspondence analysis, for two categorical variables, to multiple correspondence analysis, where there may be three or more variables, is not straightforward, both from a mathematical and computational point of view. In this paper we detail the exact computational steps involved in performing a multiple correspondence analysis, including the special aspects of adjusting the principal inertias to correct the percentages of inertia, supplementary points and subset analysis. Furthermore, we give the algorithm for joint correspondence analysis where the cross-tabulations of all unique pairs of variables are analysed jointly. The code in the R language for every step of the computations is given, as well as the results of each computation.
    Keywords: Adjustment of principal inertias, Burt matrix, correspondence analysis, multiple correspondence analysis, R language, singular value decomposition, subset analysis

    A note on the dual scaling of dominance data and its relationship to correspondence analysis

    Dual scaling of a subjects-by-objects table of dominance data (preferences, paired comparisons and successive categories data) has been contrasted with correspondence analysis, as if the two techniques were somehow different. In this note we show that dual scaling of dominance data is equivalent to the correspondence analysis of a table which is doubled with respect to subjects. We also show that the results of both methods can be recovered from a principal components analysis of the undoubled dominance table which is centred with respect to subject means.
    Keywords: Correspondence analysis, dominance data, dual scaling, paired comparisons, preferences, principal component analysis, ratings

    Biplots of compositional data

    The singular value decomposition and its interpretation as a linear biplot has proved to be a powerful tool for analysing many forms of multivariate data. Here we adapt biplot methodology to the specific case of compositional data consisting of positive vectors each of which is constrained to have unit sum. These relative variation biplots have properties relating to special features of compositional data: the study of ratios, subcompositions and models of compositional relationships. The methodology is demonstrated on a data set consisting of six-part colour compositions in 22 abstract paintings, showing how the singular value decomposition can achieve an accurate biplot of the colour ratios and how possible models interrelating the colours can be diagnosed.
    Keywords: Logratio transformation, principal component analysis, relative variation biplot, singular value decomposition, subcomposition
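
The keywords name a logratio transformation without saying which one. As a hedged illustration, the sketch below uses the centred logratio, a widely used variant; the function name `centred_logratio` and the three-part composition are hypothetical and do not come from the paper's data.

```python
from math import log


def centred_logratio(composition):
    """Centred logratio: log of each part minus the mean of the logs.

    All parts are assumed to be strictly positive.
    """
    if any(part <= 0 for part in composition):
        raise ValueError("all parts of a composition must be positive")
    logs = [log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical three-part composition with unit sum.
parts = [0.2, 0.3, 0.5]
print(centred_logratio(parts))
```

Two properties make this transformation suit the study of ratios: the result does not change if the composition is rescaled, and the difference between two transformed parts equals the log of their ratio.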

    From correspondence analysis to multiple and joint correspondence analysis

    The generalization of simple (two-variable) correspondence analysis to more than two categorical variables, commonly referred to as multiple correspondence analysis, is neither obvious nor well-defined. We present two alternative ways of generalizing correspondence analysis, one based on the quantification of the variables and intercorrelation relationships, and the other based on the geometric ideas of simple correspondence analysis. We propose a version of multiple correspondence analysis, with adjusted principal inertias, as the method of choice for the geometric definition, since it contains simple correspondence analysis as an exact special case, which is not the situation of the standard generalizations. We also clarify the issue of supplementary point representation and the properties of joint correspondence analysis, a method that visualizes all two-way relationships between the variables. The methodology is illustrated using data on attitudes to science from the International Social Survey Program on Environment in 1993.
    Keywords: Correspondence analysis, eigendecomposition, joint correspondence analysis, multivariate categorical data, questionnaire data, singular value decomposition
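
The abstract does not give the adjustment formula. The sketch below uses a commonly cited form, assumed here rather than quoted from the paper: for Q variables and a principal inertia λ of the Burt matrix, the adjusted value is (Q/(Q-1))² (√λ - 1/Q)², kept only when √λ exceeds 1/Q. The function name `adjusted_principal_inertias` and the numbers are hypothetical.

```python
from math import sqrt


def adjusted_principal_inertias(burt_inertias, n_variables):
    """Adjust principal inertias of a Burt matrix (assumed form of the adjustment).

    Only axes whose singular value sqrt(lambda) exceeds 1/Q are retained.
    """
    q = n_variables
    threshold = 1.0 / q
    scale = (q / (q - 1.0)) ** 2
    return [
        scale * (sqrt(lam) - threshold) ** 2
        for lam in burt_inertias
        if sqrt(lam) > threshold
    ]


# Two-variable check: if simple correspondence analysis has singular value rho,
# the Burt matrix has principal inertia ((1 + rho) / 2) ** 2.
rho = 0.3
burt_inertia = ((1 + rho) / 2) ** 2
print(adjusted_principal_inertias([burt_inertia], n_variables=2))
```

In the two-variable case the adjusted value equals rho squared, the principal inertia of simple correspondence analysis, which matches the abstract's claim that the adjusted version contains simple correspondence analysis as an exact special case.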