
    From correspondence analysis to multiple and joint correspondence analysis

    The generalization of simple (two-variable) correspondence analysis to more than two categorical variables, commonly referred to as multiple correspondence analysis, is neither obvious nor well-defined. We present two alternative ways of generalizing correspondence analysis, one based on the quantification of the variables and intercorrelation relationships, and the other based on the geometric ideas of simple correspondence analysis. We propose a version of multiple correspondence analysis, with adjusted principal inertias, as the method of choice for the geometric definition, since it contains simple correspondence analysis as an exact special case, which is not true of the standard generalizations. We also clarify the issue of supplementary point representation and the properties of joint correspondence analysis, a method that visualizes all two-way relationships between the variables. The methodology is illustrated using data on attitudes to science from the International Social Survey Program on Environment in 1993.
    Keywords: Correspondence analysis, eigendecomposition, joint correspondence analysis, multivariate categorical data, questionnaire data, singular value decomposition

    Analisis korespondensi dua variabel (Correspondence analysis of two variables)

    Correspondence analysis of two variables studies the relationship between two qualitative variables. In correspondence analysis, the qualitative data are presented as a cross-tabulation (contingency table). The analysis then determines the similarity of the variables, as well as the representation and interpretation of each variable.

    Computation of multiple correspondence analysis, with code in R

    The generalization of simple correspondence analysis, for two categorical variables, to multiple correspondence analysis, where there may be three or more variables, is not straightforward from either a mathematical or a computational point of view. In this paper we detail the exact computational steps involved in performing a multiple correspondence analysis, including the special aspects of adjusting the principal inertias to correct the percentages of inertia, supplementary points and subset analysis. Furthermore, we give the algorithm for joint correspondence analysis, where the cross-tabulations of all unique pairs of variables are analysed jointly. The code in the R language for every step of the computations is given, as well as the results of each computation.
    Keywords: Adjustment of principal inertias, Burt matrix, correspondence analysis, multiple correspondence analysis, R language, singular value decomposition, subset analysis
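The adjustment of principal inertias mentioned in this abstract can be illustrated with a small sketch. It is written in Python rather than the paper's R, and it is not the paper's code: the function name `adjusted_inertias` is hypothetical, and the formula used is the commonly cited form of the adjustment, assumed here because the abstract does not state it. For Q variables and a Burt-matrix principal inertia λ, the adjusted value is (Q/(Q-1))² (√λ - 1/Q)², kept only when √λ > 1/Q.

```python
import math

def adjusted_inertias(burt_inertias, n_vars):
    """Adjust principal inertias of a Burt matrix built from n_vars variables.

    Assumed formula (not taken from the abstract):
        (Q / (Q - 1))**2 * (sqrt(lambda) - 1/Q)**2   for sqrt(lambda) > 1/Q
    Dimensions with sqrt(lambda) <= 1/Q are dropped.
    """
    q = n_vars
    threshold = 1.0 / q
    scale = (q / (q - 1.0)) ** 2
    adjusted = []
    for lam in burt_inertias:
        # Square root of a Burt inertia = principal inertia of the indicator matrix
        root = math.sqrt(lam)
        if root > threshold:
            adjusted.append(scale * (root - threshold) ** 2)
    return adjusted

# Two-variable check: if simple CA has singular value sigma, the matching
# Burt-matrix principal inertia is ((1 + sigma) / 2) ** 2.
sigma = 0.3
burt_lambda = ((1 + sigma) / 2) ** 2
two_var = adjusted_inertias([burt_lambda], n_vars=2)
```

With two variables the adjusted value comes back as sigma squared, the principal inertia of the simple correspondence analysis. This is the sense in which an adjusted analysis can contain simple correspondence analysis as a special case.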

    The contributions of rare objects in correspondence analysis

    Correspondence analysis, when used to visualize relationships in a table of counts (for example, abundance data in ecology), has been frequently criticized as being too sensitive to objects (for example, species) that occur with very low frequency or in very few samples. In this statistical report we show that this criticism is generally unfounded. We demonstrate this in several data sets by calculating the actual contributions of rare objects to the results of correspondence analysis and canonical correspondence analysis, both to the determination of the principal axes and to the chi-square distance. It is a fact that rare objects are often positioned as outliers in correspondence analysis maps, which gives the impression that they are highly influential, but their low weight offsets their distant positions and reduces their effect on the results. An alternative scaling of the correspondence analysis solution, the contribution biplot, is proposed as a way of mapping the results in order to avoid the problem of outlying and low contributing rare objects.
    Keywords: Biplot, canonical correspondence analysis, contribution, correspondence analysis, influence, outlier, scaling
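The point about low weight offsetting a distant position can be checked by hand on a toy table. The sketch below is a minimal illustration, not the report's analysis: the counts are invented for this example and the function name `row_contributions` is hypothetical. It uses the standard decomposition in which each row contributes its mass times its squared chi-square distance from the average profile to the total inertia.

```python
def row_contributions(table):
    """Return (mass, squared chi-square distance, inertia contribution) per row."""
    n = float(sum(sum(row) for row in table))
    n_cols = len(table[0])
    # Column masses double as the average row profile (the centroid)
    col_mass = [sum(row[j] for row in table) / n for j in range(n_cols)]
    results = []
    for row in table:
        total = float(sum(row))
        mass = total / n
        profile = [x / total for x in row]
        dist2 = sum((p - c) ** 2 / c for p, c in zip(profile, col_mass))
        results.append((mass, dist2, mass * dist2))
    return results

# Invented counts: three common species and one species seen once (last row)
counts = [
    [60, 20, 20],
    [20, 60, 20],
    [20, 20, 60],
    [0, 0, 1],
]
stats = row_contributions(counts)
total_inertia = sum(contribution for _, _, contribution in stats)

for label, (mass, dist2, contribution) in zip("ABCR", stats):
    print(label, round(mass, 4), round(dist2, 4), round(contribution, 4))
```

In this made-up table the rare row lies farthest from the centroid yet contributes the least inertia, because its mass is tiny. Other tables can behave differently, which is why the abstract describes the criticism as "generally" rather than always unfounded.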

    Canonical correspondence analysis in social science research

    The use of simple and multiple correspondence analysis is well-established in social science research for understanding relationships between two or more categorical variables. By contrast, canonical correspondence analysis, which is a correspondence analysis with linear restrictions on the solution, has become one of the most popular multivariate techniques in ecological research. Multivariate ecological data typically consist of frequencies of observed species across a set of sampling locations, as well as a set of observed environmental variables at the same locations. In this context the principal dimensions of the biological variables are sought in a space that is constrained to be related to the environmental variables. This restricted form of correspondence analysis has many uses in social science research as well, as is demonstrated in this paper. We first illustrate the result that canonical correspondence analysis of an indicator matrix, restricted to be related to an external categorical variable, reduces to a simple correspondence analysis of a set of concatenated (or “stacked”) tables. Then we show how canonical correspondence analysis can be used to focus on, or partial out, a particular set of response categories in sample survey data. For example, the method can be used to partial out the influence of missing responses, which usually dominate the results of a multiple correspondence analysis.
    Keywords: Constraints, correspondence analysis, missing data, multiple correspondence

    Subset correspondence analysis: Visualizing relationships among a selected set of response categories from a questionnaire survey

    It is shown how correspondence analysis may be applied to a subset of response categories from a questionnaire survey, for example the subset of undecided responses or the subset of responses for a particular category. The idea is to maintain the original relative frequencies of the categories and not re-express them relative to totals within the subset, as would normally be done in a regular correspondence analysis of the subset. Furthermore, the masses and chi-square metric assigned to the data subset are the same as those in the correspondence analysis of the whole data set. This variant of the method, called Subset Correspondence Analysis, is illustrated on data from the ISSP survey on Family and Changing Gender Roles.
    Keywords: Categorical data, correspondence analysis, questionnaire survey

    Power transformations in correspondence analysis

    Power transformations of positive data tables, prior to applying the correspondence analysis algorithm, are shown to open up a family of methods with direct connections to the analysis of log-ratios. Two variations of this idea are illustrated. The first approach is simply to power the original data and perform a correspondence analysis – this method is shown to converge to unweighted log-ratio analysis as the power parameter tends to zero. The second approach is to apply the power transformation to the contingency ratios, that is the values in the table relative to expected values based on the marginals – this method converges to weighted log-ratio analysis, or the spectral map. Two applications are described: first, a matrix of population genetic data which is inherently two-dimensional, and second, a larger cross-tabulation with higher dimensionality, from a linguistic analysis of several books.
    Keywords: Box-Cox transformation, chi-square distance, contingency ratio, correspondence analysis, log-ratio analysis, power transformation, ratio data, singular value decomposition, spectral map
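The first approach (powering the data, then applying correspondence analysis) can be probed numerically without a full decomposition by comparing total inertias only. The sketch below is an illustration under stated assumptions, not the paper's procedure: the data table is invented, the function names are hypothetical, and dividing the inertia by alpha squared is this sketch's own rescaling, which the abstract does not state. The reference quantity is the mean squared double-centred log value, used here as the total variance of an unweighted log-ratio analysis.

```python
import math

def total_inertia(table):
    """Total inertia of correspondence analysis: sum of r_i c_j (p_ij/(r_i c_j) - 1)^2."""
    n = float(sum(sum(row) for row in table))
    n_cols = len(table[0])
    r = [sum(row) / n for row in table]
    c = [sum(row[j] for row in table) / n for j in range(n_cols)]
    return sum(
        r[i] * c[j] * ((table[i][j] / n) / (r[i] * c[j]) - 1.0) ** 2
        for i in range(len(table))
        for j in range(n_cols)
    )

def unweighted_logratio_variance(table):
    """Mean squared value of the double-centred log table (equal weights)."""
    logs = [[math.log(x) for x in row] for row in table]
    n_rows, n_cols = len(logs), len(logs[0])
    row_mean = [sum(row) / n_cols for row in logs]
    col_mean = [sum(logs[i][j] for i in range(n_rows)) / n_rows for j in range(n_cols)]
    grand = sum(row_mean) / n_rows
    return sum(
        (logs[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
        for i in range(n_rows)
        for j in range(n_cols)
    ) / (n_rows * n_cols)

def powered_inertia(table, alpha):
    """Inertia of the table raised elementwise to alpha, rescaled by 1/alpha^2 (assumed)."""
    powered = [[x ** alpha for x in row] for row in table]
    return total_inertia(powered) / alpha ** 2

# Invented strictly positive table
data = [[12, 5, 30], [7, 22, 4], [15, 9, 11]]
target = unweighted_logratio_variance(data)
for alpha in (1.0, 0.5, 0.1, 0.001):
    print(alpha, powered_inertia(data, alpha), target)
```

As alpha shrinks, the rescaled inertia should approach the log-ratio variance, since x^alpha is approximately 1 + alpha log x for small alpha. The check covers total inertia only; it says nothing about the individual axes or coordinates.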

    Correspondence analysis of raw data

    Correspondence analysis has found extensive use in ecology, archeology, linguistics and the social sciences as a method for visualizing the patterns of association in a table of frequencies or nonnegative ratio-scale data. Inherent to the method is the expression of the data in each row or each column relative to their respective totals, and it is these sets of relative values (called profiles) that are visualized. This ‘relativization’ of the data makes perfect sense when the margins of the table represent samples from sub-populations of inherently different sizes. But in some ecological applications sampling is performed on equal areas or equal volumes so that the absolute levels of the observed occurrences may be of relevance, in which case relativization may not be required. In this paper we define the correspondence analysis of the raw ‘unrelativized’ data and discuss its properties, comparing this new method to regular correspondence analysis and to a related variant of non-symmetric correspondence analysis.
    Keywords: Abundance data, biplot, Bray-Curtis dissimilarity, profile, size and shape, visualisation

    Multiple Correspondence Analysis & the Multilogit Bilinear Model

    Multiple Correspondence Analysis (MCA) is a dimension reduction method which plays a large role in the analysis of tables with categorical nominal variables such as survey data. Though it is usually motivated and derived using geometric considerations, we prove that it amounts to a single proximal Newton step of a natural bilinear exponential family model for categorical data, the multinomial logit bilinear model. We compare and contrast the behavior of MCA with that of the model on simulations and discuss new insights on the properties of both exploratory multivariate methods and their cognate models. One main conclusion is that we recommend approximating the multilogit model parameters using MCA. Indeed, estimating the parameters of the model is not a trivial task, whereas MCA has the great advantage of being easily solved by singular value decomposition and scalable to large data.