    Compositional data analysis of geological variability and process : a case study

    Developments in the statistical analysis of compositional data over the last two decades have made possible a much deeper exploration of the nature of variability and the possible processes associated with compositional data sets from many disciplines. In this paper, we concentrate on geochemical data. First, we explain how hypotheses of compositional variability may be formulated within the natural sample space, the unit simplex, including useful hypotheses of sub-compositional discrimination and specific perturbational change. Then we develop through standard methodology, such as generalised likelihood ratio tests, statistical tools to allow the systematic investigation of a lattice of such hypotheses. Some of these tests are simple adaptations of existing multivariate tests but others require special construction. We comment on the use of graphical methods in compositional data analysis and on the ordination of specimens. The recent development of the concept of compositional processes is then explained, together with the necessary tools for a staying-in-the-simplex approach, such as the singular value decomposition of a compositional data set. All these statistical techniques are illustrated for a substantial compositional data set, consisting of 209 major oxide and trace element compositions of metamorphosed limestones from the Grampian Highlands of Scotland. Finally, we discuss some unresolved problems in the statistical analysis of compositional processes.
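
    The perturbational change mentioned in this abstract can be illustrated with a minimal sketch, assuming the usual definition of perturbation on the unit simplex (multiply part by part, then rescale to sum to one). The function names and the example values are hypothetical and are not taken from the paper.

```python
def closure(parts):
    """Rescale positive parts so that they sum to 1 (a point in the unit simplex)."""
    total = sum(parts)
    return [p / total for p in parts]


def perturb(x, p):
    """Perturbation: multiply part by part, then re-close onto the simplex."""
    return closure([xi * pi for xi, pi in zip(x, p)])


def inverse(p):
    """Inverse perturbation: the closed vector of reciprocals."""
    return closure([1.0 / pi for pi in p])


# Hypothetical three-part composition and perturbation vector.
x = closure([60.0, 30.0, 10.0])
p = [1.0, 2.0, 1.0]

y = perturb(x, p)                # the second part is doubled relative to the others
x_back = perturb(y, inverse(p))  # undoing the perturbation recovers x

print(y)
print(x_back)
```

    The perturbed composition stays in the simplex, which is the sense in which such operations "stay in the simplex".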

    Means and covariance functions for geostatistical compositional data: an axiomatic approach

    This work focuses on the characterization of the central tendency of a sample of compositional data. It provides new results about theoretical properties of means and covariance functions for compositional data, with an axiomatic perspective. Original results that shed new light on the geostatistical modeling of compositional data are presented. As a first result, it is shown that the weighted arithmetic mean is the only central tendency characteristic satisfying a small set of axioms, namely continuity, reflexivity and marginal stability. Moreover, this set of axioms also implies that the weights must be identical for all parts of the composition. This result has deep consequences for the spatial multivariate covariance modeling of compositional data. In a geostatistical setting, it is shown as a second result that the proportional model of covariance functions (i.e., the product of a covariance matrix and a single correlation function) is the only model that provides identical kriging weights for all components of the compositional data. As a consequence of these two results, the proportional model of covariance function is the only covariance model compatible with reflexivity and marginal stability.

    Thurstonian Scaling of Compositional Questionnaire Data

    To prevent response biases, personality questionnaires may use comparative response formats. These include forced choice, where respondents choose among a number of items, and quantitative comparisons, where respondents indicate the extent to which items are preferred to each other. The present article extends Thurstonian modeling of binary choice data (Brown & Maydeu-Olivares, 2011a) to “proportion-of-total” (compositional) formats. Following Aitchison (1982), compositional item data are transformed into log-ratios, conceptualized as differences of latent item utilities. The mean and covariance structure of the log-ratios is modelled using Confirmatory Factor Analysis (CFA), where the item utilities are first-order factors, and personal attributes measured by a questionnaire are second-order factors. A simulation study with two sample sizes, N=300 and N=1000, shows that the method provides very good recovery of true parameters and near-nominal rejection rates. The approach is illustrated with empirical data from N=317 students, comparing model parameters obtained with compositional and Likert scale versions of a Big Five measure. The results show that the proposed model successfully captures the latent structures and person scores on the measured traits.
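
    The log-ratio step described in this abstract can be sketched as follows. This is only an illustration: it assumes, for the sake of the example, that each proportion-of-total response is proportional to the exponential of a latent utility, and it takes the last item of the block as the reference. The function name `alr` and all numbers are hypothetical; the paper's actual model is fitted with CFA and is not reproduced here.

```python
import math


def alr(points):
    """Log-ratio of every item against the last item in the block."""
    reference = points[-1]
    return [math.log(p / reference) for p in points[:-1]]


# Hypothetical latent utilities for a block of three items.
utilities = [0.8, 0.1, -0.4]

# Assumption for illustration: points out of 100 are proportional to exp(utility).
weights = [math.exp(u) for u in utilities]
responses = [100.0 * w / sum(weights) for w in weights]

# Under that assumption each log-ratio equals a difference of utilities.
ratios = alr(responses)
differences = [u - utilities[-1] for u in utilities[:-1]]

print(ratios)
print(differences)
```

    The normalising total cancels inside each ratio, so the log-ratios depend only on utility differences.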

    Improved classification for compositional data using the α-transformation

    In compositional data analysis an observation is a vector containing non-negative values, only the relative sizes of which are considered to be of interest. Without loss of generality, a compositional vector can be taken to be a vector of proportions that sum to one. Data of this type arise in many areas including geology, archaeology, biology, economics and political science. In this paper we investigate methods for classification of compositional data. Our approach centres on the idea of using the α-transformation to transform the data and then to classify the transformed data via regularised discriminant analysis and the k-nearest neighbours algorithm. Using the α-transformation generalises two rival approaches in compositional data analysis, one (when α=1) that treats the data as though they were Euclidean, ignoring the compositional constraint, and another (when α=0) that employs Aitchison's centred log-ratio transformation. A numerical study with several real datasets shows that whether using α=1 or α=0 gives better classification performance depends on the dataset, and moreover that using an intermediate value of α can sometimes give better performance than using either 1 or 0. Comment: This is a 17-page preprint and has been accepted for publication at the Journal of Classification.
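
    To make the two limiting cases concrete, here is a minimal sketch of a power transformation of this kind. It assumes the form (D·u − 1)/α, where u is the closed vector of parts raised to the power α and D is the number of parts; the published definition may additionally apply an orthonormal projection, which is omitted here. The function names and the example composition are hypothetical.

```python
import math


def clr(x):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(xi) for xi in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def alpha_transform(x, alpha):
    """Power transformation sketch: (D * u - 1) / alpha, with u the closed x**alpha."""
    d = len(x)
    if alpha == 0:
        # Limiting case: the centred log-ratio transformation.
        return clr(x)
    powered = [xi ** alpha for xi in x]
    total = sum(powered)
    return [(d * p / total - 1.0) / alpha for p in powered]


x = [0.6, 0.3, 0.1]  # hypothetical composition

z1 = alpha_transform(x, 1.0)       # linear in x: the data are treated as Euclidean
z_small = alpha_transform(x, 1e-6) # close to the log-ratio case
z0 = alpha_transform(x, 0)         # centred log-ratio

print(z1)
print(z_small)
print(z0)
```

    With α=1 the result is just D·x − 1, a linear rescaling of the raw proportions, while values of α near 0 approach the centred log-ratio, so intermediate values interpolate between the two approaches.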

    The k-NN algorithm for compositional data: a revised approach with and without zero values present

    In compositional data, an observation is a vector with non-negative components which sum to a constant, typically 1. Data of this type arise in many areas, such as geology, archaeology, biology, economics and political science. The goal of this paper is to extend the taxicab metric and a newly suggested metric for compositional data by employing a power transformation. Both metrics are to be used in the k-nearest neighbours algorithm regardless of the presence of zeros. Examples with real data are exhibited. Comment: This manuscript will appear at the. http://www.jds-online.com/volume-12-number-3-july-201
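
    The idea of combining a power transformation with the taxicab metric inside k-nearest neighbours can be sketched as below. This is an assumption-laden illustration, not the paper's exact metric: each composition is raised to a power α > 0 and re-closed, then compared with the plain taxicab (L1) distance. Because no logarithm is taken, zero parts cause no difficulty. All names and data are hypothetical.

```python
from collections import Counter


def power_closure(x, alpha):
    """Raise each part to the power alpha (alpha > 0) and rescale to sum to 1."""
    powered = [xi ** alpha for xi in x]
    total = sum(powered)
    return [p / total for p in powered]


def taxicab(x, y, alpha):
    """Taxicab (L1) distance between two power-transformed compositions."""
    u = power_closure(x, alpha)
    v = power_closure(y, alpha)
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_predict(train, labels, query, k=3, alpha=0.5):
    """Majority vote among the k training compositions closest to the query."""
    order = sorted(range(len(train)), key=lambda i: taxicab(train[i], query, alpha))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data; several compositions contain zero parts.
train = [
    [0.7, 0.3, 0.0], [0.6, 0.3, 0.1], [0.8, 0.2, 0.0],
    [0.1, 0.2, 0.7], [0.0, 0.3, 0.7], [0.2, 0.2, 0.6],
]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train, labels, [0.65, 0.35, 0.0]))
print(knn_predict(train, labels, [0.05, 0.25, 0.7]))
```

    In practice the values of k and α would be chosen by cross-validation rather than fixed in advance.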

    Mapping Vesta: First Results from Dawn’s Survey Orbit

    The geologic objectives of the Dawn Mission [1] are to derive Vesta’s shape, map the surface geology, understand the geological context and contribute to the determination of the asteroid’s origin and evolution. Geomorphology and distribution of surface features will provide evidence for impact cratering, tectonic activity, volcanism, and regolith processes. Spectral measurements of the surface will provide evidence of the compositional characteristics of geological units. Age information, as derived from crater size-frequency distributions, provides the stratigraphic context for the structural and compositional mapping results, thus revealing the geologic history of Vesta. We present here the first results of the Dawn mission from data collected during the approach to Vesta, and its first discrete orbit phase, the Survey Orbit, which lasts 21 days after the spacecraft has established a circular polar orbit at a radius of ~3000 km with a beta angle of 10°-15°.

    A Graph Theoretic Approach for Object Shape Representation in Compositional Hierarchies Using a Hybrid Generative-Descriptive Model

    A graph theoretic approach is proposed for object shape representation in a hierarchical compositional architecture called Compositional Hierarchy of Parts (CHOP). In the proposed approach, vocabulary learning is performed using a hybrid generative-descriptive model. First, statistical relationships between parts are learned using a Minimum Conditional Entropy Clustering algorithm. Then, selection of descriptive parts is defined as a frequent subgraph discovery problem, and solved using a Minimum Description Length (MDL) principle. Finally, part compositions are constructed by compressing the internal data representation with discovered substructures. Shape representation and computational complexity properties of the proposed approach and algorithms are examined using six benchmark two-dimensional shape image datasets. Experiments show that CHOP can employ part shareability and indexing mechanisms for fast inference of part compositions using learned shape vocabularies. Additionally, CHOP provides better shape retrieval performance than the state-of-the-art shape retrieval methods. Comment: Paper: 17 pages. 13th European Conference on Computer Vision (ECCV 2014), Zurich, Switzerland, September 6-12, 2014, Proceedings, Part III, pp 566-581. Supplementary material can be downloaded from http://link.springer.com/content/esm/chp:10.1007/978-3-319-10578-9_37/file/MediaObjects/978-3-319-10578-9_37_MOESM1_ESM.pd

    Compositional descriptor-based recommender system accelerating the materials discovery

    Structures and properties of many inorganic compounds have been collected historically. However, these collections cover only a very small portion of possible inorganic crystals, which implies the presence of numerous currently unknown compounds. A powerful machine-learning strategy is needed to discover new inorganic compounds from all chemical combinations. Herein we propose a descriptor-based recommender-system approach to estimate the relevance of chemical compositions where stable crystals can be formed [i.e., chemically relevant compositions (CRCs)]. In addition to the data-driven compositional similarity used in the literature, the use of compositional descriptors as prior knowledge can accelerate the discovery of new compounds. We validate our recommender systems in two ways. Firstly, one database is used to construct a model, while another is used for the validation. Secondly, we estimate the phase stability for compounds at expected CRCs using density functional theory calculations. Comment: 8 pages, 7 figures