Compositional data analysis of geological variability and process : a case study
Developments in the statistical analysis of compositional data over the last two decades have made possible a much deeper exploration of the nature of variability and the possible processes associated with compositional data sets from many disciplines. In this paper, we concentrate on geochemical data. First, we explain how hypotheses of compositional variability may be formulated within the natural sample space, the unit simplex, including useful hypotheses of sub-compositional discrimination and specific perturbational change. Then we develop through standard methodology, such as generalised likelihood ratio tests, statistical tools to allow the systematic investigation of a lattice of such hypotheses. Some of these tests are simple adaptations of existing multivariate tests but others require special construction. We comment on the use of graphical methods in compositional data analysis and on the ordination of specimens. The recent development of the concept of compositional processes is then explained, together with the necessary tools for a staying-in-the-simplex approach, such as the singular value decomposition of a compositional data set. All these statistical techniques are illustrated for a substantial compositional data set, consisting of 209 major oxide and trace element compositions of metamorphosed limestones from the Grampian Highlands of Scotland. Finally, we discuss some unresolved problems in the statistical analysis of compositional processes
Means and covariance functions for geostatistical compositional data: an axiomatic approach
This work focuses on the characterization of the central tendency of a sample
of compositional data. It provides new results about theoretical properties of
means and covariance functions for compositional data, with an axiomatic
perspective. Original results that shed new light on the geostatistical
modeling of compositional data are presented. As a first result, it is shown
that the weighted arithmetic mean is the only central tendency characteristic
satisfying a small set of axioms, namely continuity, reflexivity and marginal
stability. Moreover, this set of axioms also implies that the weights must be
identical for all parts of the composition. This result has deep consequences
on the spatial multivariate covariance modeling of compositional data. In a
geostatistical setting, it is shown as a second result that the proportional
model of covariance functions (i.e., the product of a covariance matrix and a
single correlation function) is the only model that provides identical kriging
weights for all components of the compositional data. As a consequence of these
two results, the proportional model of covariance function is the only
covariance model compatible with reflexivity and marginal stability
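As a reading aid, the proportional model described in the abstract above can be written out as follows. The symbols are assumptions introduced here for illustration, not notation taken from the paper: $\boldsymbol{\Sigma}$ for the covariance matrix between parts, $\rho$ for the single correlation function, and $h$ for the spatial lag.

```latex
% Proportional covariance model: one covariance matrix times one correlation function.
% Sigma, rho and h are illustrative symbols, not the paper's notation.
\[
  \mathbf{C}(h) = \boldsymbol{\Sigma}\,\rho(h),
  \qquad
  C_{ij}(h) = \sigma_{ij}\,\rho(h) \quad \text{for all parts } i, j .
\]
```

Under this form every part shares the same spatial correlation structure $\rho$, which fits the abstract's statement that the kriging weights are identical for all components.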
Thurstonian Scaling of Compositional Questionnaire Data
To prevent response biases, personality questionnaires may use comparative response formats. These include forced choice, where respondents choose among a number of items, and quantitative comparisons, where respondents indicate the extent to which items are preferred to each other. The present article extends Thurstonian modeling of binary choice data (Brown & Maydeu-Olivares, 2011a) to “proportion-of-total” (compositional) formats. Following Aitchison (1982), compositional item data are transformed into log-ratios, conceptualized as differences of latent item utilities. The mean and covariance structure of the log-ratios is modelled using Confirmatory Factor Analysis (CFA), where the item utilities are first-order factors, and personal attributes measured by a questionnaire are second-order factors. A simulation study with two sample sizes, N=300 and N=1000, shows that the method provides very good recovery of true parameters and near-nominal rejection rates. The approach is illustrated with empirical data from N=317 students, comparing model parameters obtained with compositional and Likert scale versions of a Big Five measure. The results show that the proposed model successfully captures the latent structures and person scores on the measured traits
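The log-ratio step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes strictly positive "proportion-of-total" responses and uses pairwise log-ratios; the function name `pairwise_log_ratios` and the example point allocation are hypothetical, and the article may use a different log-ratio variant.

```python
import math

def pairwise_log_ratios(shares):
    """Return log(x_i / x_j) for every item pair i < j.

    Assumes every share is strictly positive (zeros would need
    separate handling, which this sketch does not attempt).
    """
    total = sum(shares)
    props = [s / total for s in shares]  # close the responses to proportions
    return {
        (i, j): math.log(props[i] / props[j])
        for i in range(len(props))
        for j in range(i + 1, len(props))
    }

# Hypothetical respondent distributing 100 points over four items.
ratios = pairwise_log_ratios([40, 30, 20, 10])
```

Each value can be read as a difference between two item utilities on a log scale, and the result does not change if all shares are rescaled by the same constant.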
Improved classification for compositional data using the α-transformation
In compositional data analysis an observation is a vector containing
non-negative values, only the relative sizes of which are considered to be of
interest. Without loss of generality, a compositional vector can be taken to be
a vector of proportions that sum to one. Data of this type arise in many areas
including geology, archaeology, biology, economics and political science. In
this paper we investigate methods for classification of compositional data. Our
approach centres on the idea of using the α-transformation to transform
the data and then to classify the transformed data via regularised discriminant
analysis and the k-nearest neighbours algorithm. Using the
α-transformation generalises two rival approaches in compositional data
analysis, one (when α = 1) that treats the data as though they were
Euclidean, ignoring the compositional constraint, and another (when α = 0)
that employs Aitchison's centred log-ratio transformation. A numerical study
with several real datasets shows that whether using α = 1 or α = 0
gives better classification performance depends on the dataset, and moreover
that using an intermediate value of α can sometimes give better
performance than using either 1 or 0.
Comment: This is a 17-page preprint and has been accepted for publication at
the Journal of Classification
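The two limiting cases named in the abstract above can be checked with a small sketch of a power transformation. This is an assumed, simplified form: it scales the powered and re-closed composition as (D·u − 1)/α and takes the centred log-ratio at α = 0. The published definition may include an additional rotation (for example by a Helmert sub-matrix), which is omitted here; the function names are hypothetical.

```python
import math

def clr(x):
    """Centred log-ratio: log of each part minus the mean log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [l - mean_log for l in logs]

def alpha_transform(x, alpha):
    """Simplified power transformation of a composition x (all parts > 0).

    alpha = 1 gives a linear rescaling of x (data treated as Euclidean);
    alpha = 0 is defined as the limit, which is the centred log-ratio.
    """
    if alpha == 0:
        return clr(x)
    d = len(x)
    powered = [v ** alpha for v in x]
    total = sum(powered)
    # Re-close the powered parts, then centre and scale by 1/alpha.
    return [(d * p / total - 1.0) / alpha for p in powered]

x = [0.5, 0.3, 0.2]
linear = alpha_transform(x, 1.0)       # equals D * x_i - 1 for each part
near_zero = alpha_transform(x, 1e-6)   # numerically close to clr(x)
```

Intermediate values of α interpolate between these two treatments, which is why the value could be chosen per dataset, for instance by cross-validation.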
The k-NN algorithm for compositional data: a revised approach with and without zero values present
In compositional data, an observation is a vector with non-negative
components which sum to a constant, typically 1. Data of this type arise in
many areas, such as geology, archaeology, biology, economics and political
science among others. The goal of this paper is to extend the taxicab metric
and a newly suggested metric for compositional data by employing a power
transformation. Both metrics are to be used in the k-nearest neighbours
algorithm regardless of the presence of zeros. Examples with real data are
exhibited.
Comment: This manuscript will appear at the
http://www.jds-online.com/volume-12-number-3-july-201
Mapping Vesta: First Results from Dawn’s Survey Orbit
The geologic objectives of the Dawn Mission [1] are
to derive Vesta’s shape, map the surface geology,
understand the geological context and contribute to
the determination of the asteroid’s origin and
evolution. Geomorphology and distribution of surface features
will provide evidence for impact cratering, tectonic activity, volcanism, and regolith processes. Spectral
measurements of the surface will provide evidence of
the compositional characteristics of geological units.
Age information, as derived from crater size-frequency
distributions, provides the stratigraphic
context for the structural and compositional mapping
results, thus revealing the geologic history of Vesta.
We present here the first results of the Dawn mission
from data collected during the approach to Vesta, and
its first discrete orbit phase – the Survey Orbit, which
lasts 21 days after the spacecraft had established a
circular polar orbit at a radius of ~3000 km with a
beta angle of 10°-15°
A Graph Theoretic Approach for Object Shape Representation in Compositional Hierarchies Using a Hybrid Generative-Descriptive Model
A graph theoretic approach is proposed for object shape representation in a
hierarchical compositional architecture called Compositional Hierarchy of Parts
(CHOP). In the proposed approach, vocabulary learning is performed using a
hybrid generative-descriptive model. First, statistical relationships between
parts are learned using a Minimum Conditional Entropy Clustering algorithm.
Then, selection of descriptive parts is defined as a frequent subgraph
discovery problem, and solved using a Minimum Description Length (MDL)
principle. Finally, part compositions are constructed by compressing the
internal data representation with discovered substructures. Shape
representation and computational complexity properties of the proposed approach
and algorithms are examined using six benchmark two-dimensional shape image
datasets. Experiments show that CHOP can employ part shareability and indexing
mechanisms for fast inference of part compositions using learned shape
vocabularies. Additionally, CHOP provides better shape retrieval performance
than the state-of-the-art shape retrieval methods.
Comment: Paper: 17 pages. 13th European Conference on Computer Vision (ECCV
2014), Zurich, Switzerland, September 6-12, 2014, Proceedings, Part III, pp
566-581. Supplementary material can be downloaded from
http://link.springer.com/content/esm/chp:10.1007/978-3-319-10578-9_37/file/MediaObjects/978-3-319-10578-9_37_MOESM1_ESM.pd
Compositional descriptor-based recommender system accelerating the materials discovery
Structures and properties of many inorganic compounds have been collected
historically. However, these data cover only a very small portion of possible
inorganic crystals, which implies the presence of numerous currently unknown
compounds. A powerful machine-learning strategy is mandatory to discover new
inorganic compounds from all chemical combinations. Herein we propose a
descriptor-based recommender-system approach to estimate the relevance of
chemical compositions where stable crystals can be formed [i.e., chemically
relevant compositions (CRCs)]. In addition to the data-driven compositional
similarity used in the literature, the use of compositional descriptors as prior
knowledge can accelerate the discovery of new compounds. We validate our
recommender systems in two ways. Firstly, one database is used to construct a
model, while another is used for the validation. Secondly, we estimate the
phase stability for compounds at expected CRCs using density functional theory
calculations.
Comment: 8 pages, 7 figures
