
Making geological sense of 'Big Data' in sedimentary provenance

Abstract

Sedimentary provenance studies increasingly apply multiple chemical, mineralogical and isotopic proxies to many samples. The resulting datasets are often so large (containing thousands of numerical values) and complex (comprising multiple dimensions) that it is warranted to use the Internet-era term ‘Big Data’ to describe them. This paper introduces Multidimensional Scaling (MDS), Generalised Procrustes Analysis (GPA) and Individual Differences Scaling (INDSCAL, a type of ‘3-way MDS’ algorithm) as simple yet powerful tools to extract geological insights from ‘Big Data’ in a provenance context. Using a dataset from the Namib Sand Sea as a test case, we show how MDS can be used to visualise the similarities and differences between 16 fluvial and aeolian sand samples for five different provenance proxies, resulting in five different ‘configurations’. These configurations can be fed into a GPA algorithm, which translates, rotates, scales and reflects them to extract a ‘consensus view’ for all the data considered together. Alternatively, the five proxies can be jointly analysed by INDSCAL, which fits the data with not one but two sets of coordinates: the ‘group configuration’, which strongly resembles the graphical output produced by GPA, and the ‘source weights’, which can be used to attach geological meaning to the group configuration. For the Namib study, the three methods paint a detailed and self-consistent picture of a sediment routing system in which sand composition is determined by the combination of provenance and hydraulic sorting effects.
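The GPA step described above (translate, scale, rotate and reflect several configurations, then average them into a consensus) can be illustrated with a minimal sketch. This is not the paper's implementation: it handles two-dimensional configurations only, normalises every configuration to unit size instead of optimising separate scaling factors, and uses invented toy coordinates in place of the Namib data. All function names (`centre_and_scale`, `rotate_onto`, `align`, `gpa`) are hypothetical.

```python
import math


def centre_and_scale(config):
    """Translate a 2-D configuration to its centroid and scale it to unit size."""
    n = len(config)
    cx = sum(x for x, _ in config) / n
    cy = sum(y for _, y in config) / n
    centred = [(x - cx, y - cy) for x, y in config]
    size = math.sqrt(sum(x * x + y * y for x, y in centred))
    return [(x / size, y / size) for x, y in centred]


def misfit(a, b):
    """Sum of squared distances between matching points of two configurations."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b))


def rotate_onto(config, target):
    """Apply the least-squares rotation of config onto target (2-D closed form)."""
    num = sum(x * ty - y * tx for (x, y), (tx, ty) in zip(config, target))
    den = sum(x * tx + y * ty for (x, y), (tx, ty) in zip(config, target))
    theta = math.atan2(num, den)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in config]


def align(config, target):
    """Rotate config onto target, also trying its mirror image (reflection)."""
    mirrored = [(-x, y) for x, y in config]
    candidates = [rotate_onto(config, target), rotate_onto(mirrored, target)]
    return min(candidates, key=lambda cand: misfit(cand, target))


def gpa(configs, iterations=10):
    """Simplified GPA: align every configuration to a running consensus."""
    shapes = [centre_and_scale(c) for c in configs]
    consensus = shapes[0]  # assumption: start from the first configuration
    for _ in range(iterations):
        shapes = [align(s, consensus) for s in shapes]
        k = len(shapes)
        mean = [(sum(s[i][0] for s in shapes) / k,
                 sum(s[i][1] for s in shapes) / k)
                for i in range(len(consensus))]
        consensus = centre_and_scale(mean)
    return consensus, shapes


# Toy data (invented): one four-point configuration and two transformed copies.
base = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0), (1.0, 2.0)]
rotated = [(-2.0 * y + 5.0, 2.0 * x - 1.0) for x, y in base]  # rotate, scale, shift
reflected = [(-x + 4.0, y + 7.0) for x, y in base]            # reflect, shift
configs = [base, rotated, reflected]

consensus, aligned = gpa(configs)
for shape in aligned:
    print(misfit(shape, consensus))
```

Because the three toy configurations differ only by translation, rotation, scaling and reflection, the printed misfits should be essentially zero. With real MDS configurations from different proxies the residual misfit would be non-zero and would indicate how strongly the proxies disagree.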
