Sample data processing in an additive and reproducible taxonomic workflow by
using character data persistently linked to preserved individual specimens

We present the model and implementation of a workflow that blazes a trail in
systematic biology for the re-usability of character data (data on any kind of
characters of pheno- and genotypes of organisms) and their additivity from
specimen to taxon level. We take into account that any taxon characterization
is based on a limited set of sampled individuals and characters, and that
consequently any new individual and any new character may affect the
recognition of biological entities and/or the subsequent delimitation and
characterization of a taxon. Taxon concepts thus frequently change during the
knowledge generation process in systematic biology. Structured character data
are therefore not only needed for the knowledge generation process but also
for easily adapting characterizations of taxa. We aim to facilitate the
construction and reproducibility of taxon characterizations from structured
character data of changing sample sets by establishing a stable and
unambiguous association between each sampled individual and the data processed
from it. Our workflow implementation uses the European Distributed Institute
of Taxonomy Platform, a comprehensive taxonomic data management and
publication environment, to: (i) establish a reproducible connection between
sampled individuals and all samples derived from them; (ii) stably link
sample-based character data with the metadata of the respective samples; (iii)
record and store structured specimen-based character data in formats allowing
data exchange; (iv) reversibly assign sample metadata and character datasets
to taxa in an editable classification and display them; and (v) organize data
exchange via standard exchange formats and enable the link between the
character datasets and samples in research collections, ensuring high
visibility and instant re-usability of the data. The workflow implemented will
contribute to organizing the interface between phylogenetic analysis and
revisionary taxonomic or monographic work.
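
The additivity from specimen to taxon level described above can be illustrated with a minimal sketch. It is not the data model of the platform itself: the class `Specimen`, the functions `characterize_taxon` and `specimens_of`, and all identifiers and character states are hypothetical and chosen only for illustration. The sketch assumes that character data stay attached to the specimen, that taxon assignment is stored separately, and that a taxon characterization is recomputed from whichever specimens are currently assigned.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Specimen:
    """A preserved individual with a stable identifier (hypothetical model)."""
    specimen_id: str
    # Character name -> set of states observed on this individual.
    characters: Dict[str, Set[str]] = field(default_factory=dict)
    # Identifiers of samples derived from this individual (e.g. a DNA extract).
    derived_samples: List[str] = field(default_factory=list)


def characterize_taxon(specimens: List[Specimen]) -> Dict[str, Set[str]]:
    """Aggregate specimen-level character states into a taxon-level summary."""
    summary: Dict[str, Set[str]] = {}
    for specimen in specimens:
        for character, states in specimen.characters.items():
            # A fresh set is created per character, so specimen data are not mutated.
            summary.setdefault(character, set()).update(states)
    return summary


def specimens_of(taxon: str, assignments: Dict[str, str],
                 collection: Dict[str, Specimen]) -> List[Specimen]:
    """Return the specimens currently assigned to the given taxon."""
    return [collection[sid] for sid, name in assignments.items() if name == taxon]


# Illustrative data only: three specimens with specimen-based character data.
collection = {
    "S1": Specimen("S1", {"leaf margin": {"entire"}}, ["S1-DNA"]),
    "S2": Specimen("S2", {"leaf margin": {"serrate"}, "petal colour": {"white"}}),
    "S3": Specimen("S3", {"petal colour": {"yellow"}}),
}

# Taxon assignment lives outside the specimen records, so it can be revised.
assignments = {"S1": "Taxon A", "S2": "Taxon A", "S3": "Taxon B"}

before = characterize_taxon(specimens_of("Taxon A", assignments, collection))

# A changed taxon concept: S3 is now placed in Taxon A.
assignments["S3"] = "Taxon A"
after = characterize_taxon(specimens_of("Taxon A", assignments, collection))
```

Because the assignment is kept apart from the specimen records in this sketch, moving `S3` changes the characterization of "Taxon A" without touching the data recorded for `S3`, and moving it back restores the earlier characterization.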