4 research outputs found
Deep Visual-Genetic Biometrics for Taxonomic Classification of Rare Species
Visual and genetic biometrics are routinely employed to identify species and individuals in biological applications. However, no attempts have yet been made in this domain to computationally enhance the visual classification of rare classes, for which little image data exists, via genetics. In this paper, we therefore propose aligned visual-genetic learning as a new application domain, with the aim of implicitly encoding cross-modality associations for improved performance. We demonstrate for the first time that such alignment can be achieved via deep embedding models and that the approach is directly applicable to boosting long-tailed recognition (LTR), particularly for rare species. We experimentally demonstrate the efficacy of the concept by applying it to microscopic imagery of 30k+ planktic foraminifer shells across 32 species, used together with independent genetic data samples. Most importantly for practitioners, we show that visual-genetic alignment can significantly benefit visual-only recognition of the rarest species. Technically, we pre-train a visual ResNet50 deep learning model using triplet loss formulations to create an initial embedding space. We then re-structure this space based on genetic anchors embedded via a Sequence Graph Transform (SGT) and linked to visual data by cross-domain cosine alignment. We show that an LTR approach improves on the state of the art across all benchmarks, and that adding our visual-genetic alignment improves per-class benchmarks, and particularly rare tail-class benchmarks, significantly further. Overall, visual-genetic LTR training raises rare per-class accuracy from 37.4% to a benchmark-beating 59.7%. We conclude that visual-genetic alignment can be a highly effective tool for complementing visual biological data containing rare classes. The concept proposed may serve as an important future tool for integrating genetics and imageomics towards a more complete scientific representation of taxonomic spaces and life itself.
Code, weights, and data splits are published for full reproducibility.
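The training objective described above can be sketched as follows. This is an illustrative NumPy re-implementation under our own assumptions (embedding dimensionality, margin value, loss weighting), not the authors' exact formulation:

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Project embeddings onto the unit hypersphere."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet formulation: pull same-species pairs together,
    push different-species pairs apart by at least `margin`.
    (Margin value is an illustrative assumption.)"""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(d_pos - d_neg + margin, 0.0).mean()

def cosine_alignment_loss(visual, genetic_anchor):
    """Cross-domain cosine alignment: pull each visual embedding toward
    the fixed genetic anchor embedding of its species."""
    v = l2_normalize(visual)
    g = l2_normalize(genetic_anchor)
    return (1.0 - np.sum(v * g, axis=-1)).mean()

def combined_loss(anchor, positive, negative, genetic_anchor, align_weight=1.0):
    """Total objective: triplet structure plus genetic alignment.
    (`align_weight` is a hypothetical balancing parameter.)"""
    return (triplet_loss(anchor, positive, negative)
            + align_weight * cosine_alignment_loss(anchor, genetic_anchor))
```

In the paper's setting, `genetic_anchor` would be the SGT embedding of the corresponding species' genetic sequence, held fixed while the visual embedding space is re-structured around it.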
Visual Microfossil Identification via Deep Metric Learning
We apply deep metric learning for the first time to the problem of
classifying planktic foraminifer shells on microscopic images. This species
recognition task is an important information source and scientific pillar for
reconstructing past climates. All foraminifer CNN recognition pipelines in the
literature produce black-box classifiers that lack visualization options for
human experts and cannot be applied to open-set problems. Here, we benchmark
metric learning against these pipelines, produce the first scientific
visualization of the phenotypic planktic foraminifer morphology space, and
demonstrate that metric learning can be used to cluster species unseen during
training. We show that metric learning outperforms all published CNN-based
state-of-the-art benchmarks in this domain. We evaluate our approach on the
34,640 expert-annotated images of the Endless Forams public library of 35
modern planktic foraminifera species. Our results on this data show leading 92%
accuracy (at 0.84 F1-score) in reproducing expert labels on withheld test data,
and 66.5% accuracy (at 0.70 F1-score) when clustering species never encountered
in training. We conclude that metric learning is highly effective for this
domain and serves as an important tool towards expert-in-the-loop automation of
microfossil identification. Keycode, network weights, and data splits are
published with this paper for full reproducibility
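The unseen-species clustering step above can be sketched with a minimal k-means over the learned embeddings. This is a simplified, self-contained stand-in (plain NumPy, random centroid initialisation) rather than the evaluation pipeline used in the paper:

```python
import numpy as np

def kmeans(embeddings, k, iters=50, seed=0):
    """Minimal k-means for grouping metric-learned embeddings of species
    never seen during training. `embeddings` is an (n, d) float array;
    returns an (n,) array of cluster labels in [0, k)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = embeddings[rng.choice(len(embeddings), size=k, replace=False)]
    for _ in range(iters):
        # Assign each embedding to its nearest centroid (Euclidean).
        dists = np.linalg.norm(embeddings[:, None] - centroids[None], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old centroid for empty clusters.
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = embeddings[labels == j].mean(axis=0)
    return labels
```

In practice, cluster labels would then be matched against held-out expert annotations (e.g. via majority vote per cluster) to compute the clustering accuracy and F1-score reported in the abstract.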