Many scientific domains gather sufficient labels to train machine learning algorithms
through human-in-the-loop techniques provided by the Zooniverse.org citizen
science platform. As the range of projects, task types, and data rates increases,
accelerating model training is of paramount concern so that volunteer effort can
be focused where it is most needed. The application of Transfer Learning (TL) between
Zooniverse projects holds promise as a solution. However, understanding the
effectiveness of TL approaches that pretrain on large-scale generic image sets
versus images with similar characteristics, possibly from similar tasks, is an open
challenge. We apply a generative segmentation model to two Zooniverse
project-based data sets: (1) identifying fat droplets in liver cells
(FatChecker; FC) and (2) identifying kelp beds in satellite images
(Floating Forests; FF) through transfer learning from the first project. We
compare and contrast its performance with a TL model based on the COCO image
set, and subsequently with baseline counterparts. We find that both the FC and
COCO TL models perform better than the baseline cases when using >75% of the
original training sample size. The COCO-based TL model generally performs
better than the FC-based one, likely due to its generalized features. Our
investigations provide important insights into the use of TL approaches on
multi-domain data hosted across different Zooniverse projects, enabling future
projects to accelerate task completion.

Comment: 5 pages, 4 figures, accepted for publication at the Proceedings of
the ACM/CIKM 2022 (Human-in-the-loop Data Curation Workshop)