From fat droplets to floating forests: cross-domain transfer learning
  using a PatchGAN-based segmentation model

Byrnes, Jarrett E. K.; Cavanaugh, Kyle; Christensen, Trace; Fortson, Lucy; Houskeeper, Henry; Mantha, Kameswara Bharadwaj; Mashek, Douglas; Pengo, Thomas; Rosenthal, Isaac; Salisbury, Jeffrey; Sanders, Mark; Sankar, Ramanakumar; Trouille, Laura; Zheng, Yuping

Publication date: 7 November 2022

Abstract

Many scientific domains gather sufficient labels to train machine algorithms through human-in-the-loop techniques provided by the Zooniverse.org citizen science platform. As the range of projects, task types and data rates increases, accelerating model training is of paramount concern so that volunteer effort can be focused where it is most needed. The application of Transfer Learning (TL) between Zooniverse projects holds promise as a solution. However, understanding the effectiveness of TL approaches that pretrain on large-scale generic image sets vs. images with similar characteristics, possibly from similar tasks, is an open challenge. We apply a generative segmentation model to two Zooniverse project-based data sets: (1) identifying fat droplets in liver cells (FatChecker; FC) and (2) identifying kelp beds in satellite images (Floating Forests; FF) through transfer learning from the first project. We compare and contrast its performance with a TL model based on the COCO image set, and subsequently with baseline counterparts. We find that both the FC and COCO TL models perform better than the baseline cases when using >75% of the original training sample size. The COCO-based TL model generally performs better than the FC-based one, likely due to its generalized features. Our investigations provide important insights into the usage of TL approaches on multi-domain data hosted across different Zooniverse projects, enabling future projects to accelerate task completion.

Comment: 5 pages, 4 figures, accepted for publication at the Proceedings of the ACM/CIKM 2022 (Human-in-the-loop Data Curation Workshop)

Available Versions

arXiv.org e-Print Archive

oai:arXiv.org:2211.03937

Last time updated on 12/12/2022