Natural language processing (NLP) powers rich mobile applications. To support
various language understanding tasks, a foundation NLP model is often
fine-tuned in a federated learning (FL) setting that preserves data privacy. This process
currently relies on at least hundreds of thousands of labeled training samples
from mobile clients; yet mobile users often lack the willingness or knowledge to
label their data. Such a shortage of data labels is known as a few-shot
scenario, and it is the key blocker for mobile NLP applications.
For the first time, this work investigates federated NLP in the few-shot
scenario (FedFSL). By retrofitting algorithmic advances in pseudo labeling and
prompt learning, we first establish a training pipeline that delivers
competitive accuracy when only 0.05% (fewer than 100) of the training samples are
labeled and the rest are unlabeled. To instantiate the workflow, we further
present a system, FFNLP, that addresses the high execution cost with three novel designs:
(1) curriculum pacing, which injects pseudo labels into the training workflow at
a rate commensurate with the learning progress; (2) representational diversity, a
mechanism for selecting the most learnable data, for which alone pseudo labels
are generated; and (3) co-planning of a model's training depth and layer
capacity. Together, these designs reduce the training delay, client energy, and
network traffic by up to 46.0×, 41.2×, and 3000.0×,
respectively. Through algorithm/system co-design, FFNLP demonstrates that FL
can be applied to challenging settings where most training samples are unlabeled.
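The abstract only names these mechanisms; as one illustrative reading, the Python sketch below shows how curriculum pacing and learnable-data selection might fit together: pseudo labels are admitted at a rate tied to a proxy for learning progress, and only the unlabeled samples the model is most confident about are kept. The function names, the pacing heuristic, and the confidence-based selection are assumptions for illustration, not FFNLP's actual design.

```python
# Hypothetical sketch of curriculum pacing for pseudo-label injection.
# Names and heuristics are illustrative, not the paper's actual design.
import random
from typing import List, Tuple

def pacing_budget(round_idx: int, total_rounds: int, pool_size: int,
                  progress: float) -> int:
    """Number of pseudo labels to admit this round.

    `progress` is a proxy for learning progress in [0, 1], e.g. the
    normalized drop in validation loss since the previous round; faster
    progress lets more pseudo labels in, stalled progress slows injection.
    """
    base = pool_size * (round_idx + 1) / total_rounds   # linear schedule
    return int(base * (0.5 + 0.5 * progress))            # paced by progress

def select_learnable(unlabeled: List[str], confidences: List[float],
                     budget: int) -> List[Tuple[str, float]]:
    """Pick the `budget` unlabeled samples the model is most confident on.

    A stand-in for "representational diversity": a real selector would also
    spread picks across the representation space, not just rank by confidence.
    """
    ranked = sorted(zip(unlabeled, confidences), key=lambda x: -x[1])
    return ranked[:budget]

if __name__ == "__main__":
    random.seed(0)
    pool = [f"sample_{i}" for i in range(1000)]
    conf = [random.random() for _ in pool]          # fake model confidences
    for rnd in range(3):
        progress = 1.0 / (rnd + 1)                  # fake progress signal
        k = pacing_budget(rnd, total_rounds=10,
                          pool_size=len(pool), progress=progress)
        chosen = select_learnable(pool, conf, k)
        print(f"round {rnd}: injecting {len(chosen)} pseudo-labeled samples")
```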