Achieving Lightweight Federated Advertising with Self-Supervised Split Distillation
As an emerging secure learning paradigm for leveraging cross-agency private
data, vertical federated learning (VFL) is expected to improve advertising
models by enabling the joint learning of complementary user attributes
privately owned by the advertiser and the publisher. However, there are two key
challenges in applying it to advertising systems: a) the limited scale of
labeled overlapping samples, and b) the high cost of real-time cross-agency
serving.
In this paper, we propose a semi-supervised split distillation framework
VFed-SSD to alleviate the two limitations. We identify that: i) massive
unlabeled overlapping data are available in advertising systems, and ii) we
can keep a balance between model performance and inference cost by decomposing
the federated model. Specifically, we develop a self-supervised task Matched
Pair Detection (MPD) to exploit the vertically partitioned unlabeled data and
propose the Split Knowledge Distillation (SplitKD) scheme to avoid cross-agency
serving.
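As a rough illustration of the Matched Pair Detection idea, one plausible reading is a binary self-supervised task over vertically partitioned features: aligned advertiser/publisher rows form positive pairs and deliberately mismatched rows form negatives. The sketch below is an assumption-laden toy in numpy, not the paper's implementation; all variable names and the pairing strategy are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vertically partitioned data: the two parties hold different
# feature columns for the same aligned users (no labels needed).
n, d_adv, d_pub = 8, 4, 3
advertiser_feats = rng.normal(size=(n, d_adv))
publisher_feats = rng.normal(size=(n, d_pub))

def make_mpd_batch(x_a, x_p):
    """Build a toy Matched Pair Detection batch: aligned rows are positives,
    cyclically shifted pairings are guaranteed-mismatched negatives."""
    n = len(x_a)
    shift = np.roll(np.arange(n), 1)                 # every row mismatched
    pos = np.concatenate([x_a, x_p], axis=1)         # matched pairs -> label 1
    neg = np.concatenate([x_a, x_p[shift]], axis=1)  # mismatched pairs -> label 0
    x = np.concatenate([pos, neg], axis=0)
    y = np.concatenate([np.ones(n), np.zeros(n)])
    return x, y

x, y = make_mpd_batch(advertiser_feats, publisher_feats)
print(x.shape, y.shape)  # (16, 7) (16,)
```

A detector trained on such batches can pre-train each party's encoder without any click labels, which is the role the abstract assigns to MPD.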
Empirical studies on three industrial datasets demonstrate the effectiveness of
our methods, with the median AUC over all datasets improved by 0.86% and 2.6%
in the local deployment mode and the federated deployment mode respectively.
Overall, our framework provides an efficient federation-enhanced solution for
real-time display advertising with minimal deploying cost and significant
performance lift.
Comment: Accepted to the Trustworthy Federated Learning workshop of IJCAI 2022
(FL-IJCAI22). 6 pages, 3 figures, 3 tables. Old title: Semi-Supervised
Cross-Silo Advertising with Partial Knowledge Transfer
DomainAdaptor: A Novel Approach to Test-time Adaptation
To deal with the domain shift between training and test samples, current
methods primarily focus on learning generalizable features during training,
ignoring the specificity of unseen samples, which is also critical at test
time. In this paper, we investigate a more challenging task that
aims to adapt a trained CNN model to unseen domains during the test. To
maximally mine the information in the test data, we propose a unified method
called DomainAdaptor for the test-time adaptation, which consists of an
AdaMixBN module and a Generalized Entropy Minimization (GEM) loss.
Specifically, AdaMixBN addresses the domain shift by adaptively fusing training
and test statistics in the normalization layer via a dynamic mixture
coefficient and a statistic transformation operation. To further enhance the
adaptation ability of AdaMixBN, we design a GEM loss that extends the Entropy
Minimization loss to better exploit the information in the test data. Extensive
experiments show that DomainAdaptor consistently outperforms the
state-of-the-art methods on four benchmarks. Furthermore, our method yields
even larger improvements over existing methods in the few-data unseen-domain
setting. The code is available at https://github.com/koncle/DomainAdaptor.
Comment: Accepted by ICCV202
Scalable Recollections for Continual Lifelong Learning
Given the recent success of Deep Learning applied to a variety of single
tasks, it is natural to consider more human-realistic settings. Perhaps the
most difficult of these settings is that of continual lifelong learning, where
the model must learn online over a continuous stream of non-stationary data. A
successful continual lifelong learning system must have three key capabilities:
it must learn and adapt over time, it must not forget what it has learned, and
it must be efficient in both training time and memory. Recent techniques have
focused their efforts primarily on the first two capabilities while questions
of efficiency remain largely unexplored. In this paper, we consider the problem
of efficient and effective storage of experiences over very large time-frames.
In particular, we consider the case where typical experiences are O(n) bits and
memories are limited to O(k) bits for k << n. We present a novel scalable
architecture and training algorithm in this challenging domain and provide an
extensive evaluation of its performance. Our results show that we can achieve
considerable gains on top of state-of-the-art methods such as GEM.
Comment: AAAI 201
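The O(n)-to-O(k) storage constraint above can be illustrated with a deliberately simple stand-in for the paper's architecture: keep only a k-dimensional code per experience plus one shared decoder. The sketch uses a fixed random projection with a pseudo-inverse for approximate recollection; the real system would learn this compression, so everything below is a hypothetical toy.

```python
import numpy as np

rng = np.random.default_rng(2)
n_feat, k = 256, 16   # experience is O(n) values, memory budget is O(k)

# Hypothetical stand-in for a learned encoder/decoder: a fixed random
# projection shared across all memories, with pseudo-inverse reconstruction.
proj = rng.normal(size=(n_feat, k)) / np.sqrt(k)
proj_pinv = np.linalg.pinv(proj)

def compress(x):
    """Store only a k-dim code per experience (plus the shared projection)."""
    return x @ proj

def recollect(code):
    """Approximately (lossily) reconstruct an experience from its code."""
    return code @ proj_pinv

experiences = rng.normal(size=(5, n_feat))
codes = compress(experiences)          # what the replay buffer retains
recollections = recollect(codes)       # what rehearsal actually replays
print(codes.shape)  # (5, 16)
```

The point of the sketch is the bookkeeping, not the fidelity: rehearsal methods like GEM would then consume `recollections` in place of raw stored experiences, shrinking memory from O(n) to O(k) bits per example at the cost of reconstruction error.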