Grouped Knowledge Distillation for Deep Face Recognition
Compared with feature-based distillation methods, logits distillation relaxes the
requirement of a consistent feature dimension between teacher and student
networks, yet its performance is deemed inferior in face recognition.
One major challenge is that the lightweight student network has difficulty
fitting the target logits because of its limited capacity, a difficulty
aggravated by the enormous number of identities in face recognition. We
therefore probe the target logits to extract the primary knowledge related to
face identity and discard the rest, making distillation more achievable for the
student network. Specifically, the softened prediction contains a tail group of
near-zero values that carries only minor knowledge for distillation. To provide a
clear perspective of its impact, we first partition the logits into two groups,
i.e., Primary Group and Secondary Group, according to the cumulative
probability of the softened prediction. Then, we reorganize the Knowledge
Distillation (KD) loss of grouped logits into three parts, i.e., Primary-KD,
Secondary-KD, and Binary-KD. Primary-KD refers to distilling the primary
knowledge from the teacher, Secondary-KD aims to refine minor knowledge but
increases the difficulty of distillation, and Binary-KD ensures the consistency
of knowledge distribution between teacher and student. We experimentally found
that (1) Primary-KD and Binary-KD are indispensable for KD, and (2)
Secondary-KD is the bottleneck that restricts KD. Therefore, we propose
Grouped Knowledge Distillation (GKD), which retains Primary-KD and Binary-KD
but omits Secondary-KD from the final KD loss. Extensive experimental results
on popular face recognition benchmarks demonstrate the superiority of the
proposed GKD over state-of-the-art methods.

Comment: 9 pages, 2 figures, 7 tables; accepted by AAAI 2023
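The abstract describes the grouping and loss decomposition without formulas, so the following is a minimal PyTorch sketch of one plausible reading: the Primary Group is the smallest set of classes whose softened teacher probabilities reach a cumulative threshold, and the final loss keeps only Primary-KD and Binary-KD. The temperature `T`, threshold `cum_thresh`, and exact normalization are illustrative assumptions, not the paper's specification.

```python
import torch
import torch.nn.functional as F

def _kl(p, q, eps=1e-8):
    # KL(p || q), summed over classes and averaged over the batch.
    return (p * ((p + eps).log() - (q + eps).log())).sum(dim=1).mean()

def gkd_loss(student_logits, teacher_logits, T=4.0, cum_thresh=0.9):
    # Softened predictions (temperature-scaled softmax).
    t = F.softmax(teacher_logits / T, dim=1)
    s = F.softmax(student_logits / T, dim=1)

    # Primary Group: the smallest set of classes whose cumulative teacher
    # probability reaches `cum_thresh`; the rest form the near-zero
    # Secondary (tail) Group.
    sorted_t, idx = t.sort(dim=1, descending=True)
    below = (sorted_t.cumsum(dim=1) - sorted_t) < cum_thresh
    primary = torch.zeros_like(below).scatter(1, idx, below)

    # Binary-KD: match the total probability mass of the two groups.
    t_bin = torch.stack([(t * primary).sum(1), (t * ~primary).sum(1)], dim=1)
    s_bin = torch.stack([(s * primary).sum(1), (s * ~primary).sum(1)], dim=1)

    # Primary-KD: match the within-group distribution over primary classes,
    # renormalized so the group sums to one.
    t_pri = t * primary / (t_bin[:, :1] + 1e-8)
    s_pri = s * primary / (s_bin[:, :1] + 1e-8)

    # Secondary-KD over the tail group is deliberately omitted, per the
    # abstract's finding that it bottlenecks distillation.
    return (_kl(t_pri, s_pri) + _kl(t_bin, s_bin)) * T * T
```

The `T * T` scaling follows the standard KD convention of compensating for softened gradients; in practice this term would be combined with the usual identity-classification loss on the student.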
Notch2 controls hepatocyte-derived cholangiocarcinoma formation in mice.
Liver cancer comprises a group of malignant tumors, among which hepatocellular carcinoma (HCC) and intrahepatic cholangiocarcinoma (ICC) are the most common. ICC is especially pernicious and associated with poor clinical outcomes. Studies have shown that a subset of human ICCs may originate from mature hepatocytes, but the mechanisms driving the trans-differentiation of hepatocytes into malignant cholangiocytes remain poorly defined. We adopted lineage-tracing techniques and an established murine hepatocyte-derived ICC model based on hydrodynamic injection of activated forms of the AKT (myr-AKT) and Yap (YapS127A) proto-oncogenes. Wild-type, Notch1^flox/flox, and Notch2^flox/flox mice were used to investigate the role of canonical Notch signaling and Notch receptors in AKT/Yap-driven ICC formation. Human ICC and HCC cell lines were transfected with siRNA against Notch2 to determine whether Notch2 regulates biliary marker expression in liver tumor cells. We found that AKT/Yap-induced ICC formation is hepatocyte derived and that this process strictly depends on the canonical Notch signaling pathway in vivo. Deletion of Notch2 in AKT/Yap-induced tumors switched the phenotype from ICC to hepatocellular adenoma-like lesions, whereas inactivation of Notch1 in hepatocytes did not result in significant histomorphological changes. Finally, in vitro studies revealed that Notch2 silencing in ICC and HCC cell lines down-regulates the expression of the biliary markers Sox9 and EpCAM. Notch2 is thus the major determinant of hepatocyte-derived ICC formation in mice.
Auto-regressive Image Synthesis with Integrated Quantization
Deep generative models have achieved remarkable progress in realistic image
synthesis with various conditional inputs, yet generating diverse and
high-fidelity images remains a grand challenge in conditional image generation.
This paper presents a versatile framework for conditional image generation
that incorporates the inductive bias of CNNs and the powerful sequence modeling
of auto-regression, naturally leading to diverse image generation. Instead of
independently quantizing the features of multiple domains as in prior research,
we design an integrated quantization scheme with a variational regularizer that
mingles the feature discretization in multiple domains, and markedly boosts the
auto-regressive modeling performance. Notably, the variational regularizer
makes it possible to regularize feature distributions in otherwise incomparable
latent spaces by penalizing the intra-domain variations of the distributions.
In addition, we design a Gumbel sampling strategy that incorporates
distribution uncertainty into the auto-regressive training procedure. The
Gumbel sampling substantially mitigates the exposure bias that often causes
misalignment between the training and inference stages and severely impairs
inference performance. Extensive
experiments over multiple conditional image generation tasks show that our
method achieves superior diverse image generation performance qualitatively and
quantitatively compared with the state of the art.

Comment: Accepted to ECCV 2022 as Oral Presentation
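The abstract does not spell out the Gumbel strategy, so the sketch below shows the standard straight-through Gumbel-softmax mechanism on which such a sampling step is typically built: code indices are drawn stochastically during training rather than by a deterministic argmax, exposing the auto-regressive model to its own prediction uncertainty. The `tau` value, `sample_prob` mixing with teacher forcing, and the `codebook` lookup are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def sample_code(logits, tau=1.0):
    # Straight-through Gumbel-softmax: the forward pass produces a discrete
    # one-hot code sampled from softmax(logits), while gradients flow
    # through the continuous relaxation.
    return F.gumbel_softmax(logits, tau=tau, hard=True)

def next_step_input(codebook, logits, gt_one_hot, sample_prob=0.5, tau=1.0):
    # With probability `sample_prob`, feed the *sampled* code (instead of
    # the teacher-forced ground-truth code) into the next step, so the
    # model sees its own distribution uncertainty during training.
    use_sample = torch.rand(logits.shape[0], 1, device=logits.device) < sample_prob
    code = torch.where(use_sample, sample_code(logits, tau), gt_one_hot)
    return code @ codebook  # embed the chosen codes, shape (B, D)
```

Mixing sampled and teacher-forced codes in this way is one common route to narrowing the train/inference gap that the abstract refers to as exposure bias.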