Heterogeneous Federated Learning: State-of-the-art and Research Challenges
Federated learning (FL) has drawn increasing attention owing to its potential
use in large-scale industrial applications. Existing federated learning works
mainly focus on model-homogeneous settings. However, practical federated
learning typically faces the heterogeneity of data distributions, model
architectures, network environments, and hardware devices among participant
clients. Heterogeneous Federated Learning (HFL) is much more challenging, and
corresponding solutions are diverse and complex. Therefore, a systematic survey
of the research challenges and the state of the art on this topic is essential.
In this survey, we first summarize the various research challenges in HFL
from five aspects: statistical heterogeneity, model heterogeneity,
communication heterogeneity, device heterogeneity, and additional challenges.
In addition, recent advances in HFL are reviewed and a new taxonomy of existing
HFL methods is proposed with an in-depth analysis of their pros and cons. We
classify existing methods from three different levels according to the HFL
procedure: data-level, model-level, and server-level. Finally, several critical
and promising future research directions in HFL are discussed, which may
facilitate further developments in this field. A periodically updated
collection on HFL is available at https://github.com/marswhu/HFL_Survey. Comment: 42 pages, 11 figures, and 4 tables
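The survey's data-level handling of statistical heterogeneity builds on weighted server-side aggregation. As a minimal sketch (not a method from the survey itself; the function name and plain-list parameter representation are illustrative assumptions), a FedAvg-style server can weight each client's parameters by its local sample count, so clients with skewed, small datasets contribute proportionally less:

```python
# Hypothetical sketch of server-level aggregation under statistical
# heterogeneity: FedAvg-style weighting by local dataset size.
# Model parameters are simplified to flat lists of floats.

def fedavg_aggregate(client_weights, client_sizes):
    """Weighted average of client parameter vectors.

    client_weights: list of parameter lists, one per client.
    client_sizes: number of local training samples per client.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    global_weights = [0.0] * n_params
    for weights, n in zip(client_weights, client_sizes):
        share = n / total  # larger local datasets get larger weight
        for i, p in enumerate(weights):
            global_weights[i] += p * share
    return global_weights
```

With two clients holding 1 and 3 samples, the second client's parameters dominate the average, which is the basic mechanism that heterogeneity-aware variants then refine.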
TASA: Deceiving Question Answering Models by Twin Answer Sentences Attack
We present Twin Answer Sentences Attack (TASA), an adversarial attack method
for question answering (QA) models that produces fluent and grammatical
adversarial contexts while maintaining gold answers. Despite phenomenal
progress on general adversarial attacks, few works have investigated
vulnerabilities and attacks specific to QA models. In this work, we first
explore the biases in the existing models and discover that they mainly rely on
keyword matching between the question and context, and ignore the relevant
contextual relations for answer prediction. Based on these two biases, TASA
attacks the target model in two ways: (1) lowering the model's confidence in
the gold answer with a perturbed answer sentence; (2) misleading the model
towards a wrong answer with a distracting answer sentence. Equipped with
purpose-designed beam search and filtering methods, TASA can generate more
effective attacks than existing textual attack methods while sustaining the quality of
contexts, in extensive experiments on five QA datasets and human evaluations. Comment: Accepted by EMNLP 2022 (long), 9 pages main + 2 pages references + 7 pages appendix
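The two-fold attack described above can be sketched in a toy form. This is a hypothetical illustration, not TASA's actual generation pipeline (which uses beam search over candidate edits and fluency filtering); the function names, the synonym-table perturbation, and the answer-swap distractor are simplifying assumptions:

```python
# Toy sketch of the two-fold attack idea:
# (1) perturb the gold answer sentence so keyword matching with the
#     question weakens, while the gold answer string itself is preserved;
# (2) append a "twin" distracting sentence carrying a wrong answer.

def attack_context(sentences, gold_idx, gold_answer, wrong_answer, synonyms):
    """Return an adversarial copy of `sentences` (list of context sentences).

    synonyms: token -> replacement map; it must not touch the gold answer.
    """
    attacked = list(sentences)
    # (1) synonym-substitute non-answer keywords in the gold answer sentence
    perturbed = [synonyms.get(tok, tok) for tok in sentences[gold_idx].split()]
    attacked[gold_idx] = " ".join(perturbed)
    # (2) twin distractor: the original gold sentence with the gold answer
    # swapped for a plausible but wrong answer entity
    attacked.append(sentences[gold_idx].replace(gold_answer, wrong_answer))
    return attacked
```

A real attack would score each candidate context by the drop in the victim model's confidence and keep only fluent, answer-preserving candidates; this sketch only shows where the two perturbations act.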
Video Saliency Detection Using Object Proposals
In this paper, we introduce a novel approach to identifying salient object regions in videos via object proposals. The core idea is to solve the saliency detection problem by ranking and selecting salient proposals based on object-level saliency cues. Object proposals offer a more complete and high-level representation, which naturally caters to the needs of salient object detection. As well as introducing this novel solution for video salient object detection, we reorganize various discriminative saliency cues and traditional saliency assumptions on object proposals. With object candidates, a proposal ranking and voting scheme, based on various object-level saliency cues, is designed to screen out nonsalient parts, select salient object regions, and infer an initial saliency estimate. Then a saliency optimization process that considers temporal consistency and appearance differences between salient and nonsalient regions is used to refine the initial saliency estimates. Our experiments on public datasets (SegTrackV2, the Freiburg-Berkeley Motion Segmentation Dataset, and Densely Annotated Video Segmentation) validate the effectiveness of the approach, and the proposed method produces significant improvements over state-of-the-art algorithms.
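The ranking-and-voting step can be illustrated with a small sketch. This is an assumption-laden simplification, not the paper's algorithm: cue scores are collapsed into one number per proposal, proposals are axis-aligned boxes on a coarse grid, and the `top_k` cutoff stands in for the paper's full screening procedure:

```python
# Hypothetical sketch of proposal ranking and voting for an initial
# saliency map: rank proposals by a combined object-level cue score,
# let the top-ranked proposals cast pixel-wise votes, then normalize.

def rank_and_vote(proposals, cue_scores, grid_size, top_k=2):
    """proposals: (x0, y0, x1, y1) boxes on a grid_size x grid_size grid.
    cue_scores: one combined object-level saliency score per proposal."""
    # rank proposals by descending cue score; keep only the top_k
    order = sorted(range(len(proposals)), key=lambda i: -cue_scores[i])[:top_k]
    saliency = [[0.0] * grid_size for _ in range(grid_size)]
    for i in order:
        x0, y0, x1, y1 = proposals[i]
        for y in range(y0, y1):        # each kept proposal votes for
            for x in range(x0, x1):    # every cell it covers
                saliency[y][x] += cue_scores[i]
    peak = max(max(row) for row in saliency) or 1.0
    return [[v / peak for v in row] for row in saliency]  # normalize to [0, 1]
```

Cells covered by several high-scoring proposals accumulate the most votes, which is the intuition behind using overlap among strong object candidates as an initial saliency estimate before the temporal-consistency refinement.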
Dynamic Contrastive Distillation for Image-Text Retrieval
Although vision-and-language pretraining (VLP) equipped cross-modal image-text retrieval (ITR) has achieved remarkable progress in the past two years, it suffers from a major drawback: the ever-increasing size of VLP models restricts their deployment in real-world search scenarios (where high latency is unacceptable). To alleviate this problem, we present a novel plug-in dynamic contrastive distillation (DCD) framework to compress large VLP models for the ITR task. Technically, we face the following two challenges: 1) the typical uni-modal metric learning approach is difficult to apply directly to cross-modal tasks, because GPU memory is too limited to optimize over many negative samples while handling cross-modal fusion features; 2) it is inefficient to optimize the student network statically over hard samples, which have different effects on distillation learning and student network optimization. We overcome these challenges from two angles. First, to achieve multi-modal contrastive learning while balancing training costs and effects, we propose to use a teacher network to estimate the difficult samples for students, letting the students absorb powerful knowledge from pre-trained teachers and master the knowledge in hard samples. Second, to learn dynamically from hard sample pairs, we propose dynamic distillation, which learns samples of different difficulties so as to better balance the difficulty of knowledge against the students' self-learning ability. We successfully apply our proposed DCD strategy to two state-of-the-art vision-language pretrained models, i.e., ViLT and METER. Extensive experiments on the MS-COCO and Flickr30K benchmarks show the effectiveness and efficiency of our DCD framework. Encouragingly, we can speed up inference by at least 129x compared to existing ITR models. We further provide in-depth analyses and discussions that explain where the performance improvement comes from.
We hope our work can shed light on other tasks that require distillation and contrastive learning.
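The dynamic-weighting idea can be sketched numerically. This is a hypothetical simplification, not DCD's actual loss: the difficulty estimate (probability the teacher leaks to negatives), the KL-plus-cross-entropy combination, and all function names are illustrative assumptions for a single query against a candidate set:

```python
# Toy sketch of teacher-guided dynamic distillation for one retrieval query:
# the teacher's similarity distribution estimates how hard the sample is,
# and that difficulty scales the distillation term of the student's loss.

import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def dynamic_distill_loss(teacher_sims, student_sims, pos_idx):
    """teacher_sims / student_sims: one query's similarity to all candidates;
    pos_idx: index of the matching (positive) candidate."""
    t = softmax(teacher_sims)
    s = softmax(student_sims)
    # difficulty: probability mass the teacher assigns to negatives
    difficulty = 1.0 - t[pos_idx]
    # distillation term: KL divergence from teacher to student distribution
    kl = sum(ti * math.log(ti / si) for ti, si in zip(t, s))
    # contrastive term: cross-entropy on the positive candidate
    ce = -math.log(s[pos_idx])
    return difficulty * kl + ce
```

Easy pairs (teacher confident, student already aligned) contribute almost nothing beyond the contrastive term, while hard pairs get an amplified distillation signal, which is the balance between knowledge difficulty and student self-learning that the abstract describes.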
The surface morphology of crystals melting under solutions of different densities
Examples of solids melting beneath liquids are described for cases where the bulk liquid volume is stabilized against convection by a positive vertical temperature gradient, either with or without local density inversion at the melting interface. The examples include ice melting beneath brine or methanol solutions, and tin or lead melting under molten Sn-20wt%Pb or Pb-20wt%Sn, respectively. Without density inversion the melting is slow, purely diffusion-controlled, and the interfaces are smooth; with convection-assisted melting the rate increases by some two orders of magnitude and the interfaces develop a rough profile; in the case of ice, both irregular and quasi-steady-state features are observed. The observations are discussed in terms of the prevailing temperature and concentration gradients. © 1988