Multi-Label Zero-Shot Learning with Structured Knowledge Graphs
In this paper, we propose a novel deep learning architecture for multi-label
zero-shot learning (ML-ZSL), which is able to predict multiple unseen class
labels for each input instance. Inspired by the way humans utilize semantic
knowledge between objects of interest, we propose a framework that
incorporates knowledge graphs for describing the relationships between multiple
labels. Our model learns an information propagation mechanism from the semantic
label space, which can be applied to model the interdependencies between seen
and unseen class labels. By exploiting structured knowledge graphs
for visual reasoning, we show that our model can be applied to both
multi-label classification and ML-ZSL tasks. Compared to state-of-the-art
approaches, our method achieves comparable or improved performance.
Comment: CVPR 201
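To make the propagation idea concrete, here is a minimal sketch rather than the paper's model: the abstract describes a learned propagation mechanism, whereas this example substitutes a fixed neighbor-averaging rule. The function name `propagate`, the `mix` weight, and the toy graph are all illustrative assumptions.

```python
def propagate(beliefs, neighbors, steps=1, mix=0.5):
    """Spread label beliefs along knowledge-graph edges.

    beliefs:   dict mapping label -> score in [0, 1]
    neighbors: dict mapping label -> list of related labels
    mix:       hypothetical weight given to the neighbor average
    NOTE: a fixed averaging rule stands in for the learned mechanism.
    """
    current = dict(beliefs)  # copy so the caller's dict is left untouched
    for _ in range(steps):
        updated = {}
        for label, score in current.items():
            related = neighbors.get(label, [])
            if related:
                neighbor_mean = sum(current[n] for n in related) / len(related)
                updated[label] = (1 - mix) * score + mix * neighbor_mean
            else:
                # Isolated labels keep their current score.
                updated[label] = score
        current = updated
    return current


# Toy graph: "dog" is a seen class, "wolf" is unseen but linked to "dog".
beliefs = {"dog": 1.0, "wolf": 0.0, "car": 0.0}
neighbors = {"dog": ["wolf"], "wolf": ["dog"]}
result = propagate(beliefs, neighbors, steps=1, mix=0.5)
```

In this toy run the unseen label "wolf" picks up evidence from its seen neighbor "dog", while the unrelated label "car" stays at zero. That is the kind of seen-to-unseen interdependency the abstract refers to, here in a deliberately simplified form.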
FedBug: A Bottom-Up Gradual Unfreezing Framework for Federated Learning
Federated Learning (FL) offers a collaborative training framework, allowing
multiple clients to contribute to a shared model without compromising data
privacy. Due to the heterogeneous nature of local datasets, updated client
models may overfit and diverge from one another, commonly known as the problem
of client drift. In this paper, we propose FedBug (Federated Learning with
Bottom-Up Gradual Unfreezing), a novel FL framework designed to effectively
mitigate client drift. FedBug adaptively leverages the client model parameters,
distributed by the server at each global round, as the reference points for
cross-client alignment. Specifically, on the client side, FedBug begins by
freezing the entire model, then gradually unfreezes the layers, from the input
layer to the output layer. This bottom-up approach allows models to train the
newly thawed layers to project data into a latent space, wherein the separating
hyperplanes remain consistent across all clients. We theoretically analyze
FedBug in a novel over-parameterization FL setup, revealing its superior
convergence rate compared to FedAvg. Through comprehensive experiments,
spanning various datasets, training conditions, and network architectures, we
validate the efficacy of FedBug. Our contributions encompass a novel FL
framework, theoretical analysis, and empirical validation, demonstrating the
wide potential and applicability of FedBug.
Comment: Submitted to NeurIPS'2
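The bottom-up unfreezing order described above can be sketched as a simple schedule. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how quickly layers are thawed, so the `unfreeze_fraction` parameter, the equal per-layer spacing, and the choice to thaw the first layer at step 0 are all hypothetical.

```python
def num_unfrozen(step, total_steps, num_layers, unfreeze_fraction=0.5):
    """Number of layers, counted from the input side, trainable at `step`.

    unfreeze_fraction is a hypothetical knob: the share of local steps
    over which layers are thawed one by one. Afterwards all layers train.
    """
    if not 0 < unfreeze_fraction <= 1:
        raise ValueError("unfreeze_fraction must be in (0, 1]")
    # Length of the gradual-unfreezing phase, in local steps.
    thaw_steps = max(1, int(total_steps * unfreeze_fraction))
    # Assumed spacing: each layer is thawed after an equal share of the phase.
    period = thaw_steps / num_layers
    return min(num_layers, int(step // period) + 1)


def trainable_mask(step, total_steps, num_layers, unfreeze_fraction=0.5):
    """Boolean mask, input layer first: True means the layer is trainable."""
    k = num_unfrozen(step, total_steps, num_layers, unfreeze_fraction)
    return [i < k for i in range(num_layers)]


# Four layers, eight local steps, thawing spread over the first half.
schedule = [num_unfrozen(s, total_steps=8, num_layers=4) for s in range(8)]
mask_at_step_1 = trainable_mask(1, total_steps=8, num_layers=4)
```

In a framework such as PyTorch, a mask like this would typically be applied by toggling each layer's `requires_grad` flag before every local step. Layers that are still frozen keep the server-distributed parameters, which is what lets them act as shared reference points across clients.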
Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation
While representation learning aims to derive interpretable features for
describing visual data, representation disentanglement further results in such
features so that particular image attributes can be identified and manipulated.
However, one cannot easily address this task without observing ground truth
annotation for the training data. To address this problem, we propose a novel
deep learning model of Cross-Domain Representation Disentangler (CDRD). By
observing fully annotated source-domain data and unlabeled target-domain data
of interest, our model bridges the information across data domains and
transfers the attribute information accordingly. Thus, cross-domain
feature disentanglement and adaptation can be performed jointly. In the
experiments, we provide qualitative results to verify our disentanglement
capability. We further confirm that our model can be applied to
classification tasks in unsupervised domain adaptation, and that it performs
favorably against state-of-the-art image disentanglement and translation
methods.
Comment: CVPR 2018 Spotligh
Order-Free RNN with Visual Attention for Multi-Label Classification
In this paper, we propose jointly learning attention and recurrent neural
network (RNN) models for multi-label classification. While approaches based on
the use of either model exist (e.g., for the task of image captioning),
training such existing network architectures typically requires pre-defined
label sequences. For multi-label classification, it would be desirable to have
a robust inference process, so that the prediction error would not propagate
and thus affect the performance. Our proposed model uniquely integrates
attention and Long Short-Term Memory (LSTM) models, which not only addresses
the above problem but also allows one to identify visual objects of interest
with varying sizes without prior knowledge of a particular label ordering.
More importantly, label co-occurrence information can be jointly exploited by
our LSTM model. Finally, by advancing the beam search technique, our proposed
network model can efficiently predict multiple labels.
Comment: Accepted at 32nd AAAI Conference on Artificial Intelligence (AAAI-18
Learning Deep Latent Spaces for Multi-Label Classification
Multi-label classification is a practical yet challenging task in machine
learning related fields, since it requires the prediction of more than one
label category for each input instance. We propose a novel deep neural network
(DNN) based model, Canonical Correlated AutoEncoder (C2AE), for solving this
task. Aiming at better relating feature and label domain data for improved
classification, we uniquely perform joint feature and label embedding by
deriving a deep latent space, followed by the introduction of a label-correlation
sensitive loss function for recovering the predicted label outputs. Our C2AE is
achieved by integrating the DNN architectures of canonical correlation analysis
and autoencoder, which allows end-to-end learning and prediction with the
ability to exploit label dependency. Moreover, our C2AE can be easily extended
to address the learning problem with missing labels. Our experiments on
multiple datasets with different scales confirm the effectiveness and
robustness of our proposed method, which is shown to perform favorably against
state-of-the-art methods for multi-label classification.
Comment: published in AAAI-201
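The abstract does not spell out the label-correlation sensitive loss. One widely used form of such a loss is a pairwise ranking loss, which penalizes every case where an irrelevant label is scored close to or above a relevant one. The sketch below assumes that form; it is not confirmed by the abstract, and the function name and toy inputs are hypothetical.

```python
import math


def pairwise_ranking_loss(scores, labels):
    """Average exp(-(s_pos - s_neg)) over all (relevant, irrelevant) pairs.

    scores: predicted score per label
    labels: ground-truth indicator per label (1 = relevant, 0 = irrelevant)
    ASSUMPTION: this pairwise form is illustrative, not taken from the abstract.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        # No (relevant, irrelevant) pair exists, so there is nothing to rank.
        return 0.0
    total = sum(math.exp(-(p - n)) for p in positives for n in negatives)
    return total / (len(positives) * len(negatives))


# The relevant label is ranked first in `good` and last in `bad`.
good = pairwise_ranking_loss([2.0, 0.0, -1.0], [1, 0, 0])
bad = pairwise_ranking_loss([-1.0, 0.0, 2.0], [1, 0, 0])
```

Because the loss depends on score differences between label pairs rather than on each label in isolation, it is sensitive to how labels are ranked relative to one another. That makes it a plausible, though unverified, reading of "label-correlation sensitive".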