FedMEKT: Distillation-based Embedding Knowledge Transfer for Multimodal Federated Learning
Federated learning (FL) enables a decentralized machine learning paradigm in
which multiple clients collaboratively train a generalized global model without
sharing their private data. Most existing works propose typical FL systems for
single-modal data, limiting their potential to exploit valuable multimodal data
in future personalized applications. Furthermore, the majority of FL approaches
still rely on labeled data at the client side, which is scarce in real-world
applications because users rarely annotate their own data. In light of these
limitations, we propose a novel multimodal FL framework that employs a
semi-supervised learning approach to leverage representations from different
modalities. Bringing this concept into a system, we develop a
distillation-based multimodal embedding knowledge transfer mechanism, namely
FedMEKT, which allows the server and clients to exchange the joint knowledge of
their learning models, extracted from a small multimodal proxy dataset. FedMEKT
iteratively updates the generalized global encoders with the joint embedding
knowledge from the participating clients. To address the modality discrepancy
and labeled-data constraints of existing FL systems, FedMEKT comprises local
multimodal autoencoder learning, generalized multimodal autoencoder
construction, and generalized classifier learning. Through extensive
experiments on three multimodal human activity recognition datasets, we
demonstrate that FedMEKT achieves superior global encoder performance in linear
evaluation and guarantees user privacy for personal data and model parameters,
while demanding lower communication cost than other baselines.
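The proxy-dataset embedding exchange can be sketched roughly as follows. This is a minimal illustration under assumed simplifications, not FedMEKT's actual architecture: the encoders are plain linear maps, the proxy shapes are made up, and a single MSE gradient step stands in for the paper's distillation procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared proxy dataset and per-client linear encoders (hypothetical shapes).
proxy_x = rng.normal(size=(32, 8))                     # proxy inputs
clients = [rng.normal(size=(8, 4)) for _ in range(3)]  # encoder weights W_c

def embed(W, x):
    """Linear encoder: project inputs into the embedding space."""
    return x @ W

# Server side: average the clients' proxy embeddings into joint knowledge.
joint = np.mean([embed(W, proxy_x) for W in clients], axis=0)

def distill_step(W, x, target, lr=0.1):
    """One distillation step: pull the local proxy embedding toward the
    joint embedding by gradient descent on an MSE loss."""
    grad = 2.0 * x.T @ (embed(W, x) - target) / len(x)
    return W - lr * grad

# Client side: each client distills the server's joint embedding knowledge.
new_clients = [distill_step(W, proxy_x, joint) for W in clients]
```

Only embeddings over the small proxy set cross the network, which is the intuition behind the communication-cost and privacy claims above.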
Fundamental Asymptotic Behavior of (Two-User) Distributed Massive MIMO
This paper considers the uplink of a distributed Massive MIMO network where
base stations (BSs), each equipped with N antennas, receive data from K users.
We study the asymptotic spectral efficiency (as N → ∞) with spatially
correlated channels, pilot contamination, and different degrees of channel
state information (CSI) and statistical knowledge at the BSs. By considering a
two-user setup, we can simply derive fundamental asymptotic behaviors and
provide novel insights into the structure of the optimal combining schemes. In
line with [1], when global CSI is available at all BSs, the optimal minimum
mean-squared error combining has an unbounded capacity as N → ∞ if the global
channel covariance matrices of the users are asymptotically linearly
independent. This result is instrumental to derive a suboptimal combining
scheme that provides unbounded capacity as N → ∞ using only local CSI and
global channel statistics. The latter scheme is shown to outperform a
generalized matched filter scheme, which also achieves asymptotically unbounded
capacity using only local CSI and global channel statistics, but is derived
following [2] on the basis of a more conservative capacity bound.
Comment: 6 pages, 2 figures, to be presented at GLOBECOM 2018, Abu Dhabi
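The minimum mean-squared error combining referenced above has a standard closed form, v_k = (H Hᴴ + σ²I)⁻¹ h_k. The sketch below illustrates it for a single receiver with local CSI only; the dimensions, noise power, and two-user channel draw are illustrative assumptions, not the paper's distributed multi-BS setup.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 8, 2       # antennas at the receiver, users (two-user setup)
sigma2 = 0.1      # noise power (assumed value)

# Local channel matrix: one column per user, complex Gaussian entries.
H = rng.normal(size=(N, K)) + 1j * rng.normal(size=(N, K))

# MMSE combining vectors: columns of (H H^H + sigma^2 I)^{-1} H.
A = H @ H.conj().T + sigma2 * np.eye(N)
V = np.linalg.solve(A, H)

# Post-combining SINR for user 0: signal over interference plus noise.
v0, h0 = V[:, 0], H[:, 0]
signal = np.abs(v0.conj() @ h0) ** 2
interf = np.abs(v0.conj() @ H[:, 1]) ** 2
sinr0 = signal / (interf + sigma2 * np.linalg.norm(v0) ** 2)
```

Among linear combiners, the MMSE vector maximizes this SINR, which is why a plain matched filter h₀ can never do better in the same channel realization.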
SHrinkage Covariance Estimation Incorporating Prior Biological Knowledge with Applications to High-Dimensional Data
In ``-omic data'' analysis, information on the structure of covariates is broadly available, either from public databases describing gene regulation processes and functional groups, such as the Kyoto encyclopedia of genes and genomes (KEGG), or from statistical analyses -- for example in the form of partial correlation estimators. The analysis of transcriptomic data might benefit from the incorporation of such prior knowledge.
In this paper we focus on the integration of structured information into statistical analyses in which at least one major step involves the estimation of a (high-dimensional) covariance matrix. More precisely, we revisit the recently proposed ``SHrinkage Incorporating Prior'' (SHIP) covariance estimation method which takes into account the group structure of the covariates, and suggest to integrate the SHIP covariance estimator into various multivariate methods such as linear discriminant analysis (LDA), global analysis of covariance (GlobalANCOVA), and regularized generalized canonical correlation analysis (RGCCA). We demonstrate the use of the resulting new methods based on simulations and discuss the benefit of the integration of prior information through the SHIP estimator.
Reproducible R codes are available at
http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/shipproject/index.html
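The group-structure idea behind SHIP can be illustrated with a generic shrinkage estimator S* = λT + (1 − λ)S, where the target T respects a known covariate grouping. This is a simplified sketch, not the exact SHIP estimator: the group labels are invented, and λ is fixed rather than estimated analytically as SHIP does.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy high-dimensional setting: n samples, p covariates with known groups
# (e.g. genes grouped by KEGG pathway); labels here are illustrative.
n, p = 20, 10
groups = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2, 2])
X = rng.normal(size=(n, p))

S = np.cov(X, rowvar=False)   # sample covariance, unstable when n ~ p

# Structured target T: keep the sample variances, replace within-group
# covariances by their group average, and set between-group entries to zero.
T = np.diag(np.diag(S))
for g in np.unique(groups):
    idx = np.where(groups == g)[0]
    off = [S[i, j] for i in idx for j in idx if i != j]
    avg = np.mean(off) if off else 0.0
    for i in idx:
        for j in idx:
            if i != j:
                T[i, j] = avg

lam = 0.5                       # fixed shrinkage intensity (SHIP instead
                                # derives it from the data analytically)
S_ship = lam * T + (1 - lam) * S
```

The shrunken estimate can then be plugged into covariance-based methods such as LDA or RGCCA in place of the raw sample covariance.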
Modeling Multiple Views via Implicitly Preserving Global Consistency and Local Complementarity
While self-supervised learning techniques are often used to mine implicit
knowledge from unlabeled data by modeling multiple views, it is unclear how to
perform effective representation learning in a complex and inconsistent
context. To this end, we propose a methodology, the consistency and
complementarity network (CoCoNet), which leverages strict global inter-view
consistency and local cross-view complementarity-preserving regularization to
comprehensively learn representations from multiple views. At the global stage,
we reckon that crucial knowledge is implicitly shared among views, and that
enhancing the encoder to capture such knowledge from the data can improve the
discriminability of the learned representations. Hence, preserving the global
consistency of multiple views ensures the acquisition of common knowledge.
CoCoNet aligns the probabilistic distributions of views by utilizing an
efficient discrepancy metric based on the generalized sliced Wasserstein
distance. At the local stage, we propose a heuristic complementarity factor,
which fuses cross-view discriminative knowledge and guides the encoders to
learn not only view-wise discriminability but also cross-view complementary
information. Theoretically, we provide information-theoretic analyses of our
proposed CoCoNet. Empirically, to investigate the improvement gains of our
approach, we conduct extensive experimental validations, which demonstrate that
CoCoNet outperforms state-of-the-art self-supervised methods by a significant
margin, proving that such implicit consistency- and complementarity-preserving
regularization can enhance the discriminability of latent representations.
Comment: Accepted by IEEE Transactions on Knowledge and Data Engineering
(TKDE) 2022; refer to https://ieeexplore.ieee.org/document/985763
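The sliced Wasserstein discrepancy mentioned above is cheap to estimate because each random one-dimensional projection reduces the distance to a comparison of sorted samples. The following sketch shows the plain (not generalized) sliced 2-Wasserstein distance between two equally sized sample sets; the sample shapes and projection count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def sliced_wasserstein(x, y, n_proj=64, rng=rng):
    """Monte-Carlo sliced 2-Wasserstein distance between two sample sets
    of equal size: average squared 1-D Wasserstein distance over random
    directions, where the 1-D distance pairs sorted projections."""
    d = x.shape[1]
    dirs = rng.normal(size=(n_proj, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)  # unit directions
    px = np.sort(x @ dirs.T, axis=0)   # (n, n_proj) sorted projections
    py = np.sort(y @ dirs.T, axis=0)
    return np.sqrt(np.mean((px - py) ** 2))

a = rng.normal(size=(256, 5))
b = rng.normal(size=(256, 5)) + 2.0    # same shape, shifted distribution
```

The generalized variant replaces the linear projections with nonlinear ones, but the sort-and-compare structure that makes the metric efficient is the same.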
ConDistFL: Conditional Distillation for Federated Learning from Partially Annotated Data
Developing a generalized segmentation model capable of simultaneously
delineating multiple organs and diseases is highly desirable. Federated
learning (FL) is a key technology enabling the collaborative development of a
model without exchanging training data. However, the limited access to fully
annotated training data poses a major challenge to training generalizable
models. We propose "ConDistFL", a framework that solves this problem by
combining FL with knowledge distillation. With an adequately designed
conditional probability representation, local models can extract knowledge of
unlabeled organs and tumors from the global model, even when trained on only
partially annotated data. We validate our framework on four distinct partially
annotated abdominal CT datasets from
our framework on four distinct partially annotated abdominal CT datasets from
the MSD and KiTS19 challenges. The experimental results show that the proposed
framework significantly outperforms FedAvg and FedOpt baselines. Moreover, the
performance on an external test dataset demonstrates superior generalizability
compared to models trained on each dataset separately. Our ablation study
suggests that ConDistFL can perform well without frequent aggregation, reducing
the communication cost of FL. Our implementation will be available at
https://github.com/NVIDIA/NVFlare/tree/dev/research/condist-fl
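The distillation component can be illustrated with the standard soft-label loss: the local (student) model matches the global (teacher) model's softened class distribution. This is a generic sketch, not ConDistFL's conditional formulation; the class count, batch size, and temperature are assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical logits over 4 organ/tumor classes for a batch of 6 voxels.
teacher_logits = rng.normal(size=(6, 4))   # global model (teacher)
student_logits = rng.normal(size=(6, 4))   # local model (student)

def distill_loss(student, teacher, tau=2.0):
    """Soft-label distillation: mean KL(teacher_soft || student_soft),
    with logits softened by temperature tau."""
    p = softmax(teacher / tau)
    q = softmax(student / tau)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

loss_random = distill_loss(student_logits, teacher_logits)
loss_aligned = distill_loss(teacher_logits, teacher_logits)  # perfect match
```

Because the teacher's soft outputs, not ground-truth labels, drive this loss, a client can learn about structures its own annotations never cover, which is the mechanism the abstract relies on.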
A Global Approach to the Comparison of Clustering Results
Copyright © 2012 Walter de Gruyter GmbH. The discovery of knowledge in the case of Hierarchical Cluster Analysis (HCA) depends on many factors, such as the clustering algorithms applied and the strategies developed in the initial stage of Cluster Analysis. We present a global approach for evaluating the quality of clustering results and for comparing different clustering algorithms using the relevant information available (e.g. the stability, isolation and homogeneity of the clusters). In addition, we present a visual method that facilitates evaluation of the quality of the partitions, allowing identification of the similarities and differences between partitions, as well as the behaviour of the elements in the partitions. We illustrate our approach using a complex and heterogeneous dataset (real horse data) taken from the literature. We apply HCA based on the generalized affinity coefficient (a similarity coefficient) to the case of complex (symbolic) data, combined with 26 (classic and probabilistic) clustering algorithms. Finally, we discuss the obtained results and the contribution of this approach to gaining better knowledge of the structure of the data.
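A basic building block for comparing partitions, as done above, is a pair-counting agreement measure. The sketch below computes the classic Rand index between two clusterings given as label lists; the example partitions are invented, and the paper's own comparison uses richer information (stability, isolation, homogeneity) than this single score.

```python
from itertools import combinations

def rand_index(p, q):
    """Rand index between two partitions given as label lists: the
    fraction of element pairs on which the partitions agree, i.e. pairs
    that are co-clustered in both or separated in both."""
    pairs = list(combinations(range(len(p)), 2))
    agree = sum((p[i] == p[j]) == (q[i] == q[j]) for i, j in pairs)
    return agree / len(pairs)

# Two hypothetical clusterings of six elements.
part_a = [0, 0, 1, 1, 2, 2]
part_b = [0, 0, 1, 2, 2, 2]
```

Identical partitions score 1, and lower values quantify disagreement, giving a simple numeric complement to the visual comparison the paper proposes.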