Federated Pseudo Modality Generation for Incomplete Multi-Modal MRI Reconstruction
While multi-modal learning has been widely used for MRI reconstruction, it
relies on paired multi-modal data which is difficult to acquire in real
clinical scenarios. Especially in the federated setting, the common situation
is that several medical institutions only have single-modal data, termed the
modality missing issue. Therefore, it is infeasible to deploy a standard
federated learning framework in such conditions. In this paper, we propose a
novel communication-efficient federated learning framework, namely Fed-PMG, to
address the missing modality challenge in federated multi-modal MRI
reconstruction. Specifically, we utilize a pseudo modality generation mechanism
to recover the missing modality for each single-modal client by sharing the
distribution information of the amplitude spectrum in frequency space. However,
the step of sharing the original amplitude spectrum leads to heavy
communication costs. To reduce the communication cost, we introduce a
clustering scheme that projects the set of amplitude spectra onto a finite set of cluster
centroids, and share them among the clients. With this design, our
approach can effectively complete the missing modality within an acceptable
communication cost. Extensive experiments demonstrate that our proposed method
can attain performance similar to that of the ideal scenario, i.e., one in which all clients have
the full set of modalities. The source code will be released.
Comment: 10 pages, 5 figures
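The amplitude-sharing idea described in this abstract can be pictured with a minimal sketch. This is not the authors' implementation: it uses 1-D signals instead of 2-D MRI slices, a naive DFT, and a plain k-means, and the function names (`amplitude`, `kmeans`, `pseudo_modality`) are hypothetical. It also assumes that a pseudo modality is formed by combining a shared amplitude centroid with the phase of the client's own data, which the abstract does not state explicitly.

```python
import cmath
import math
import random


def dft(signal):
    """Naive discrete Fourier transform of a 1-D sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def amplitude(signal):
    """Amplitude spectrum: the magnitude of each frequency coefficient."""
    return [abs(c) for c in dft(signal)]


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means; returns k centroids that summarize the input vectors."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(v, centroids[c])))
            groups[nearest].append(v)
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def pseudo_modality(local_signal, shared_amplitude):
    """Assumed recombination: shared amplitude + phase of the local signal."""
    phases = [cmath.phase(c) for c in dft(local_signal)]
    mixed = [cmath.rect(a, p) for a, p in zip(shared_amplitude, phases)]
    return [x.real for x in idft(mixed)]
```

In this sketch, a client holding the other modality would call `kmeans` on its amplitude spectra and send only the `k` centroids, so the communication cost scales with `k` rather than with the number of images.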
Deep N-ary Error Correcting Output Codes
Ensemble learning consistently improves the performance of multi-class
classification through aggregating a series of base classifiers. To this end,
data-independent ensemble methods like Error Correcting Output Codes (ECOC)
attract increasing attention due to their ease of implementation and
parallelization. Specifically, traditional ECOCs and their general extension,
N-ary ECOC, decompose the original multi-class classification problem into a
series of independent simpler classification subproblems. Unfortunately,
integrating ECOCs, especially N-ary ECOC, with deep neural networks, termed
deep N-ary ECOC, is not straightforward and has not yet been fully exploited in the
literature, due to the high expense of training base learners. To facilitate
the training of N-ary ECOC with deep learning base learners, we propose
three different variants of parameter sharing architectures for deep N-ary
ECOC. To verify the generalization ability of deep N-ary ECOC, we conduct
experiments by varying the backbone with different deep neural network
architectures for both image and text classification tasks. Furthermore,
extensive ablation studies on deep N-ary ECOC show its superior performance
over other deep data-independent ensemble methods.
Comment: EAI MOBIMEDIA 202
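The decomposition that N-ary ECOC performs can be sketched as follows. This is an illustrative sketch only, not the paper's deep, parameter-sharing variant: base learners are left out entirely, Hamming-distance decoding is assumed, and the function names are hypothetical. Each column of the coding matrix relabels the original classes into N meta-classes for one base learner, and a prediction is decoded to the class whose codeword is closest to the base learners' outputs.

```python
import random


def random_nary_code_matrix(num_classes, num_learners, n, seed=0):
    """Random coding matrix: entry [c][l] is the meta-class (0..n-1)
    that base learner l should output for original class c."""
    rng = random.Random(seed)
    return [[rng.randrange(n) for _ in range(num_learners)]
            for _ in range(num_classes)]


def relabel(labels, matrix, learner):
    """Training targets for one base learner: map each original label
    to that learner's meta-class."""
    return [matrix[y][learner] for y in labels]


def decode(predicted_codeword, matrix):
    """Return the class whose codeword has the smallest Hamming distance
    to the vector of base-learner predictions."""
    def hamming(row):
        return sum(a != b for a, b in zip(row, predicted_codeword))
    return min(range(len(matrix)), key=lambda c: hamming(matrix[c]))


# Hand-written 4-class, 6-learner, 3-ary matrix for illustration.
EXAMPLE_MATRIX = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [2, 2, 2, 2, 2, 2],
    [0, 1, 2, 0, 1, 2],
]
```

With `EXAMPLE_MATRIX`, any two codewords differ in at least four positions, so `decode` still recovers the right class when a single base learner is wrong.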
Optimizing the MapReduce Framework on Intel Xeon Phi Coprocessor
With its ease of programming, flexibility, and efficiency, MapReduce has
become one of the most popular frameworks for building big-data applications.
MapReduce was originally designed for distributed computing, and has been
extended to various architectures, e.g., multi-core CPUs, GPUs, and FPGAs. In
this work, we focus on optimizing the MapReduce framework on Xeon Phi, which is
the latest product released by Intel based on the Many Integrated Core
Architecture. To the best of our knowledge, this is the first work to optimize
the MapReduce framework on the Xeon Phi.
In our work, we utilize advanced features of the Xeon Phi to achieve high
performance. In order to take advantage of the SIMD vector processing units, we
propose a vectorization-friendly technique for the map phase to assist
auto-vectorization, and we develop SIMD hash computation algorithms.
Furthermore, we utilize MIMD hyper-threading to pipeline the map and reduce phases to
improve resource utilization. For some applications, we also replace multiple local arrays
with low-cost atomic operations on a single global array, which
can improve the thread scalability and data locality due to the coherent L2
caches. Finally, for a given application, our framework can either
automatically detect suitable techniques to apply or provide guidelines for
users at compilation time. We conduct comprehensive experiments to benchmark
the Xeon Phi and compare our optimized MapReduce framework with a
state-of-the-art multi-core based MapReduce framework (Phoenix++).
Experimental results on six real-world applications show that our
optimized framework is 1.2X to 38X faster than Phoenix++ on the Xeon Phi.
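The contrast between per-thread local arrays and a single global array updated atomically can be illustrated with a small histogram sketch. This is only a conceptual sketch: Python has no hardware atomic add, so a per-bin `threading.Lock` stands in for the atomic operation, and the function names are hypothetical. It says nothing about performance on the Xeon Phi; it only shows that both layouts compute the same result.

```python
import threading


def histogram_local_arrays(data, num_bins, num_threads):
    """Each thread fills its own array; the arrays are merged at the end."""
    local = [[0] * num_bins for _ in range(num_threads)]

    def worker(tid):
        for x in data[tid::num_threads]:
            local[tid][x % num_bins] += 1

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Merge step: sum the per-thread arrays bin by bin.
    return [sum(col) for col in zip(*local)]


def histogram_global_array(data, num_bins, num_threads):
    """All threads update one shared array; a per-bin lock stands in
    for an atomic increment."""
    counts = [0] * num_bins
    locks = [threading.Lock() for _ in range(num_bins)]

    def worker(tid):
        for x in data[tid::num_threads]:
            b = x % num_bins
            with locks[b]:
                counts[b] += 1

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts
```

The global-array version needs no merge step and keeps a single copy of the result, which is the memory-footprint trade-off the abstract alludes to.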