Hierarchical Deep Feature Learning For Decoding Imagined Speech From EEG
We propose a mixed deep neural network strategy, incorporating a parallel
combination of Convolutional Neural Networks (CNN) and Recurrent Neural
Networks (RNN), cascaded with deep autoencoders and fully connected layers,
for the automatic identification of imagined speech from EEG. Instead of
utilizing raw EEG channel data, we compute the joint variability of the
channels in the form of a covariance matrix, which provides a spatio-temporal
representation of the EEG. The networks are trained hierarchically, and the
extracted features are passed on to the next level of the network hierarchy
until the final classification. Using a publicly available EEG-based speech
imagery database, we demonstrate an accuracy improvement of around 23.45%
over the baseline method. Our approach demonstrates the promise of mixed DNN
architectures for complex spatio-temporal classification problems.
Comment: Accepted in AAAI 2019 under the Student Abstract and Poster Program
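A minimal PyTorch sketch of the two ideas the abstract names, the channel-covariance input and the parallel CNN/RNN branches, is given below. All shapes, layer sizes, and the class count are hypothetical, and the hierarchical training with autoencoders is omitted:

```python
import torch
import torch.nn as nn

def channel_covariance(eeg):
    """Channel covariance of an EEG window.
    eeg: (batch, channels, time) -> (batch, channels, channels)"""
    eeg = eeg - eeg.mean(dim=2, keepdim=True)             # zero-mean per channel
    return eeg @ eeg.transpose(1, 2) / (eeg.shape[2] - 1)

class ParallelCNNRNN(nn.Module):
    """Hypothetical parallel CNN + RNN branches over the covariance matrix,
    with concatenated features fed to a fully connected classifier."""
    def __init__(self, n_channels=64, n_classes=5):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())            # -> 16*4*4 features
        self.rnn = nn.GRU(n_channels, 32, batch_first=True)  # matrix rows as a sequence
        self.fc = nn.Linear(16 * 4 * 4 + 32, n_classes)

    def forward(self, cov):                  # cov: (batch, C, C)
        f_cnn = self.cnn(cov.unsqueeze(1))   # CNN branch on the matrix "image"
        _, h = self.rnn(cov)                 # RNN branch over matrix rows
        return self.fc(torch.cat([f_cnn, h[-1]], dim=1))

x = torch.randn(8, 64, 256)  # 8 trials, 64 channels, 256 time samples
print(ParallelCNNRNN()(channel_covariance(x)).shape)  # torch.Size([8, 5])
```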
Towards Automatic Speech Identification from Vocal Tract Shape Dynamics in Real-time MRI
Vocal tract configurations play a vital role in generating distinguishable
speech sounds by modulating the airflow and creating different resonant
cavities in speech production. They contain abundant information that can be
utilized to better understand the underlying speech production mechanism. As a
step towards the automatic mapping of vocal tract shape geometry to acoustics,
this paper employs effective video action recognition techniques, such as
Long-term Recurrent Convolutional Network (LRCN) models, to identify different
vowel-consonant-vowel (VCV) sequences from the dynamic shaping of the vocal
tract. Such a model combines a CNN-based deep hierarchical visual feature
extractor with recurrent networks, which ideally makes the network
spatio-temporally deep enough to learn the sequential dynamics of a short video
clip for video classification tasks. We use a database consisting of 2D
real-time MRI of vocal tract shaping during VCV utterances by 17 speakers. The
comparative performance of this class of algorithms under various parameter
settings and for various classification tasks is discussed. Interestingly, the
results show a marked difference in model performance on speech classification
compared with generic sequence or video classification tasks.
Comment: To appear in the INTERSPEECH 2018 Proceedings
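The LRCN idea, a shared CNN extracting per-frame features that an LSTM then models over time, can be sketched as follows. This is a minimal illustration; the frame resolution, clip length, layer sizes, and number of VCV classes are assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class LRCN(nn.Module):
    """Minimal LRCN-style model: a shared CNN extracts per-frame features
    from an rtMRI clip, and an LSTM models their temporal dynamics."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())   # -> 32*4*4 per frame
        self.lstm = nn.LSTM(32 * 4 * 4, 128, batch_first=True)
        self.fc = nn.Linear(128, n_classes)

    def forward(self, clip):                          # (batch, T, 1, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1))          # CNN on every frame
        _, (h, _) = self.lstm(feats.view(b, t, -1))   # LSTM over the sequence
        return self.fc(h[-1])                         # classify from last state

clip = torch.randn(4, 20, 1, 68, 68)  # 4 clips of 20 grayscale MRI frames
print(LRCN()(clip).shape)             # torch.Size([4, 10])
```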
Rethinking Semi-Supervised Federated Learning: How to co-train fully-labeled and fully-unlabeled client imaging data
The most challenging, yet practical, setting of semi-supervised federated
learning (SSFL) is one where a few clients have fully labeled data whereas the
other clients have fully unlabeled data. This is particularly common in
healthcare settings, where collaborating partners (typically hospitals) may
have images but not annotations. The bottleneck in this setting is the joint
training of labeled and unlabeled clients, as the objective function for each
client varies with the availability of labels. This paper investigates an
alternative way to train effectively with labeled and unlabeled clients in a
federated setting. We propose a novel learning scheme specifically designed
for SSFL, which we call Isolated Federated Learning (IsoFed), that circumvents
the problem by avoiding the naive averaging of supervised and semi-supervised
models. In particular, our training approach consists of two parts: (a)
isolated aggregation of labeled and unlabeled client models, and (b) local
self-supervised pretraining of the isolated global models in all clients. We
evaluate model performance on publicly available medical image datasets of
four different modalities from the biomedical image classification benchmark
MedMNIST. We further vary the proportion of labeled clients and the degree of
heterogeneity to demonstrate the effectiveness of the proposed method under
varied experimental settings.
Comment: Published in MICCAI 2023 with early acceptance and selected as 1 of
the top 20 poster highlights under the category: Which work has the potential
to impact other applications of AI and C
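Step (a), the isolated aggregation, can be sketched as below: instead of one FedAvg over all clients, the labeled and unlabeled client models are averaged within their own groups. This is an illustrative reading of the abstract, not the paper's implementation; step (b) is only noted in a comment:

```python
from copy import deepcopy

def fed_avg(states, weights):
    """Weighted parameter average (a standard FedAvg step)."""
    total = sum(weights)
    avg = deepcopy(states[0])
    for key in avg:
        avg[key] = sum(w * s[key] for s, w in zip(states, weights)) / total
    return avg

def isolated_aggregation(client_states, client_sizes, is_labeled):
    """IsoFed-style isolated aggregation (sketch): supervised and
    semi-supervised client models are averaged within their own groups
    rather than merged into a single global model."""
    lab = [i for i, l in enumerate(is_labeled) if l]
    unl = [i for i, l in enumerate(is_labeled) if not l]
    g_lab = fed_avg([client_states[i] for i in lab],
                    [client_sizes[i] for i in lab])
    g_unl = fed_avg([client_states[i] for i in unl],
                    [client_sizes[i] for i in unl])
    # Both isolated global models would then be broadcast to every client,
    # where step (b), local self-supervised pretraining, takes place.
    return g_lab, g_unl
```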
Post-Deployment Adaptation with Access to Source Data via Federated Learning and Source-Target Remote Gradient Alignment
Deployment of Deep Neural Networks in medical imaging is hindered by the
distribution shift between training data and the data processed after
deployment, which causes performance degradation. Post-Deployment Adaptation
(PDA) addresses this by tailoring a pre-trained, deployed model to the target
data distribution using limited labelled or entirely unlabelled target data,
while assuming no access to the source training data, as they cannot be
deployed with the model due to privacy concerns and their large size. This
makes reliable adaptation challenging, as the learning signal is limited. This
paper challenges that assumption and introduces FedPDA, a novel adaptation
framework that brings the ability of Federated Learning to learn from remote
data into PDA. FedPDA enables a deployed model to obtain information from
source data via remote gradient exchange, while aiming to optimize the model
specifically for the target domain. Tailored for FedPDA, we introduce
StarAlign (Source-Target Remote Gradient Alignment), a novel optimization
method that aligns gradients between source-target domain pairs by maximizing
their inner product, to facilitate learning a target-specific model. We
demonstrate the method's effectiveness using multi-center databases for the
tasks of cancer metastasis detection and skin lesion classification, where our
method compares favourably to previous work. Code is available at:
https://github.com/FelixWag/StarAlign
Comment: This version was accepted for the Machine Learning in Medical Imaging
(MLMI 2023) workshop at MICCAI 2023
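The inner-product idea behind gradient alignment can be sketched as a single optimization step: fit the target loss while rewarding agreement between source and target gradients. The sketch below computes both losses locally and uses a hypothetical weighting lam; the remote, federated gradient exchange of FedPDA is abstracted away, so this is not the paper's exact procedure:

```python
import torch

def align_step(model, loss_src, loss_tgt, lr=1e-3, lam=0.1):
    """One gradient-alignment step (sketch): minimize the target loss while
    rewarding a large inner product between source and target gradients.
    loss_src and loss_tgt must be scalar losses built from the same model."""
    params = [p for p in model.parameters() if p.requires_grad]
    g_src = torch.autograd.grad(loss_src, params, create_graph=True)
    g_tgt = torch.autograd.grad(loss_tgt, params, create_graph=True)
    # Inner product of the two gradient directions across all parameters.
    inner = sum((gs * gt).sum() for gs, gt in zip(g_src, g_tgt))
    objective = loss_tgt - lam * inner      # fit the target, align with source
    final_grads = torch.autograd.grad(objective, params)
    with torch.no_grad():
        for p, g in zip(params, final_grads):
            p -= lr * g
    return inner.item()  # alignment diagnostic for monitoring
```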