Realtime MEG source localization
Iterative gradient methods like Levenberg-Marquardt (LM) are in widespread use for source localization from electroencephalographic (EEG) and magnetoencephalographic (MEG) signals. Unfortunately LM depends sensitively on the initial guess, particularly (and counterintuitively) at higher signal-to-noise ratios, necessitating repeated runs. This, combined with LM's high per-step cost, makes its computational burden quite high. To reduce this burden, we trained a multilayer perceptron (MLP) as a real-time localizer. We used an analytical model of quasistatic electromagnetic propagation through the head to map randomly chosen dipoles to sensor activities, and trained an MLP to invert this mapping in the presence of various sorts of noise. With realistic noise, our MLP is about five hundred times faster than n-start-LM (with n = 4 restarts needed to match its accuracy), while our hybrid MLP-start-LM is about four times more accurate and thirteen times faster than 4-start-LM.
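The training scheme described in the abstract can be sketched in miniature: sample random dipoles, push them through a forward model with added noise, and fit an MLP to the inverse mapping. This is a hedged toy illustration, not the paper's system; a random linear operator stands in for the analytic quasistatic head model, and all dimensions and names are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_sensors, n_train = 32, 2000

# Toy "forward model": dipole position (x, y, z) -> sensor readings.
A = rng.normal(size=(n_sensors, 3))
dipoles = rng.uniform(-1.0, 1.0, size=(n_train, 3))          # random sources
signals = dipoles @ A.T + 0.05 * rng.normal(size=(n_train, n_sensors))  # noisy

# Train the inverse mapping: sensor signals -> dipole location.
mlp = MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
mlp.fit(signals, dipoles)

# Localize a held-out dipole in a single forward pass (the "real-time" step).
true_dipole = rng.uniform(-1.0, 1.0, size=(1, 3))
estimate = mlp.predict(true_dipole @ A.T)
error = np.linalg.norm(estimate - true_dipole)  # small localization error
```

Once trained, localization costs only one forward pass through the network, which is what makes the approach real-time compared with iterative restarts of LM.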
Unleashing the Potential of Regularization Strategies in Learning with Noisy Labels
In recent years, research on learning with noisy labels has focused on
devising novel algorithms that can achieve robustness to noisy training labels
while generalizing to clean data. These algorithms often incorporate
sophisticated techniques, such as noise modeling, label correction, and
co-training. In this study, we demonstrate that a simple baseline using
cross-entropy loss, combined with widely used regularization strategies like
learning rate decay, model weight averaging, and data augmentation, can
outperform state-of-the-art methods. Our findings suggest that employing a
combination of regularization strategies can be more effective than intricate
algorithms in tackling the challenges of learning with noisy labels. While some
of these regularization strategies have been utilized in previous noisy label
learning research, their full potential has not been thoroughly explored. Our
results encourage a reevaluation of benchmarks for learning with noisy labels
and prompt reconsideration of the role of specialized learning algorithms
designed for training with noisy labels.
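One of the regularization strategies named above, model weight averaging, can be sketched in a few lines: keep an exponential moving average (EMA) of the weights during training and evaluate that copy instead of the raw weights. This is a minimal numpy sketch under assumed names and decay values, not the paper's training pipeline; the noisy gradient stands in for the effect of noisy labels.

```python
import numpy as np

def ema_update(ema_weights, weights, decay=0.99):
    """Blend the current weights into the running average."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]

# Toy training loop: weights drift noisily toward an optimum at 1.0;
# the EMA copy tracks a smoothed, less noisy trajectory.
rng = np.random.default_rng(0)
w = [np.zeros(4)]
ema = [np.zeros(4)]
for step in range(5000):
    lr = 0.1 * (0.999 ** step)                       # learning rate decay
    grad = (w[0] - 1.0) + 0.5 * rng.normal(size=4)   # noisy gradient
    w[0] = w[0] - lr * grad
    ema = ema_update(ema, w)
```

Both the decaying learning rate and the EMA damp the influence of late noisy updates, which is one intuition for why such plain regularizers can be competitive on noisy labels.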
Fast Robust Subject-Independent Magnetoencephalographic Source Localization Using an Artificial Neural Network
We describe a system that localizes a single dipole to reasonable accuracy from noisy magnetoencephalographic
(MEG) measurements in real time. At its core is a multilayer perceptron (MLP) trained to map sensor signals and head position to dipole location. Including head position overcomes the previous need to retrain the MLP for each subject and session. The training dataset was generated by mapping randomly chosen dipoles and head positions through an analytic model and adding noise from
real MEG recordings. After training, a localization took 0.7 ms with an average error of 0.90 cm. A few
iterations of a Levenberg-Marquardt routine using the MLP output as its initial guess took 15 ms and improved accuracy to 0.53 cm, which approaches the natural limit on accuracy imposed by noise. We applied these methods to localize single dipole sources from MEG components isolated by blind source separation and compared the estimated locations to those generated by standard manually assisted
commercial software.
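The hybrid refinement step described above, a fast coarse estimate polished by a few Levenberg-Marquardt iterations against the forward model, can be sketched as follows. This is a hedged toy example: the nonlinear map below stands in for the analytic MEG head model, and the perturbed starting point stands in for the MLP's output.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
A = rng.normal(size=(16, 3))

def forward(p):
    """Toy nonlinear forward model: dipole parameters -> sensor signals."""
    return np.tanh(A @ p)

true_p = np.array([0.3, -0.2, 0.5])
measured = forward(true_p) + 0.01 * rng.normal(size=16)

# Stand-in for the MLP's fast initial estimate: close, but not exact.
coarse_guess = true_p + 0.1

# A few Levenberg-Marquardt iterations refine it against the measurements.
fit = least_squares(lambda p: forward(p) - measured, coarse_guess, method="lm")
refined_error = np.linalg.norm(fit.x - true_p)
```

Because the starting point is already near the basin of the true solution, LM converges in a handful of cheap steps, which is the source of the millisecond-scale refinement times reported.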
Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs
Conditioning analysis uncovers the landscape of an optimization objective by
exploring the spectrum of its curvature matrix. This has been well explored
theoretically for linear models. We extend this analysis to deep neural
networks (DNNs) in order to investigate their learning dynamics. To this end,
we propose layer-wise conditioning analysis, which explores the optimization
landscape with respect to each layer independently. Such an analysis is
theoretically supported under mild assumptions that approximately hold in
practice. Based on our analysis, we show that batch normalization (BN) can
stabilize training but can sometimes create the false impression of a local
minimum, which has detrimental effects on learning. In addition, we
experimentally observe that BN can improve the layer-wise conditioning of the
optimization problem. Finally, we find that the last linear layer of a very
deep residual network displays ill-conditioned behavior. We solve this problem
by only adding one BN layer before the last linear layer, which achieves
improved performance over the original and pre-activation residual networks.
Comment: Accepted to ECCV 2020. The code is available at:
https://github.com/huangleiBuaa/LayerwiseC
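The conditioning effect the abstract attributes to BN can be illustrated numerically: standardizing the features entering a linear layer shrinks the condition number of that layer's feature Gram matrix, which governs the local optimization landscape. This is an illustrative numpy sketch (BN without its learned affine part), not the paper's analysis; the feature scales are assumed for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Poorly scaled features, as might enter the last linear layer of a deep net.
X = rng.normal(size=(512, 32)) * np.logspace(0, 3, 32)

def batch_norm(x, eps=1e-5):
    """Per-feature standardization over the batch (BN minus the affine part)."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def cond(x):
    g = x.T @ x / len(x)   # Gram (covariance-like) matrix of the features
    return np.linalg.cond(g)

cond_raw = cond(X)              # huge: feature scales span three decades
cond_bn = cond(batch_norm(X))   # near 1: features are standardized
```

This is consistent with the paper's observation that inserting a single BN layer before an ill-conditioned final linear layer can repair its layer-wise conditioning.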
Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning
Meta-learning has arisen as a successful method for improving training
performance by training over many similar tasks, especially with deep neural
networks (DNNs). However, the theoretical understanding of when and why
overparameterized models such as DNNs can generalize well in meta-learning is
still limited. As an initial step towards addressing this challenge, this paper
studies the generalization performance of overfitted meta-learning under a
linear regression model with Gaussian features. In contrast to a few recent
studies along the same line, our framework allows the number of model
parameters to be arbitrarily larger than the number of features in the ground
truth signal, and hence naturally captures the overparameterized regime in
practical deep meta-learning. We show that the overfitted min-norm
solution of model-agnostic meta-learning (MAML) can be beneficial, which is
similar to the recent remarkable findings on the "benign overfitting" and
"double descent" phenomena in classical (single-task) linear regression.
However, due to the uniqueness of meta-learning such as task-specific gradient
descent inner training and the diversity/fluctuation of the ground-truth
signals among training tasks, we find new and interesting properties that do
not exist in single-task linear regression. We first provide a high-probability
upper bound (under reasonable tightness) on the generalization error, where
certain terms decrease when the number of features increases. Our analysis
suggests that benign overfitting is more significant and easier to observe when
the noise and the diversity/fluctuation of the ground truth of each training
task are large. Under this circumstance, we show that the overfitted min-norm
solution can achieve an even lower generalization error than the
underparameterized solution.