200 research outputs found
Biologically inspired learning in a layered neural net
A feed-forward neural net with adaptable synaptic weights and fixed, zero or
non-zero threshold potentials is studied, in the presence of a global feedback
signal that can only have two values, depending on whether the output of the
network in reaction to its input is right or wrong.
It is found, on the basis of four biologically motivated assumptions, that
only two forms of learning are possible, Hebbian and Anti-Hebbian learning.
Hebbian learning should take place when the output is right, while there should
be Anti-Hebbian learning when the output is wrong.
For the Anti-Hebbian part of the learning rule a particular choice is made,
which guarantees an adequate average neuronal activity without the need of
introducing, by hand, control mechanisms like extremal dynamics. A network with
realistic, i.e., non-zero threshold potentials is shown to perform its task of
realizing the desired input-output relations best if it is sufficiently
diluted, i.e. if only a relatively low fraction of all possible synaptic
connections is realized.
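A minimal sketch of the reward-dependent rule described in this abstract, assuming binary threshold neurons and a plain outer-product weight change. The names `forward`, `feedback_update` and `rate` are hypothetical, and the abstract's particular anti-Hebbian choice is not reproduced here; the sign is simply flipped when the feedback signal says "wrong".

```python
def forward(weights, x, threshold):
    """Binary neurons fire (1) when their summed input exceeds the threshold."""
    return [1 if sum(w * xi for w, xi in zip(row, x)) > threshold else 0
            for row in weights]

def feedback_update(weights, x, y, output_is_right, rate=0.1):
    """Hebbian step when the output is right, anti-Hebbian step when it is wrong."""
    sign = 1.0 if output_is_right else -1.0
    # Each weight w[i][j] changes in proportion to post-activity y[i] times pre-activity x[j].
    return [[w + sign * rate * yi * xj for w, xj in zip(row, x)]
            for row, yi in zip(weights, y)]

weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
x = [1, 0, 1]   # presynaptic activity
y = [1, 0]      # postsynaptic activity
rewarded = feedback_update(weights, x, y, output_is_right=True)
punished = feedback_update(weights, x, y, output_is_right=False)
```

Only synapses between co-active neurons change, and the global feedback signal decides the direction of the change.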
On Batching Variable Size Inputs for Training End-to-End Speech Enhancement Systems
The performance of neural network-based speech enhancement systems is
primarily influenced by the model architecture, whereas training times and
computational resource utilization are primarily affected by training
parameters such as the batch size. Since noisy and reverberant speech mixtures
can have different duration, a batching strategy is required to handle variable
size inputs during training, in particular for state-of-the-art end-to-end
systems. Such strategies usually strike a compromise between zero-padding and
data randomization, and can be combined with a dynamic batch size for a more
consistent amount of data in each batch. However, the effect of these practices
on resource utilization and more importantly network performance is not well
documented. This paper is an empirical study of the effect of different
batching strategies and batch sizes on the training statistics and speech
enhancement performance of a Conv-TasNet, evaluated in both matched and
mismatched conditions. We find that using a small batch size during training
improves performance in both conditions for all batching strategies. Moreover,
using sorted or bucket batching with a dynamic batch size allows for reduced
training time and GPU memory usage while achieving similar performance compared
to random batching with a fixed batch size.
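The trade-off between zero-padding and a dynamic batch size can be sketched as follows. This is an illustrative reading of sorted batching, not the paper's implementation: `sorted_dynamic_batches`, `padding_fraction` and the `max_padded_samples` budget are hypothetical names, and lengths are in arbitrary units.

```python
import random

def sorted_dynamic_batches(lengths, max_padded_samples, seed=0):
    """Group item indices by similar length under a padded-size budget."""
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    batches, current = [], []
    for idx in order:
        # Lengths are ascending, so the new item would be the longest in the batch.
        if current and lengths[idx] * (len(current) + 1) > max_padded_samples:
            batches.append(current)
            current = []
        current.append(idx)
    if current:
        batches.append(current)
    random.Random(seed).shuffle(batches)  # randomize the order of batches only
    return batches

def padding_fraction(lengths, batches):
    """Share of the padded batch contents that is zero-padding."""
    padded = sum(max(lengths[i] for i in b) * len(b) for b in batches)
    return 1.0 - sum(lengths) / padded

lengths = [4, 9, 5, 10, 3, 8]
batches = sorted_dynamic_batches(lengths, max_padded_samples=20)
dynamic_padding = padding_fraction(lengths, batches)
single_batch_padding = padding_fraction(lengths, [list(range(len(lengths)))])
```

In this toy case the sorted, budgeted batches contain far less padding than one batch holding every item, at the cost of less randomization within batches. An item longer than the budget still gets a batch of its own.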
Multi-view self-supervised learning for multivariate variable-channel time series
Labeling of multivariate biomedical time series data is a laborious and
expensive process. Self-supervised contrastive learning alleviates the need for
large, labeled datasets through pretraining on unlabeled data. However, for
multivariate time series data, the set of input channels often varies between
applications, and most existing work does not allow for transfer between
datasets with different sets of input channels. We propose learning one encoder
to operate on all input channels individually. We then use a message passing
neural network to extract a single representation across channels. We
demonstrate the potential of this method by pretraining our model on a dataset
with six EEG channels and then fine-tuning it on a dataset with two different
EEG channels. We compare models with and without the message passing neural
network across different contrastive loss functions. We show that our method,
combined with the TS2Vec loss, outperforms all other methods in most settings. Comment: To appear in proceedings of 2023 IEEE International workshop on
Machine Learning for Signal Processing
Evolution and extinction dynamics in rugged fitness landscapes
Macroevolution is considered as a problem of stochastic dynamics in a system
with many competing agents. Evolutionary events (speciations and extinctions)
are triggered by fitness records found by random exploration of the agents'
fitness landscapes. As a consequence, the average fitness in the system
increases logarithmically with time, while the rate of extinction steadily
decreases. This dynamics is studied by numerical simulations and, in a simpler
mean field version, analytically. We also study the effect of externally added
`mass' extinctions. The predictions for various quantities of paleontological
interest (life-time distributions, distribution of event sizes and behavior of
the rate of extinction) are robust and in good agreement with available data.
Brief versions of parts of this work have been published as Letters (PRL 75,
2055 (1995) and PRL 79, 1413 (1997)). Comment: 30 pages, 9 figures, LaTeX
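The slowing event rate can be illustrated with the simplest record process. The sketch below is not the paper's model: it assumes independent, identically distributed draws as a stand-in for random exploration of a fitness landscape. For such draws, the expected number of records after n steps is the harmonic number H_n, which grows like ln n, and the chance of a new record at step t is 1/t. All function names are hypothetical.

```python
import random

def count_records(values):
    """Count entries that exceed every earlier entry (the first always counts)."""
    best, records = float("-inf"), 0
    for v in values:
        if v > best:
            best, records = v, records + 1
    return records

def harmonic(n):
    """Expected number of records among n i.i.d. continuous draws."""
    return sum(1.0 / t for t in range(1, n + 1))

def mean_records(n_steps, n_trials, seed=0):
    """Average record count over independent random sequences."""
    rng = random.Random(seed)
    total = sum(count_records([rng.random() for _ in range(n_steps)])
                for _ in range(n_trials))
    return total / n_trials

simulated = mean_records(n_steps=200, n_trials=1000)
expected = harmonic(200)
```

Comparing `simulated` with `expected` shows the logarithmic growth in record count that underlies a steadily decreasing rate of record-triggered events.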
Average patterns of spatiotemporal chaos: A boundary effect
Chaotic pattern dynamics in many experimental systems show structured time averages. We suggest that simple universal boundary effects underlie this phenomenon and exemplify them with the Kuramoto-Sivashinsky equation in a finite domain. As in the experiments, averaged patterns in the equation recover global symmetries locally broken in the chaotic field. Plateaus in the average pattern wave number as a function of the system size are observed and studied and the different behaviors at the central and boundary regions are discussed. Finally, the structure strength of average patterns is investigated as a function of system size. We acknowledge the financial support of the Spanish Dirección General de Investigación Científica y Técnica, contract numbers PB94-1167 and PB94-1172. Peer Reviewed
On the effectiveness of partial variance reduction in federated learning with heterogeneous data
Data heterogeneity across clients is a key challenge in federated learning.
Prior works address this by either aligning client and server models or using
control variates to correct client model drift. Although these methods achieve
fast convergence in convex or simple non-convex problems, the performance in
over-parameterized models such as deep neural networks is lacking. In this
paper, we first revisit the widely used FedAvg algorithm in a deep neural
network to understand how data heterogeneity influences the gradient updates
across the neural network layers. We observe that while the feature extraction
layers are learned efficiently by FedAvg, the substantial diversity of the
final classification layers across clients impedes the performance. Motivated
by this, we propose to correct model drift by variance reduction only on the
final layers. We demonstrate that this significantly outperforms existing
benchmarks at a similar or lower communication cost. We furthermore provide
proof for the convergence rate of our algorithm. Comment: Accepted to CVPR 202
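A minimal sketch of correcting drift on the final layers only. It assumes a SCAFFOLD-style control-variate correction (gradient minus client variate plus server variate), which the abstract does not specify. The layer names, `local_step` and the dictionaries of per-layer parameters are all hypothetical.

```python
def local_step(params, grads, client_cv, server_cv, corrected_layers, lr=0.1):
    """One local SGD step; only layers in corrected_layers get drift correction."""
    updated = {}
    for name, weights in params.items():
        grad = grads[name]
        if name in corrected_layers:
            # Assumed correction form: g - c_client + c_server.
            grad = [g - c + s
                    for g, c, s in zip(grad, client_cv[name], server_cv[name])]
        updated[name] = [w - lr * g for w, g in zip(weights, grad)]
    return updated

params = {"features": [1.0], "classifier": [1.0]}
grads = {"features": [2.0], "classifier": [2.0]}
client_cv = {"classifier": [1.5]}   # control variates kept for the final layer only
server_cv = {"classifier": [0.5]}
stepped = local_step(params, grads, client_cv, server_cv, {"classifier"}, lr=0.5)
```

In this sketch the feature layers follow plain local SGD as in FedAvg, and control variates are stored only for the corrected layers.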
Sandpile avalanche dynamics on scale-free networks
Avalanche dynamics is an indispensable feature of complex systems. Here we
study the self-organized critical dynamics of avalanches on scale-free networks
with degree exponent $\gamma$ through the Bak-Tang-Wiesenfeld (BTW) sandpile
model. The threshold height of a node $i$ is set as $k_i^{1-\eta}$ with
$0 \le \eta < 1$, where $k_i$ is the degree of node $i$. Using the branching
process approach, we obtain the avalanche size and the duration distribution of
sand toppling, which follow power-laws with exponents $\tau$ and $\delta$,
respectively. They are given as $\tau = (\gamma - 2\eta)/(\gamma - 1 - \eta)$ and
$\delta = (\gamma - 1 - \eta)/(\gamma - 2)$ for $\gamma < 3 - \eta$, and 3/2 and 2 for
$\gamma > 3 - \eta$, respectively. The power-law distributions are modified by a
logarithmic correction at $\gamma = 3 - \eta$. Comment: 8 pages, elsart style
Modeling, Analysis and Optimization of the Thermal Performance of Air Conditioners
Uncertainty estimation is important for interpreting the trustworthiness of
machine learning models in many applications. This is especially critical in
the data-driven active learning setting where the goal is to achieve a certain
accuracy with minimum labeling effort. In such settings, the model learns to
select the most informative unlabeled samples for annotation based on its
estimated uncertainty. The highly uncertain predictions are assumed to be more
informative for improving model performance. In this paper, we explore
uncertainty calibration within an active learning framework for medical image
segmentation, an area where labels often are scarce. Various uncertainty
estimation methods and acquisition strategies (regions and full images) are
investigated. We observe that selecting regions to annotate instead of full
images leads to more well-calibrated models. Additionally, we experimentally
show that annotating regions can cut 50% of pixels that need to be labeled by
humans compared to annotating full images. Comment: Presented at ICML 2020 Workshop on Uncertainty & Robustness in Deep
Learning
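Region-based acquisition can be sketched as scoring image tiles by predicted uncertainty and sending the highest-scoring ones for annotation. The sketch assumes binary segmentation and per-pixel entropy as the uncertainty measure, which is one common choice among the "various" methods the abstract mentions. The function names, tile size and toy probability map are hypothetical.

```python
import math

def binary_entropy(p):
    """Entropy (in nats) of a foreground probability p; zero when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))

def region_scores(prob_map, size):
    """Mean entropy of each non-overlapping size x size tile, keyed by (row, col)."""
    rows, cols = len(prob_map), len(prob_map[0])
    scores = {}
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            tile = [prob_map[i][j]
                    for i in range(r, min(r + size, rows))
                    for j in range(c, min(c + size, cols))]
            scores[(r, c)] = sum(binary_entropy(p) for p in tile) / len(tile)
    return scores

def select_regions(prob_map, size, k):
    """Return the top-left corners of the k most uncertain tiles."""
    scores = region_scores(prob_map, size)
    return sorted(scores, key=scores.get, reverse=True)[:k]

prob_map = [
    [0.99, 0.98, 0.50, 0.55],
    [0.97, 0.99, 0.45, 0.60],
    [0.01, 0.02, 0.90, 0.95],
    [0.03, 0.01, 0.92, 0.97],
]
chosen = select_regions(prob_map, size=2, k=1)
```

Annotating only the selected tile here would label 4 of 16 pixels, which is the kind of saving region-level selection aims for compared with labeling the full image.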