AMPLIFY: Attention-based Mixup for Performance Improvement and Label Smoothing in Transformer
Mixup is an effective data augmentation method that generates new augmented
samples as linear combinations of different original samples. However, if the
original samples contain noise or aberrant features, Mixup may propagate them
to the augmented samples, making the model over-sensitive to these outliers.
To address this problem, this paper proposes a new Mixup method called
AMPLIFY. It uses the Attention mechanism of the Transformer itself to reduce
the influence of noise and aberrant values in the original samples on the
prediction results, without adding trainable parameters and at very low
computational cost, thereby avoiding the high resource consumption of common
Mixup methods such as Sentence Mixup. Experimental results show that, at a
lower computational cost, AMPLIFY outperforms other Mixup methods on text
classification tasks across 7 benchmark datasets, providing new ideas and new
ways to further improve the performance of Attention-based pre-trained models
such as BERT, ALBERT, RoBERTa, and GPT. Our code can be obtained at
https://github.com/kiwi-lilo/AMPLIFY
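For context, the vanilla Mixup operation the abstract builds on can be sketched as follows. This is a minimal illustration of standard Mixup (convex combination of two samples and their labels), not the AMPLIFY attention-based variant itself; the function name and `alpha` default are illustrative choices.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Vanilla Mixup: convex combination of two samples and their one-hot labels."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)  # mixing coefficient in (0, 1)
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y

# Mix two samples with one-hot labels
xa, ya = np.array([1.0, 0.0]), np.array([1.0, 0.0])
xb, yb = np.array([0.0, 1.0]), np.array([0.0, 1.0])
x_mix, y_mix = mixup(xa, ya, xb, yb)
```

Because the combination is convex, any noise present in either original sample is carried into `x_mix` with weight `lam` or `1 - lam`, which is exactly the propagation problem AMPLIFY targets.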
iMetricGAN: Intelligibility Enhancement for Speech-in-Noise using Generative Adversarial Network-based Metric Learning
The intelligibility of natural speech is seriously degraded when exposed to
adverse noisy environments. In this work, we propose a deep learning-based
speech modification method to compensate for the intelligibility loss, with the
constraint that the root mean square (RMS) level and duration of the speech
signal are maintained before and after modifications. Specifically, we utilize
an iMetricGAN approach to optimize the speech intelligibility metrics with
generative adversarial networks (GANs). Experimental results show that the
proposed iMetricGAN outperforms conventional state-of-the-art algorithms in
terms of objective measures, i.e., speech intelligibility in bits (SIIB) and
extended short-time objective intelligibility (ESTOI), under a Cafeteria noise
condition. In addition, formal listening tests reveal significant
intelligibility gains when both noise and reverberation exist.
Comment: 5 pages, submitted to INTERSPEECH 202
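The abstract's constraint that the RMS level of the speech signal is maintained before and after modification can be sketched as a simple rescaling step; the function name here is illustrative and not from the paper.

```python
import numpy as np

def match_rms(modified, reference):
    """Rescale `modified` so its RMS level equals that of `reference`."""
    rms_ref = np.sqrt(np.mean(reference ** 2))
    rms_mod = np.sqrt(np.mean(modified ** 2))
    return modified * (rms_ref / rms_mod)

# Toy signals: the enhanced output is rescaled to the input's RMS level
reference = np.array([0.2, -0.4, 0.3, -0.1])
modified = np.array([0.5, -1.0, 0.8, -0.3])
scaled = match_rms(modified, reference)
```

Keeping RMS and duration fixed ensures that any measured intelligibility gain comes from the spectral/temporal reshaping itself, not from simply playing the speech louder or longer.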
Learning a Stable Dynamic System with a Lyapunov Energy Function for Demonstratives Using Neural Networks
Autonomous Dynamic System (DS)-based algorithms hold a pivotal and
foundational role in the field of Learning from Demonstration (LfD).
Nevertheless, they confront the formidable challenge of striking a delicate
balance between achieving precision in learning and ensuring the overall
stability of the system. In response to this substantial challenge, this paper
introduces a novel DS algorithm rooted in neural network technology. This
algorithm not only possesses the capability to extract critical insights from
demonstration data but also demonstrates the capacity to learn a candidate
Lyapunov energy function that is consistent with the provided data. The model
presented in this paper employs a straightforward neural network architecture
that excels in fulfilling a dual objective: optimizing accuracy while
simultaneously preserving global stability. To comprehensively evaluate the
effectiveness of the proposed algorithm, rigorous assessments are conducted
using the LASA dataset, further reinforced by empirical validation through a
robotic experiment.
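The stability property the learned candidate Lyapunov function must certify can be illustrated with a small numerical check. This is a generic sketch of the Lyapunov decrease condition (V positive, dV/dt = ∇V(x)·f(x) negative along the dynamics), not the paper's network architecture; the toy system and function names are assumptions for illustration.

```python
import numpy as np

def lyapunov_decrease_ok(f, grad_V, points):
    """Check dV/dt = grad_V(x) . f(x) < 0 at a set of nonzero sample points."""
    return all(float(np.dot(grad_V(x), f(x))) < 0 for x in points)

# Toy globally stable DS: f(x) = -x, with quadratic candidate V(x) = ||x||^2
f = lambda x: -x
V = lambda x: float(np.dot(x, x))
grad_V = lambda x: 2 * x

pts = [np.array([1.0, 0.5]), np.array([-2.0, 3.0]), np.array([0.1, -0.1])]
stable = lyapunov_decrease_ok(f, grad_V, pts)
```

In the paper's setting the dynamics f and the candidate V are both represented by neural networks, and training must keep this decrease condition satisfied while fitting the demonstration data, which is the accuracy-versus-stability trade-off the abstract describes.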