SNR-Based Teachers-Student Technique for Speech Enhancement
It is very challenging for speech enhancement methods to achieve robust
performance under both high signal-to-noise ratio (SNR) and low SNR
simultaneously. In this paper, we propose a method that integrates an SNR-based
teachers-student technique and time-domain U-Net to deal with this problem.
Specifically, this method consists of multiple teacher models and a student
model. We first train the teacher models under multiple small-range SNRs that
do not coincide with each other so that they can perform speech enhancement
well within the specific SNR range. Then, we choose different teacher models to
supervise the training of the student model according to the SNR of the
training data. Eventually, the student model can perform speech enhancement
under both high SNR and low SNR. To evaluate the proposed method, we
constructed a dataset with SNRs ranging from -20 dB to 20 dB based on a
public dataset. We experimentally analyzed the effectiveness of the SNR-based
teachers-student technique and compared the proposed method with several
state-of-the-art methods.
Comment: Published in 2020 IEEE International Conference on Multimedia and Expo (ICME 2020)
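The teacher-selection step described in this abstract can be illustrated with a minimal sketch. It assumes four teachers with equal-width, non-overlapping ranges covering -20 dB to 20 dB; the abstract does not state the number of teachers or the range boundaries, so `TEACHER_RANGES`, `snr_db`, and `select_teacher` are hypothetical names chosen for illustration only.

```python
import math

# Hypothetical non-overlapping SNR ranges [low, high) in dB, one per teacher.
# The number of teachers and the boundaries are assumptions, not from the paper.
TEACHER_RANGES = [(-20.0, -10.0), (-10.0, 0.0), (0.0, 10.0), (10.0, 20.0)]


def snr_db(clean, noise):
    """Signal-to-noise ratio in dB from clean-speech and noise samples."""
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_clean / p_noise)


def select_teacher(snr, ranges=TEACHER_RANGES):
    """Return the index of the teacher whose SNR range contains `snr`."""
    for i, (low, high) in enumerate(ranges):
        if low <= snr < high:
            return i
    # Outside every range: fall back to the nearest edge teacher.
    return 0 if snr < ranges[0][0] else len(ranges) - 1
```

During student training, each noisy training example would then be supervised by the teacher returned by `select_teacher` for that example's SNR.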
Sub-Band Knowledge Distillation Framework for Speech Enhancement
In single-channel speech enhancement, methods based on full-band spectral
features have been widely studied. However, only a few methods pay attention to
non-full-band spectral features. In this paper, we explore a knowledge
distillation framework based on sub-band spectral mapping for single-channel
speech enhancement. Specifically, we divide the full frequency band into
multiple sub-bands and pre-train an elite-level sub-band enhancement model
(teacher model) for each sub-band. These teacher models are dedicated to
processing their own sub-bands. Next, under the teacher models' guidance, we
train a general sub-band enhancement model (student model) that works for all
sub-bands. Without increasing the number of model parameters and computational
complexity, the student model's performance is further improved. To evaluate
our proposed method, we conducted a large number of experiments on an
open-source data set. The final experimental results show that the guidance
from the elite-level teacher models dramatically improves the student model's
performance, which exceeds that of the full-band model while employing fewer parameters.
Comment: Published in Interspeech 202
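As a rough illustration of the sub-band idea, the sketch below splits one spectral frame into contiguous sub-bands and computes a distillation loss between a student output and a teacher output for one sub-band. The near-equal-width split and the mean-squared-error loss are assumptions made for this example; the abstract does not specify either, and `split_subbands` and `distill_loss` are hypothetical names.

```python
def split_subbands(frame, num_bands):
    """Split one spectral frame into contiguous sub-bands of near-equal width."""
    base, extra = divmod(len(frame), num_bands)
    bands, start = [], 0
    for b in range(num_bands):
        # The first `extra` bands take one additional frequency bin.
        width = base + (1 if b < extra else 0)
        bands.append(frame[start:start + width])
        start += width
    return bands


def distill_loss(student_out, teacher_out):
    """Mean squared error between student and teacher outputs (assumed loss)."""
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / len(student_out)
```

In this picture, each sub-band has its own pre-trained teacher, while a single student is trained on all sub-bands with the matching teacher's output as the target.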
Improving CTC-AED model with integrated-CTC and auxiliary loss regularization
Connectionist temporal classification (CTC) and attention-based encoder
decoder (AED) joint training has been widely applied in automatic speech
recognition (ASR). Unlike most hybrid models that separately calculate the CTC
and AED losses, our proposed integrated-CTC utilizes the attention mechanism of
AED to guide the output of CTC. In this paper, we employ two fusion methods,
namely direct addition of logits (DAL) and preserving the maximum probability
(PMP). We achieve dimensional consistency by adaptively affine transforming the
attention results to match the dimensions of CTC. To accelerate model
convergence and improve accuracy, we introduce auxiliary loss regularization.
Experimental results demonstrate that the DAL
method performs better in attention rescoring, while the PMP method excels in
CTC prefix beam search and greedy search.
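For readers unfamiliar with the CTC greedy search mentioned in this abstract, the following is a minimal sketch of standard greedy CTC decoding: take the best label per frame, collapse consecutive repeats, and drop blanks. It does not implement the paper's DAL or PMP fusion; the function name and the choice of index 0 for the blank label are assumptions.

```python
def ctc_greedy_decode(logits, blank=0):
    """Greedy CTC decoding over per-frame scores.

    `logits` is a list of frames, each a list of scores per label.
    """
    # Best-scoring label index for every frame (first index wins ties).
    best = [max(range(len(frame)), key=frame.__getitem__) for frame in logits]
    decoded, prev = [], None
    for label in best:
        # Emit a label only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded
```

A blank between two identical labels separates them, which is how CTC represents genuinely repeated output symbols.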
NATURE OF THE LATE CARBONIFEROUS TO TRIASSIC MAGMATISM ALONG THE NORTHERN MARGIN OF THE NORTH CHINA BLOCK: LINK WITH THE EVOLUTION OF THE CENTRAL ASIAN OROGEN
There are two episodes of magmatism along the northern margin of the North China block during the Late Carboniferous to Late Triassic, one at 310–250 Ma (Late Carboniferous to Permian) and the other at 235–210 Ma (Late Triassic). The former group comprises plutonic rocks (gabbro-diorite-monzodiorite-monzogranite-granite), mafic to intermediate dykes (diorite to dolerite) and a few felsic volcanics (andesite to dacite).
Leading Power Accuracy in Lattice Calculations of Parton Distributions
In lattice-QCD calculations of parton distribution functions (PDFs) via
large-momentum effective theory, the leading power (twist-three) correction
appears as due to the linear-divergent
self-energy of Wilson line in quasi-PDF operators. For lattice data with hadron
momentum of a few GeV, this correction is dominant in matching, as large
as 30% or more. We show how to eliminate this uncertainty by choosing the
mass renormalization parameter consistently with the resummation scheme of the
infrared-renormalon series in perturbative matching coefficients. An example on
the lattice pion PDF data at GeV shows an improvement of matching
accuracy by a factor of more than in the expansion region .
Comment: Updated to version published on PL
1-Benzyl-2-phenyl-1H-benzimidazole–4,4′-(cyclohexane-1,1-diyl)diphenol (1/1)
The asymmetric unit of the title co-crystal, C20H16N2·C18H20O2, contains one molecule of 4,4′-(cyclohexane-1,1-diyl)diphenol (in which the cyclohexane ring adopts a chair conformation) and one molecule of 1-benzyl-2-phenyl-1H-benzimidazole, which are paired through an O—H⋯N hydrogen bond. These pairs are further linked by intermolecular O—H⋯O hydrogen bonds into chains along [010]. Weak intermolecular C—H⋯O and C—H⋯π interactions further consolidate the crystal packing. The dihedral angles between the pendant phenyl rings and the benzimidazole ring are 86.9 (2) and 43.1 (2)°.
Phase Reversal Diffraction in incoherent light
Phase reversal occurs in the propagation of an electromagnetic wave in a
negatively refracting medium or a phase-conjugate interface. Here we report the
experimental observation of phase reversal diffraction without the above
devices. Our experimental results and theoretical analysis demonstrate that
phase reversal diffraction can be formed through the first-order field
correlation of chaotic light. The experimental realization is similar to phase
reversal behavior in negatively refracting media.
Comment: 8 pages, 5 figures
UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition
Speech enhancement under extremely low signal-to-noise ratio (SNR) conditions is
a very challenging problem and has rarely been investigated in previous work. This
paper proposes a robust speech enhancement approach (UNetGAN) based on U-Net
and generative adversarial learning to deal with this problem. This approach
consists of a generator network and a discriminator network, which operate
directly in the time domain. The generator network adopts a U-Net-like
structure and employs dilated convolution in its bottleneck. We evaluate
the performance of UNetGAN under low SNR conditions (down to -20 dB) on a
public benchmark. The results demonstrate that it significantly improves
speech quality and substantially outperforms the representative deep learning
models, including SEGAN, cGAN for SE, bidirectional LSTM using phase-sensitive
spectrum approximation cost function (PSA-BLSTM) and Wave-U-Net regarding
Short-Time Objective Intelligibility (STOI) and Perceptual Evaluation of Speech
Quality (PESQ).
Comment: Published in Interspeech 201
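The dilated convolution used in the generator's bottleneck can be illustrated with a minimal one-dimensional sketch. This is a generic textbook operation, not the UNetGAN layer itself: the function name, the "valid" padding, and the cross-correlation convention (no kernel flip, as is usual in deep learning) are assumptions.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation convention).

    Kernel taps are spaced `dilation` samples apart, so the receptive
    field is (len(kernel) - 1) * dilation + 1 samples.
    """
    span = (len(kernel) - 1) * dilation
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]
```

With a three-tap kernel, raising the dilation from 1 to 2 widens the receptive field from 3 to 5 samples without adding parameters, which is why dilation is attractive for capturing longer context in time-domain models.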
Synthesis and in vitro antimicrobial SAR of benzyl and phenyl guanidine and aminoguanidine hydrazone derivatives
A series of benzyl, phenyl guanidine, and aminoguanidine hydrazone derivatives was designed, and their in vitro antibacterial activities against two different bacterial strains (Staphylococcus aureus and Escherichia coli) were determined. Several compounds showed potent inhibitory activity against the bacterial strains evaluated, with minimal inhibitory concentration (MIC) values in the low µg/mL range. Of all guanidine derivatives, 3-[2-chloro-3-(trifluoromethyl)]-benzyloxy derivative 9m showed the best potency, with MICs of 0.5 µg/mL (S. aureus) and 1 µg/mL (E. coli), respectively. Several aminoguanidine hydrazone derivatives also showed good overall activity. Compounds 10a, 10j, and 10r–s displayed MICs of 4 µg/mL against both S. aureus and E. coli. In the aminoguanidine hydrazone series, 3-(4-trifluoromethyl)-benzyloxy derivative 10d showed the best potency against S. aureus (MIC 1 µg/mL) but was far less active against E. coli (MIC 16 µg/mL). Compound 9m and the para-substituted derivative 9v also showed promising results against two strains of methicillin-resistant Staphylococcus aureus (MRSA). These results provide new and potent structural leads for further antibiotic optimisation strategies.