164 research outputs found

    SNR-Based Teachers-Student Technique for Speech Enhancement

    It is very challenging for speech enhancement methods to achieve robust performance under both high and low signal-to-noise ratio (SNR) conditions simultaneously. In this paper, we propose a method that integrates an SNR-based teachers-student technique and a time-domain U-Net to deal with this problem. Specifically, this method consists of multiple teacher models and a student model. We first train the teacher models under multiple small, non-overlapping SNR ranges so that each performs speech enhancement well within its specific SNR range. Then, we choose different teacher models to supervise the training of the student model according to the SNR of the training data. Eventually, the student model can perform speech enhancement under both high SNR and low SNR. To evaluate the proposed method, we constructed a dataset with SNRs ranging from -20 dB to 20 dB based on a public dataset. We experimentally analyzed the effectiveness of the SNR-based teachers-student technique and compared the proposed method with several state-of-the-art methods. Comment: Published in 2020 IEEE International Conference on Multimedia and Expo (ICME 2020)
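
The teacher-selection step described in this abstract can be illustrated with a minimal sketch. The four SNR ranges and the names `TEACHER_RANGES`, `snr_db`, and `select_teacher` are hypothetical; the abstract does not state the actual ranges or how the SNR of a training sample is measured.

```python
import math

# Hypothetical, non-overlapping SNR ranges in dB, one per teacher model.
# The abstract only says the ranges are small and do not coincide.
TEACHER_RANGES = [(-20.0, -10.0), (-10.0, 0.0), (0.0, 10.0), (10.0, 20.0)]


def snr_db(clean, noise):
    """Return the SNR in dB of a clean signal relative to additive noise."""
    signal_power = sum(s * s for s in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(signal_power / noise_power)


def select_teacher(snr):
    """Return the index of the teacher whose SNR range contains `snr`.

    Lower bounds are inclusive and upper bounds exclusive, except that the
    last range also includes its upper bound.
    """
    lowest, highest = TEACHER_RANGES[0][0], TEACHER_RANGES[-1][1]
    if not lowest <= snr <= highest:
        raise ValueError(f"SNR {snr} dB is outside the covered range")
    for index, (_, high) in enumerate(TEACHER_RANGES):
        if snr < high:
            return index
    return len(TEACHER_RANGES) - 1
```

During student training, each sample would then be supervised by the teacher returned by `select_teacher(snr_db(clean, noise))`; how that supervision enters the loss is not specified in the abstract.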

    Sub-Band Knowledge Distillation Framework for Speech Enhancement

    In single-channel speech enhancement, methods based on full-band spectral features have been widely studied. However, only a few methods pay attention to non-full-band spectral features. In this paper, we explore a knowledge distillation framework based on sub-band spectral mapping for single-channel speech enhancement. Specifically, we divide the full frequency band into multiple sub-bands and pre-train an elite-level sub-band enhancement model (teacher model) for each sub-band. Each teacher model is dedicated to processing its own sub-band. Next, under the teacher models' guidance, we train a general sub-band enhancement model (student model) that works for all sub-bands. The student model's performance is thereby further improved without increasing the number of model parameters or the computational complexity. To evaluate our proposed method, we conducted a large number of experiments on an open-source data set. The final experimental results show that the guidance from the elite-level teacher models dramatically improves the student model's performance, which exceeds that of the full-band model while employing fewer parameters. Comment: Published in Interspeech 202
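
The sub-band division mentioned in this abstract can be sketched as follows. This is an illustration only: it assumes contiguous, non-overlapping sub-bands of near-equal width, which the abstract does not confirm, and the function name `split_into_subbands` is hypothetical.

```python
def split_into_subbands(spectrum, num_subbands):
    """Split one frame of frequency bins into contiguous sub-bands.

    Assumption: sub-bands are non-overlapping and as equal in width as
    possible; any leftover bins go to the lowest sub-bands.
    """
    if num_subbands <= 0 or num_subbands > len(spectrum):
        raise ValueError("num_subbands must be between 1 and len(spectrum)")
    base, extra = divmod(len(spectrum), num_subbands)
    subbands = []
    start = 0
    for i in range(num_subbands):
        width = base + (1 if i < extra else 0)
        subbands.append(spectrum[start:start + width])
        start += width
    return subbands
```

Under this reading, each teacher would be trained on one fixed sub-band index, while the single student would see every sub-band.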

    Improving CTC-AED model with integrated-CTC and auxiliary loss regularization

    Joint training of connectionist temporal classification (CTC) and attention-based encoder-decoder (AED) models has been widely applied in automatic speech recognition (ASR). Unlike most hybrid models that separately calculate the CTC and AED losses, our proposed integrated-CTC utilizes the attention mechanism of AED to guide the output of CTC. In this paper, we employ two fusion methods, namely direct addition of logits (DAL) and preserving the maximum probability (PMP). We achieve dimensional consistency by applying an adaptive affine transformation to the attention results so that they match the dimensions of CTC. To accelerate model convergence and improve accuracy, we introduce auxiliary loss regularization. Experimental results demonstrate that the DAL method performs better in attention rescoring, while the PMP method excels in CTC prefix beam search and greedy search.
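
A minimal sketch of the direct addition of logits (DAL) for a single frame is given below. It rests on several assumptions: the attention result is taken to be already aligned with the CTC frame, the affine weights are fixed here whereas the abstract describes them as adaptive, and the names `affine` and `fuse_dal` are hypothetical. The PMP method is not sketched because the abstract gives no detail on it.

```python
def affine(vector, weights, bias):
    """Map `vector` to len(bias) dimensions: weights @ vector + bias."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def fuse_dal(ctc_logits, attention_output, weights, bias):
    """Project the attention output to the CTC dimension, then add logits."""
    projected = affine(attention_output, weights, bias)
    if len(projected) != len(ctc_logits):
        raise ValueError("projected attention output must match CTC dimension")
    return [c + p for c, p in zip(ctc_logits, projected)]
```

For example, `fuse_dal([1.0, 2.0, 3.0], [1.0, 2.0], [[1, 0], [0, 1], [1, 1]], [0, 0, 1])` projects the two-dimensional attention output to three dimensions before the element-wise addition.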

    NATURE OF THE LATE CARBONIFEROUS TO TRIASSIC MAGMATISM ALONG THE NORTHERN MARGIN OF THE NORTH CHINA BLOCK: LINK WITH THE EVOLUTION OF THE CENTRAL ASIAN OROGEN

    There are two episodes of magmatism along the northern margin of the North China block during the Late Carboniferous to Late Triassic, one at 310–250 Ma (Late Carboniferous to Permian) and the other at 235–210 Ma (Late Triassic). The former group comprises plutonic rocks (gabbro-diorite-monzodiorite-monzogranite-granite), mafic to intermediate dykes (diorite to dolerite) and a few felsic volcanics (andesite to dacite).

    Leading Power Accuracy in Lattice Calculations of Parton Distributions

    In lattice-QCD calculations of parton distribution functions (PDFs) via large-momentum effective theory, the leading power (twist-three) correction appears as ${\cal O}(\Lambda_{\rm QCD}/P^z)$ due to the linearly divergent self-energy of the Wilson line in quasi-PDF operators. For lattice data with hadron momentum $P^z$ of a few GeV, this correction is dominant in matching, as large as 30% or more. We show how to eliminate this uncertainty by choosing the mass renormalization parameter consistently with the resummation scheme of the infrared-renormalon series in perturbative matching coefficients. An example using lattice pion PDF data at $P^z = 1.9$ GeV shows an improvement of matching accuracy by a factor of more than $3\sim 5$ in the expansion region $x = 0.2\sim 0.5$. Comment: Updated to version published on PL

    1-Benzyl-2-phenyl-1H-benzimidazole–4,4′-(cyclohexane-1,1-diyl)diphenol (1/1)

    The asymmetric unit of the title co-crystal, C20H16N2·C18H20O2, contains one molecule of 4,4′-(cyclohexane-1,1-diyl)diphenol (in which the cyclohexane ring adopts a chair conformation) and one molecule of 1-benzyl-2-phenyl-1H-benzimidazole, which are paired through an O—H⋯N hydrogen bond. These pairs are further linked by intermolecular O—H⋯O hydrogen bonds into chains along [010]. Weak intermolecular C—H⋯O and C—H⋯π interactions further consolidate the crystal packing. The dihedral angles between the pendant phenyl rings and the benzimidazole ring are 86.9 (2) and 43.1 (2)°.

    Phase Reversal Diffraction in incoherent light

    Phase reversal occurs in the propagation of an electromagnetic wave in a negatively refracting medium or at a phase-conjugate interface. Here we report the experimental observation of phase reversal diffraction without the above devices. Our experimental results and theoretical analysis demonstrate that phase reversal diffraction can be formed through the first-order field correlation of chaotic light. The experimental realization is similar to phase reversal behavior in negatively refracting media. Comment: 8 pages, 5 figures

    UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition

    Speech enhancement under extremely low signal-to-noise ratio (SNR) conditions is a very challenging problem and has rarely been investigated in previous work. This paper proposes a robust speech enhancement approach (UNetGAN) based on U-Net and generative adversarial learning to deal with this problem. This approach consists of a generator network and a discriminator network, which operate directly in the time domain. The generator network adopts a U-Net-like structure and employs dilated convolution in its bottleneck. We evaluate the performance of UNetGAN under low SNR conditions (as low as -20 dB) on a public benchmark. The results demonstrate that it significantly improves speech quality and substantially outperforms representative deep learning models, including SEGAN, cGAN for SE, Bidirectional LSTM using a phase-sensitive spectrum approximation cost function (PSA-BLSTM), and Wave-U-Net, in terms of Short-Time Objective Intelligibility (STOI) and Perceptual Evaluation of Speech Quality (PESQ). Comment: Published in Interspeech 201

    Synthesis and in vitro antimicrobial SAR of benzyl and phenyl guanidine and aminoguanidine hydrazone derivatives

    A series of benzyl and phenyl guanidine and aminoguanidine hydrazone derivatives was designed, and their in vitro antibacterial activities against two different bacterial strains (Staphylococcus aureus and Escherichia coli) were determined. Several compounds showed potent inhibitory activity against the bacterial strains evaluated, with minimal inhibitory concentration (MIC) values in the low µg/mL range. Of all guanidine derivatives, 3-[2-chloro-3-(trifluoromethyl)]-benzyloxy derivative 9m showed the best potency, with MICs of 0.5 µg/mL (S. aureus) and 1 µg/mL (E. coli). Several aminoguanidine hydrazone derivatives also showed good overall activity. Compounds 10a, 10j, and 10r–s displayed MICs of 4 µg/mL against both S. aureus and E. coli. In the aminoguanidine hydrazone series, 3-(4-trifluoromethyl)-benzyloxy derivative 10d showed the best potency against S. aureus (MIC 1 µg/mL) but was far less active against E. coli (MIC 16 µg/mL). Compound 9m and the para-substituted derivative 9v also showed promising results against two strains of methicillin-resistant Staphylococcus aureus (MRSA). These results provide new and potent structural leads for further antibiotic optimisation strategies.