
164 research outputs found

SNR-Based Teachers-Student Technique for Speech Enhancement

Author: Gao Guanglai
Hao Xiang
Su Xiangdong
Wang Zhiyu
Xu Huali
Zhang Qiang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/10/2020

It is very challenging for speech enhancement methods to achieve robust performance under both high signal-to-noise ratio (SNR) and low SNR simultaneously. In this paper, we propose a method that integrates an SNR-based teachers-student technique and time-domain U-Net to deal with this problem. Specifically, this method consists of multiple teacher models and a student model. We first train the teacher models under multiple small-range SNRs that do not coincide with each other so that they can perform speech enhancement well within the specific SNR range. Then, we choose different teacher models to supervise the training of the student model according to the SNR of the training data. Eventually, the student model can perform speech enhancement under both high SNR and low SNR. To evaluate the proposed method, we constructed a dataset with an SNR ranging from -20 dB to 20 dB based on the public dataset. We experimentally analyzed the effectiveness of the SNR-based teachers-student technique and compared the proposed method with several state-of-the-art methods. Comment: Published in 2020 IEEE International Conference on Multimedia and Expo (ICME 2020)

arXiv.org e-Print Archive

Crossref

Sub-Band Knowledge Distillation Framework for Speech Enhancement

Author: Gao Guanglai
Hao Xiang
Li Xiaofei
Liu Yun
Su Xiangdong
Wen Shixue
Publication venue: 'International Speech Communication Association'
Publication date: 29/10/2020

In single-channel speech enhancement, methods based on full-band spectral features have been widely studied. However, only a few methods pay attention to non-full-band spectral features. In this paper, we explore a knowledge distillation framework based on sub-band spectral mapping for single-channel speech enhancement. Specifically, we divide the full frequency band into multiple sub-bands and pre-train an elite-level sub-band enhancement model (teacher model) for each sub-band. These teacher models are dedicated to processing their own sub-bands. Next, under the teacher models' guidance, we train a general sub-band enhancement model (student model) that works for all sub-bands. Without increasing the number of model parameters and computational complexity, the student model's performance is further improved. To evaluate our proposed method, we conducted a large number of experiments on an open-source data set. The final experimental results show that the guidance from the elite-level teacher models dramatically improves the student model's performance, which exceeds that of the full-band model while employing fewer parameters. Comment: Published in Interspeech 202

arXiv.org e-Print Archive

Crossref

Improving CTC-AED model with integrated-CTC and auxiliary loss regularization

Author: Su Xiangdong
Zhang Hongbin
Zhu Daobin
Publication date: 14/08/2023

Connectionist temporal classification (CTC) and attention-based encoder-decoder (AED) joint training has been widely applied in automatic speech recognition (ASR). Unlike most hybrid models that separately calculate the CTC and AED losses, our proposed integrated-CTC utilizes the attention mechanism of AED to guide the output of CTC. In this paper, we employ two fusion methods, namely direct addition of logits (DAL) and preserving the maximum probability (PMP). We achieve dimensional consistency by adaptively affine transforming the attention results to match the dimensions of CTC. To accelerate model convergence and improve accuracy, we introduce auxiliary loss regularization. Experimental results demonstrate that the DAL method performs better in attention rescoring, while the PMP method excels in CTC prefix beam search and greedy search.

arXiv.org e-Print Archive

NATURE OF THE LATE CARBONIFEROUS TO TRIASSIC MAGMATISM ALONG THE NORTHERN MARGIN OF THE NORTH CHINA BLOCK: LINK WITH THE EVOLUTION OF THE CENTRAL ASIAN OROGEN

Author: Mingguo Zhai
Peng Peng
Taiping Zhao
Xiangdong Su
Yanyan Zhou
Publication venue: 'Institute of Earth's Crust, Siberian Branch of the Russian Academy of Sciences'
Publication date: 12/10/2017

There are two episodes of magmatism along the northern margin of the North China block during the Late Carboniferous to Late Triassic, one at 310–250 Ma (Late Carboniferous to Permian) and the other at 235–210 Ma (Late Triassic). The former group comprises plutonic rocks (gabbro-diorite-monzodiorite-monzogranite-granite), mafic to intermediate dykes (diorite to dolerite) and a few felsic volcanics (andesite to dacite).

Geodynamics & Tectonophysics (E-Journal) / Геодинамика и тектонофизика

Leading Power Accuracy in Lattice Calculations of Parton Distributions

Author: Holligan Jack
Ji Xiangdong
Su Yushan
Zhang Rui
Publication date: 20/07/2023

In lattice-QCD calculations of parton distribution functions (PDFs) via large-momentum effective theory, the leading power (twist-three) correction appears as ${\cal O}(\Lambda_{\rm QCD}/P^z)$ due to the linear-divergent self-energy of Wilson line in quasi-PDF operators. For lattice data with hadron momentum $P^z$ of a few GeV, this correction is dominant in matching, as large as 30% or more. We show how to eliminate this uncertainty through choosing the mass renormalization parameter consistently with the resummation scheme of the infrared-renormalon series in perturbative matching coefficients. An example on the lattice pion PDF data at $P^z = 1.9$ GeV shows an improvement of matching accuracy by a factor of more than $3\sim 5$ in the expansion region $x = 0.2\sim 0.5$. Comment: Updated to version published on PL

arXiv.org e-Print Archive

1-Benzyl-2-phenyl-1H-benzimidazole–4,4′-(cyclohexane-1,1-diyl)diphenol (1/1)

Author: Ge Chunhua
Li Su
Zhang Rui
Zhang Xiangdong
Publication venue: International Union of Crystallography
Publication date: 01/07/2011

The asymmetric unit of the title co-crystal, C20H16N2·C18H20O2, contains one molecule of 4,4′-(cyclohexane-1,1-diyl)diphenol (in which the cyclohexane ring adopts a chair conformation) and one molecule of 1-benzyl-2-phenyl-1H-benzimidazole, which are paired through an O—H⋯N hydrogen bond. These pairs are further linked by intermolecular O—H⋯O hydrogen bonds into chains along [010]. Weak intermolecular C—H⋯O and C—H⋯π interactions further consolidate the crystal packing. The dihedral angles between the pendant phenyl rings and the benzimidazole ring are 86.9 (2) and 43.1 (2)°.

Crossref

Directory of Open Access Journals

PubMed Central

Phase Reversal Diffraction in incoherent light

Author: De-Zhong Cao
J. W. Goodman
Jun Xiong
Kaige Wang
Shu Gan
Su-Heng Zhang
Xiangdong Zhang
Publication venue: 'American Physical Society (APS)'
Publication date: 11/08/2009

Phase reversal occurs in the propagation of an electromagnetic wave in a negatively refracting medium or a phase-conjugate interface. Here we report the experimental observation of phase reversal diffraction without the above devices. Our experimental results and theoretical analysis demonstrate that phase reversal diffraction can be formed through the first-order field correlation of chaotic light. The experimental realization is similar to phase reversal behavior in negatively refracting media. Comment: 8 pages, 5 figures

arXiv.org e-Print Archive

Crossref

UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition

Author: Batushiren
Hao Xiang
Su Xiangdong
Wang Zhiyu
Zhang Hui
Publication venue: 'International Speech Communication Association'
Publication date: 29/10/2020

Speech enhancement under extremely low signal-to-noise ratio (SNR) conditions is a very challenging problem and has rarely been investigated in previous works. This paper proposes a robust speech enhancement approach (UNetGAN) based on U-Net and generative adversarial learning to deal with this problem. This approach consists of a generator network and a discriminator network, which operate directly in the time domain. The generator network adopts a U-Net-like structure and employs dilated convolution in its bottleneck. We evaluate the performance of the UNetGAN at low SNR conditions (down to -20 dB) on the public benchmark. The result demonstrates that it significantly improves the speech quality and substantially outperforms the representative deep learning models, including SEGAN, cGAN for SE, Bidirectional LSTM using phase-sensitive spectrum approximation cost function (PSA-BLSTM) and Wave-U-Net regarding Short-Time Objective Intelligibility (STOI) and Perceptual Evaluation of Speech Quality (PESQ). Comment: Published in Interspeech 201

arXiv.org e-Print Archive

Crossref

Synthesis and in vitro antimicrobial SAR of benzyl and phenyl guanidine and aminoguanidine hydrazone derivatives

Author: Dohle Wolfgang
Dudley Edward
Nigam Yamni
Potter Barry V. L.
Su Xiangdong
Publication venue: 'MDPI AG'
Publication date: 20/12/2022

A series of benzyl, phenyl guanidine, and aminoguanidine hydrazone derivatives was designed and in vitro antibacterial activities against two different bacterial strains (Staphylococcus aureus and Escherichia coli) were determined. Several compounds showed potent inhibitory activity against the bacterial strains evaluated, with minimal inhibitory concentration (MIC) values in the low µg/mL range. Of all guanidine derivatives, 3-[2-chloro-3-(trifluoromethyl)]-benzyloxy derivative 9m showed the best potency with MICs of 0.5 µg/mL (S. aureus) and 1 µg/mL (E. coli), respectively. Several aminoguanidine hydrazone derivatives also showed good overall activity. Compounds 10a, 10j, and 10r–s displayed MICs of 4 µg/mL against both S. aureus and E. coli. In the aminoguanidine hydrazone series, 3-(4-trifluoromethyl)-benzyloxy derivative 10d showed the best potency against S. aureus (MIC 1 µg/mL) but was far less active against E. coli (MIC 16 µg/mL). Compound 9m and the para-substituted derivative 9v also showed promising results against two strains of methicillin-resistant Staphylococcus aureus (MRSA). These results provide new and potent structural leads for further antibiotic optimisation strategies.

Multidisciplinary Digital Publishing Institute

OPUS