21 research outputs found
CMGAN: Conformer-Based Metric-GAN for Monaural Speech Enhancement
Convolution-augmented transformers (Conformers) have recently been proposed for
various speech-domain applications, such as automatic speech recognition (ASR)
and speech separation, as they can capture both local and global dependencies.
In this paper, we propose a conformer-based metric generative adversarial
network (CMGAN) for speech enhancement (SE) in the time-frequency (TF) domain.
The generator encodes the magnitude and complex spectrogram information using
two-stage conformer blocks to model both time and frequency dependencies. The
decoder then decouples the estimation into a magnitude mask decoder branch to
filter out unwanted distortions and a complex refinement branch to further
improve the magnitude estimation and implicitly enhance the phase information.
Additionally, we include a metric discriminator to alleviate metric mismatch by
optimizing the generator with respect to a corresponding evaluation score.
Objective and subjective evaluations illustrate that CMGAN is able to show
superior performance compared to state-of-the-art methods in three speech
enhancement tasks (denoising, dereverberation and super-resolution). For
instance, quantitative denoising analysis on the Voice Bank+DEMAND dataset
indicates that CMGAN outperforms various previous models by a margin, achieving
a PESQ of 3.41 and an SSNR of 11.10 dB.
Comment: 16 pages, 10 figures and 5 tables. arXiv admin note: text overlap
with arXiv:2203.1514
CMGAN: Conformer-based Metric GAN for Speech Enhancement
Recently, the convolution-augmented transformer (Conformer) has achieved
promising performance in automatic speech recognition (ASR) and time-domain
speech enhancement (SE), as it can capture both local and global dependencies
in the speech signal. In this paper, we propose a conformer-based metric
generative adversarial network (CMGAN) for SE in the time-frequency (TF)
domain. In the generator, we utilize two-stage conformer blocks to aggregate
all magnitude and complex spectrogram information by modeling both time and
frequency dependencies. The estimation of magnitude and complex spectrogram is
decoupled in the decoder stage and then jointly incorporated to reconstruct the
enhanced speech. In addition, a metric discriminator is employed to further
improve the quality of the enhanced estimated speech by optimizing the
generator with respect to a corresponding evaluation score. Quantitative
analysis on the Voice Bank+DEMAND dataset shows that CMGAN outperforms
various previous models by a margin, achieving a PESQ of 3.41 and an
SSNR of 11.10 dB.
Comment: 5 pages, 1 figure, 2 tables, submitted to INTERSPEECH 202
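The decoupled decoder described in the CMGAN abstracts can be illustrated with a minimal sketch for a single time-frequency bin. It assumes that the magnitude mask scales the noisy magnitude, that the noisy phase is reused, and that the complex refinement is added on top; the abstracts only say the two estimates are "jointly incorporated", so the additive combination and the function name `combine_branches` are assumptions made for illustration.

```python
import cmath


def combine_branches(noisy_bin, mask, refinement):
    """Combine a magnitude mask and a complex refinement for one TF bin.

    noisy_bin  : complex STFT value of the noisy speech
    mask       : non-negative real gain from the mask branch (assumed)
    refinement : complex correction from the refinement branch (assumed)
    """
    # Mask branch: scale the noisy magnitude, keep the noisy phase.
    masked_magnitude = mask * abs(noisy_bin)
    noisy_phase = cmath.phase(noisy_bin)
    masked_bin = cmath.rect(masked_magnitude, noisy_phase)
    # Refinement branch: the complex term adjusts both magnitude and phase.
    return masked_bin + refinement


# Hypothetical values for one bin, chosen only to show the arithmetic.
enhanced_bin = combine_branches(3 + 4j, 0.5, 0.1 - 0.2j)
```

With a mask of 1 and a zero refinement the function returns the noisy bin unchanged, which is a quick sanity check on the combination.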
Mitogenome characterization and diversity of the Nangqian grey yak (Bos grunniens)
Nangqian grey yak (Bos grunniens) is a unique yak population in Qinghai Province, China.
In this study, the whole mitogenomes of 18 Nangqian grey yaks were sequenced
using next-generation sequencing (NGS) technology and annotated. The total
length of the whole mitogenome sequence is between 16,323 bp and 16,325 bp, including
a non-coding control region (D-loop region), 22 tRNA genes, 13 protein-coding genes
and two rRNA genes (12S rRNA and 16S rRNA). Maternal genetic diversity based on the
mitogenome variations was analyzed. A total of 12 haplotypes were identified among
the 18 complete mitogenome sequences; the haplotype diversity and nucleotide diversity
of Nangqian grey yak were 0.948±0.033 and 0.001±0.001, respectively. Compared with
the wild yak population and six other domestic yak breeds/populations in China, the
haplotype diversity of Nangqian grey yak population was higher, indicating abundant
maternal genetic diversity in Nangqian grey yak. The phylogenetic tree showed that
Nangqian grey yak was most closely related to Tibet alpine, Xueduo, Changtai, Sibu,
Zhongdian, Tianzhu white, Ashdan, Jinchuan, Jiulong, Pamir, Pali, Qinghai plateau,
Huanhu, Datong, Bazhou and wild yak breeds/populations, was also close to the Chawula,
Muli, Gannan, Niangya and Yushu yak breeds, but was distant from other yak breeds
(i.e., Leiwuqi and Maiwa yak).
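For readers unfamiliar with the statistic, haplotype diversity is commonly computed with Nei's estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not state which estimator was used or how the 18 sequences are distributed over the 12 haplotypes, so the sketch below uses a hypothetical split that is not taken from the paper.

```python
def haplotype_diversity(counts):
    """Nei's haplotype diversity for a list of per-haplotype sample counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sequences are required")
    # Sum of squared haplotype frequencies.
    sum_sq_freq = sum((c / n) ** 2 for c in counts)
    # The n / (n - 1) factor corrects for small sample size.
    return n / (n - 1) * (1.0 - sum_sq_freq)


# Hypothetical split of 18 sequences over 12 haplotypes (NOT from the paper).
hypothetical_counts = [3, 3, 2, 2] + [1] * 8
h = haplotype_diversity(hypothetical_counts)
```

This particular split happens to round to the reported 0.948, but the abstract does not give the actual distribution, and other splits of 18 sequences over 12 haplotypes give different values.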
Investigation of Very Large Eddy Simulation Method for Applications of Supersonic Turbulent Combustion
The very large eddy simulation (VLES) method was investigated for supersonic reacting flows in the present work. The advantages and characteristics of the VLES model and the widely used improved delayed detached eddy simulation (IDDES) method were revealed through a supersonic ramped-cavity cold flow. Compared to the IDDES model, the VLES model transformed from RANS mode to LES mode faster, resulting in a smaller gray region caused by the mode transition. However, the original volume-averaging truncation length scale could lead to poor predictions of the velocity profiles and wall pressure distribution. By introducing a hybrid truncation length scale combining the maximum grid length and the shear layer adaptive (SLA) length with different coefficients, the accuracy of the VLES method was significantly improved, and the issue of the low shear layer position was solved. Moreover, owing to the resolution control function, the VLES method could adaptively model more turbulent kinetic energy and maintain good accuracy on a coarser mesh. Finally, the modified VLES method was applied in conjunction with a hybrid combustion model constructed from the partially stirred reactor (PaSR) model and the Ingenito supersonic combustion model (ISCM) in simulations of the supersonic flame in the DLR scramjet combustor. After introducing the correction of the molecular collision frequency by the ISCM, the results obtained by the hybrid combustion model were more consistent with the experimental results, especially for the time-averaged temperature profile in the ignition zone.
Power Balance Strategies in Steady-State Simulation of the Micro Gas Turbine Engine by Component-Coupled 3D CFD Method
Currently, an increasing number of designers have begun to pay attention to a new paradigm for evaluating performance with full-engine 3-dimensional computational fluid dynamics (3D CFD) simulations. Compared with the traditional component-based performance simulation method (‘component-matched’ method), this novel ‘component-coupled’ method can evaluate the overall performance of the engine more physically and obtain more detailed flow field parameters simultaneously. Importantly, the power balance iteration should be introduced to the novel method to satisfy the constraints of the coaxial components for the gas turbine engine at steady state. By carrying out the ‘component-matched’ simulation and the ‘component-coupled’ simulation for a micro turbojet engine, the necessity of introducing the power balance iteration was discussed in this paper. The influence of steady-state co-working constraints on the engine performance was analysed and strategies for power balance iteration were proposed. To verify the capability and feasibility of this method, not only the co-working state but also the windmill state of the gas turbine engine were simulated by using the 3D CFD method considering power balance iteration. The results show that the power balance strategy proposed in this paper can converge the aerodynamic parameters as well as the power residual in a robust way.
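The idea of a power balance iteration can be pictured with a toy root-finding loop: adjust the shaft speed until the power delivered by the turbine matches the power absorbed by the compressor on the same shaft. The sketch below is only an illustration under strong assumptions. The two power curves are invented closed-form stand-ins (in the component-coupled method each evaluation would be a 3D CFD solution), and bisection on shaft speed is an assumed strategy, not the one proposed in the paper.

```python
def compressor_power(speed_rpm):
    """Toy stand-in for compressor power in watts (hypothetical curve)."""
    return 2.0e-10 * speed_rpm ** 3


def turbine_power(speed_rpm):
    """Toy stand-in for turbine power in watts (hypothetical curve)."""
    return 2.0e-5 * speed_rpm ** 2


def power_residual(speed_rpm):
    """Power imbalance on the shared shaft; zero at steady state."""
    return turbine_power(speed_rpm) - compressor_power(speed_rpm)


def balance_shaft_speed(low, high, tol=1e-6, max_iter=200):
    """Bisect on shaft speed until the power residual vanishes."""
    res_low = power_residual(low)
    if res_low * power_residual(high) > 0:
        raise ValueError("residual must change sign on [low, high]")
    for _ in range(max_iter):
        mid = 0.5 * (low + high)
        res_mid = power_residual(mid)
        if abs(res_mid) < tol or (high - low) < tol:
            return mid
        if res_low * res_mid < 0:
            high = mid  # root lies in the lower half
        else:
            low, res_low = mid, res_mid  # root lies in the upper half
    return 0.5 * (low + high)


balanced_speed = balance_shaft_speed(50_000.0, 150_000.0)
```

For these invented curves the residual vanishes where the two expressions are equal, at 100,000 rpm. Because each residual evaluation would cost a full CFD run in practice, the number of iterations matters far more there than in this toy.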
Liquid Metal Composites-Enabled Real-Time Hand Gesture Recognizer with Superior Recognition Speed and Accuracy.
Publication status: Published
Prosthetic hands play a vital role in restoring forearm functionality for patients who have suffered hand loss or deformity. The hand gesture intention recognition system serves as a critical component within the prosthetic hand system. However, accurately and swiftly identifying hand gesture intentions remains a challenge in existing approaches. Here, a real-time motion intention recognition system utilizing liquid metal composite sensor bracelets is proposed. The sensor bracelet detects pressure signals generated by forearm muscle movements to recognize hand gesture intent. Leveraging the remarkable pressure sensitivity of liquid metal composites and the efficient classifier based on the optimized recognition algorithm, this system achieves an average offline and real-time recognition accuracy of 98.2% and 92.04%, respectively, with an average recognition speed of 0.364 s. Thus, this wearable system shows advantages in superior recognition speed and accuracy. Furthermore, this system finds applications in master-slave control of prosthetic hands in unmanned scenarios, such as electrically powered operations, space exploration, and telemedicine. The proposed system promises significant advances in next-generation intent-controlled prosthetic hands and robots.
High-Altitude Stress Orchestrates mRNA Expression and Alternative Splicing of Ovarian Follicle Development Genes in Tibetan Sheep
High-altitude stress threatens the survival rate of Tibetan sheep and reduces their fertility. However, the molecular basis of this phenomenon remains elusive. Here, we used RNA-seq to elucidate the transcriptome dynamics of high-altitude stress in Tibetan sheep ovaries. In total, 104 genes were characterized as high-altitude stress-related differentially expressed genes (DEGs). In addition, 36 DEGs contributed to ovarian follicle development, and 28 of them were downregulated under high-altitude stress. In particular, high-altitude stress significantly suppressed the expression of two ovarian lymphatic system marker genes: LYVE1 and ADAMTS-1. Network analysis revealed that luteinizing hormone (LH)/follicle-stimulating hormone (FSH) signaling-related genes, such as EGR1, FKBP5, DUSP1, and FOS, were central regulators in the DEG network, and these genes were also suppressed under high-altitude stress. As a post-transcriptional regulation mechanism, alternative splicing (AS) is ubiquitous in Tibetan sheep. High-altitude stress induced 917 differential alternative splicing (DAS) events. High-altitude stress modulated DAS in an AS-type-specific manner: suppressing skipped exon events but increasing retained intron events. C2H2-type zinc finger transcription factors and RNA processing factors were mainly enriched in DAS. These findings revealed that high-altitude stress repressed ovarian development by suppressing the expression of LH/FSH signaling genes and inducing intron retention of C2H2-type zinc finger transcription factors.