
    Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation

    Audio-visual speech enhancement (AV-SE) aims to enhance degraded speech along with extra visual information such as lip videos, and has been shown to be more effective than audio-only speech enhancement. This paper proposes further incorporating ultrasound tongue images to improve lip-based AV-SE systems' performance. Knowledge distillation is employed at the training stage to address the challenge of acquiring ultrasound tongue images during inference, enabling an audio-lip speech enhancement student model to learn from a pre-trained audio-lip-tongue speech enhancement teacher model. Experimental results demonstrate significant improvements in the quality and intelligibility of the speech enhanced by the proposed method compared to the traditional audio-lip speech enhancement baselines. Further analysis using phone error rates (PER) of automatic speech recognition (ASR) shows that palatal and velar consonants benefit most from the introduction of ultrasound tongue images. Comment: To be published in InterSpeech 202

    Ethyl 5-methyl-4-oxo-3-phenyl-2-propylamino-3,4-dihydrothieno[2,3-d]pyrimidine-6-carboxylate

    The title compound, C19H21N3O3S, was synthesized via the aza-Wittig reaction of functionalized iminophosphorane with phenyl isocyanate under mild conditions. In the molecule, the fused thienopyrimidine ring system is essentially planar, with a maximum deviation of 0.072 (2) Å, and makes a dihedral angle of 60.11 (9)° with the phenyl ring. An intramolecular C—H⋯O hydrogen bond is present. The crystal packing is stabilized by intermolecular N—H⋯O and C—H⋯O hydrogen bonds.

    Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement

    Audio-visual speech enhancement (AV-SE) aims to enhance degraded speech along with extra visual information such as lip videos, and has been shown to be more effective than audio-only speech enhancement. This paper proposes the incorporation of ultrasound tongue images to further improve the performance of lip-based AV-SE systems. To address the challenge of acquiring ultrasound tongue images during inference, we first propose to employ knowledge distillation during training to investigate the feasibility of leveraging tongue-related information without directly inputting ultrasound tongue images. Specifically, we guide an audio-lip speech enhancement student model to learn from a pre-trained audio-lip-tongue speech enhancement teacher model, thus transferring tongue-related knowledge. To better model the alignment between the lip and tongue modalities, we further propose the introduction of a lip-tongue key-value memory network into the AV-SE model. This network enables the retrieval of tongue features based on readily available lip features, thereby assisting the subsequent speech enhancement task. Experimental results demonstrate that both methods significantly improve the quality and intelligibility of the enhanced speech compared to traditional lip-based AV-SE baselines. Moreover, both proposed methods exhibit strong generalization performance on unseen speakers and in the presence of unseen noises. Furthermore, phone error rate (PER) analysis of automatic speech recognition (ASR) reveals that while all phonemes benefit from introducing ultrasound tongue images, palatal and velar consonants benefit most. Comment: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing. arXiv admin note: text overlap with arXiv:2305.1493

    3-Isopropyl-2-(4-methoxyphenoxy)-1-benzofuro[3,2-d]pyrimidin-4(3H)-one

    In the title compound, C20H18N2O4, all non-H atoms of the three fused rings of the benzofuro[3,2-d]pyrimidine system are almost coplanar (r.m.s. deviation 0.021 Å). The dihedral angle between the fused ring system and the benzene ring is 1.47 (12)°. Intramolecular and intermolecular C—H⋯O hydrogen bonds together with weak C—H⋯π interactions stabilize the structure.

    Ethyl 2-isopropylamino-5-methyl-4-oxo-3-phenyl-3,4-dihydrothieno[2,3-d]pyrimidine-6-carboxylate

    The title compound, C19H21N3O3S, was synthesized via an aza-Wittig reaction of a functionalized iminophosphorane with phenyl isocyanate under mild conditions. In the molecule, the fused thienopyrimidine ring system makes a dihedral angle of 66.30 (11)° with the phenyl ring. An intramolecular C—H⋯O hydrogen bond occurs. The terminal –OCH2CH3 group is disordered over two sites with refined occupancies of 0.537 (13) and 0.463 (13). The crystal packing is stabilized by intermolecular C—H⋯O and N—H⋯O hydrogen bonds.

    Ethyl 1-(6-chloro-3-pyridylmethyl)-5-ethoxymethyleneamino-1H-1,2,3-triazole-4-carboxylate

    In the title compound, C14H16ClN5O3, there is evidence for significant electron delocalization in the triazolyl system. Intramolecular C—H⋯O and intermolecular C—H⋯O and C—H⋯N hydrogen bonds stabilize the structure.

    Ethyl 5-methyl-4-oxo-3-phenyl-2-propylamino-3,4-dihydrothieno[2,3- d


    Diagnosis and microecological characteristics of aerobic vaginitis in outpatients based on preformed enzymes

    Objective: Aerobic vaginitis (AV) is a recently proposed term for genital tract infection in women. The diagnosis of AV is mainly based on descriptive diagnostic criteria proposed by Donders and co-workers. The objective of this study is to report AV prevalence in southwest China using an objective assay kit based on preformed enzymes and also to determine its characteristics. Materials and methods: A total of 1948 outpatients were enrolled and tested by a commercial diagnostic kit to investigate the AV prevalence and characteristics in southwestern China. The study mainly examined the vaginal ecosystem, age distribution, Lactobacillus amount, and changes in pH. Differences within groups were analyzed by Wilcoxon two-sample test. Results: The AV detection rate is 15.40%. The AV patients were usually seen in the sexually active age group of 20–30 years, followed by those in the age group of 30–40 years. The vaginal ecosystems of all the patients studied were absolutely abnormal, and diagnosed to have a combined infection [aerobic vaginitis (AV) + bacterial vaginitis (BV) 61.33%; 184/300]. Aerobic bacteria, especially Staphylococcus aureus and Escherichia coli, were predominantly found in the vaginal samples of these women. Conclusion: AV is a common type of genital infection in southwestern China and is characterized by sexually active age and combined infection predominated by the AV and BV type.

    Ethyl 1-[(2-chloro-1,3-thiazol-5-yl)methyl]-5-methyl-1H-1,2,3-triazole-4-carboxylate

    In the title compound, C10H11ClN4O2S, the triazole ring carries methyl and ethoxycarbonyl groups and is bound via a methylene bridge to a chlorothiazole unit. There is also evidence for significant electron delocalization in the triazolyl system. Intra- and intermolecular C—H⋯O hydrogen bonds together with strong π–π stacking interactions [centroid–centroid distance 3.620 (1) Å] stabilize the structure.