24 research outputs found

    Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis

    The spontaneous behavior that often occurs in conversations makes speech more human-like than reading-style speech. However, synthesizing spontaneous-style speech is challenging due to the lack of high-quality spontaneous datasets and the high cost of labeling spontaneous behavior. In this paper, we propose a semi-supervised pre-training method to increase the amount of spontaneous-style speech and spontaneous behavior labels. During semi-supervised learning, both text and speech information are used to detect spontaneous behavior labels in speech. Moreover, a linguistic-aware encoder is used to model the relationship between the sentences in a conversation. Experimental results indicate that our proposed method achieves superior expressive speech synthesis performance, with the ability to model spontaneous behavior in spontaneous-style speech and to predict reasonable spontaneous behavior from text. Comment: Accepted by INTERSPEECH 202

    CB-Conformer: Contextual biasing Conformer for biased word recognition

    Due to the mismatch between the source and target domains, how to better utilize biased word information to improve the performance of an automatic speech recognition model in the target domain has become a hot research topic. Previous approaches either decode with a fixed external language model or introduce a sizeable biasing module, which leads to poor adaptability and slow inference. In this work, we propose CB-Conformer, which improves biased word recognition by adding a Contextual Biasing Module and a Self-Adaptive Language Model to the vanilla Conformer. The Contextual Biasing Module combines audio fragments and contextual information, with only 0.2% of the model parameters of the original Conformer. The Self-Adaptive Language Model modifies the internal weights of biased words based on their recall and precision, resulting in a greater focus on biased words and more successful integration with the automatic speech recognition model than the standard fixed language model. In addition, we construct and release an open-source Mandarin biased-word dataset based on WenetSpeech. Experiments indicate that our proposed method brings a 15.34% character error rate reduction, a 14.13% biased word recall increase, and a 6.80% biased word F1-score increase compared with the base Conformer.

    Multispectral Imaging for Microchip Electrophoresis Enables Point-of-Care Newborn Hemoglobin Variant Screening

    Hemoglobin (Hb) disorders affect nearly 7% of the world's population. Globally, around 400,000 babies are born annually with sickle cell disease (SCD), primarily in sub-Saharan Africa, where morbidity and mortality rates are high. Screening, early diagnosis, and monitoring are not widely accessible due to technical challenges and cost. We hypothesized that multispectral imaging would allow sensitive hemoglobin variant identification in existing affordable paper-based Hb electrophoresis. To test this hypothesis, we developed the first integrated point-of-care multispectral Hb variant test: Gazelle-Multispectral. Here, we evaluated the accuracy of Gazelle-Multispectral for Hb variant newborn screening in 265 newborns with known hemoglobin variants, including hemoglobin A (Hb A), hemoglobin F (Hb F), hemoglobin S (Hb S), and hemoglobin C (Hb C). The levels of Hb A, Hb F, Hb S, and Hb C/E/A2 detected by Gazelle-Multispectral correlated highly with the results reported by the laboratory gold standard, high-performance liquid chromatography (HPLC), with Pearson correlation coefficients of 0.97, 0.97, 0.93, and 0.95, respectively. Gazelle-Multispectral demonstrated an accuracy of 96.8% in subjects of 0–3 days, and 96.9% in newborns. The ability to obtain accurate results on newborn samples suggests that Gazelle-Multispectral can be suitable for large-scale newborn screening and for diagnosis of SCD in low-resource settings.

    UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons

    Automatic co-speech gesture generation draws much attention in computer animation. Previous works designed network structures on individual datasets, which resulted in a lack of data volume and generalizability across different motion capture standards. In addition, it is a challenging task due to the weak correlation between speech and gestures. To address these problems, we present UnifiedGesture, a novel diffusion model-based speech-driven gesture synthesis approach, trained on multiple gesture datasets with different skeletons. Specifically, we first present a retargeting network to learn latent homeomorphic graphs for different motion capture standards, unifying the representations of various gestures while extending the dataset. We then capture the correlation between speech and gestures based on a diffusion model architecture using cross-local attention and self-attention to generate better speech-matched and realistic gestures. To further align speech and gesture and increase diversity, we incorporate reinforcement learning on the discrete gesture units with a learned reward function. Extensive experiments show that UnifiedGesture outperforms recent approaches on speech-driven gesture generation in terms of CCA, FGD, and human-likeness. All code, pre-trained models, databases, and demos are available to the public at https://github.com/YoungSeng/UnifiedGesture. Comment: 16 pages, 11 figures, ACM MM 202

    Risk factors associated with amyotrophic lateral sclerosis based on the observational study: a systematic review and meta-analysis

    Objective: Amyotrophic lateral sclerosis (ALS) is a fatal neurodegenerative disorder affecting the upper and lower motor neurons. Though the pathogenesis of ALS is still unclear, exploring the associations between risk factors and ALS can provide reliable evidence to find the pathogenesis. This meta-analysis aims to synthesize all related risk factors of ALS to understand this disease comprehensively. Methods: We searched the following databases: PubMed, EMBASE, Cochrane Library, Web of Science, and Scopus. Observational studies, including cohort studies and case-control studies, were included in this meta-analysis. Results: A total of 36 eligible observational studies were included; 10 of them were cohort studies and the rest were case-control studies. We found six factors that exacerbated the progression of disease: head trauma (OR = 1.26, 95% CI = 1.13, 1.40), physical activity (OR = 1.06, 95% CI = 1.04, 1.09), electric shock (OR = 2.72, 95% CI = 1.62, 4.56), military service (OR = 1.34, 95% CI = 1.11, 1.61), pesticides (OR = 1.96, 95% CI = 1.7, 2.26), and lead exposure (OR = 2.31, 95% CI = 1.44, 3.71). Of note, type 2 diabetes mellitus was a protective factor for ALS. However, cerebrovascular disease (OR = 0.99, 95% CI = 0.75, 1.29), agriculture (OR = 1.22, 95% CI = 0.74, 1.99), industry (OR = 1.24, 95% CI = 0.81, 1.91), service (OR = 0.47, 95% CI = 0.19, 1.17), smoking (OR = 1.25, 95% CI = 0.5, 3.09), chemicals (OR = 2.45, 95% CI = 0.89, 6.77), and heavy metal (OR = 1.5, 95% CI = 0.47, 4.84) were not risk factors for ALS based on meta-analyses. Conclusions: Head trauma, physical activity, electric shock, military service, pesticides, and lead were risk factors for ALS onset and progression, whereas DM was a protective factor. This finding provides a better understanding of ALS risk factors with strong evidence for clinicians to rationalize clinical intervention strategies. INPLASY registration number: https://inplasy.com/inplasy-2022-9-0118/, INPLASY202290118
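
    The split between risk factors and non-factors in the abstract above follows the usual reading of an odds ratio: an association counts as statistically significant at the 5% level when its 95% confidence interval excludes 1. The sketch below applies that rule to the values quoted in the abstract. The function names are illustrative, and the standard-error back-calculation assumes a symmetric Wald interval on the log scale (z = 1.96), which the abstract does not state.

```python
import math

# (odds ratio, CI lower bound, CI upper bound), copied from the abstract above
reported = {
    "head trauma": (1.26, 1.13, 1.40),
    "physical activity": (1.06, 1.04, 1.09),
    "electric shock": (2.72, 1.62, 4.56),
    "military service": (1.34, 1.11, 1.61),
    "pesticides": (1.96, 1.7, 2.26),
    "lead exposure": (2.31, 1.44, 3.71),
    "cerebrovascular disease": (0.99, 0.75, 1.29),
    "agriculture": (1.22, 0.74, 1.99),
    "industry": (1.24, 0.81, 1.91),
    "service": (0.47, 0.19, 1.17),
    "smoking": (1.25, 0.5, 3.09),
    "chemicals": (2.45, 0.89, 6.77),
    "heavy metal": (1.5, 0.47, 4.84),
}


def ci_excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below OR = 1."""
    return lower > 1.0 or upper < 1.0


def log_or_standard_error(lower, upper, z=1.96):
    """Approximate SE of log(OR), assuming a symmetric Wald interval (an assumption)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Factors whose interval excludes 1
significant = sorted(
    name for name, (_, lower, upper) in reported.items() if ci_excludes_one(lower, upper)
)

for name, (odds_ratio, lower, upper) in reported.items():
    label = "CI excludes 1" if ci_excludes_one(lower, upper) else "CI includes 1"
    se = log_or_standard_error(lower, upper)
    print(f"{name:25s} OR={odds_ratio:<5} SE(log OR)={se:.3f}  {label}")
```

    Under this rule, exactly the six factors the abstract names as risk factors have intervals that exclude 1; large point estimates with wide intervals, such as chemicals (OR = 2.45), do not.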

    Robust estimation of bacterial cell count from optical density

    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data
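
    As a rough illustration of the kind of calibration the abstract above recommends, the sketch below fits a particles-per-OD conversion factor from a serial dilution and reports fold residuals. Everything numeric here is hypothetical: the stock particle count, the dilution factor, the OD readings, and the choice of a least-squares fit through the origin are assumptions made for illustration, not the study's protocol or data.

```python
# Hypothetical 2-fold serial dilution of a microsphere stock (illustrative numbers only)
STOCK_PARTICLES = 3.0e8
particles = [STOCK_PARTICLES / 2**i for i in range(6)]

# Hypothetical blank-subtracted OD readings for the same wells
od_readings = [0.305, 0.149, 0.076, 0.037, 0.019, 0.0094]


def fit_through_origin(x, y):
    """Least-squares slope k for the model y = k * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def fold_residual(observed, predicted):
    """Symmetric fold error: 1.0 means a perfect match, 1.2 means 20% off either way."""
    return max(observed / predicted, predicted / observed)


od_per_particle = fit_through_origin(particles, od_readings)
particles_per_od = 1.0 / od_per_particle

folds = [
    fold_residual(od, od_per_particle * n) for od, n in zip(od_readings, particles)
]

print(f"estimated particles per OD unit: {particles_per_od:.3e}")
print("fold residuals:", [round(f, 3) for f in folds])


def od_to_count(od):
    """Convert a blank-subtracted OD reading to an estimated particle (cell) count."""
    return od * particles_per_od
```

    Fold residuals are used here because the abstract reports calibration precision in those terms (residuals below 1.2-fold). A real calibration would also need to restrict the fit to the instrument's effective linear range.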

    Comprehensive Comparisons between Grafted Kynam Agarwood and Normal Agarwood on Traits, Composition, and In Vitro Activation of AMPK

    Agarwood, a highly valuable resin/wood combination with diverse pharmacological activities but scarce supply, has a long history of being used as a medicine in several medical systems. Grafted Kynam agarwood (GKA) has recently been cultivated successfully and has qualities meeting the definition of premium Kynam agarwood. However, there are few comprehensive comparisons between GKA and normal agarwood in terms of traits, global composition, and activity, and some key issues for GKA to be adopted into the traditional Chinese medical (TCM) system have not been elaborated. The two types of agarwood samples were evaluated in terms of trait characteristics, physicochemical indicators, key component groups, and global compositional profile. Furthermore, molecular docking was performed to investigate the active ingredients. In vitro activity assays were performed to evaluate the activation of adenosine 5’-monophosphate (AMP)-activated protein kinase (AMPK) by GKA and normal agarwood. The results revealed that, overall, the traits, microscopic characteristics, chemical composition types, and bioactivity of GKA and normal agarwood were similar. The main differences were the content of resin (ethanolic extract content), the content of key component groups, and the composition of the different parent structural groups of 2-(2-phenylethyl) chromones (PECs). The total PEC content and ethanol extract content of GKA were significantly higher than those of normal agarwood. The MS-based high-throughput analysis revealed that GKA has higher concentrations of sesquiterpenes and flindersia-type 2-(2-phenylethyl) chromones (FTPECs) (m/z 250-312) than normal agarwood. Molecular docking revealed that parent structural groups of FTPECs activated multiple signaling pathways, including the AMPK pathway, suggesting that FTPECs are major active components in GKA. The aim of this paper is to describe the intrinsic reasons for GKA as a high-quality agarwood and a potential source for novel drug development. We combined high-throughput mass spectrometry and multivariate statistical analysis to infer the different components of the two types of agarwood. Then we combined virtual screening and in vitro activity to construct a component/pharmacodynamic relationship to explore the causes of the activity differences between agarwood with different levels of quality and to identify potentially valuable lead compounds. This strategy can also be used for the comprehensive study of other TCMs with different qualities.

    Characteristics of Magnetic Fields Induced by the Wake of an Underwater Vehicle

    No full text
    During navigation, underwater vehicles generate hydrodynamic wakes that cover a large area and persist for a long time. These wakes induce magnetic fields, which are of great significance for detecting and tracking underwater vehicles. Combining wake field and magnetic field simulations, this study adopts dynamic overlapping mesh technology to numerically simulate the wake magnetic field during the movement of an underwater vehicle. This paper introduces the causes of formation and laws of evolution of the wake magnetic field, analyzes its spatial distribution and time-domain changes, and discusses the time-frequency domain characteristics at different monitoring points as well as the effects of navigation speed and acceleration on wake magnetic fields. Our results indicate that the wake magnetic field of an underwater vehicle is a weak low-frequency signal in the 0–5 Hz range; as the navigation speed increases, the barycenter frequency of the wake magnetic field decreases and the half-energy bandwidth increases. An increase in the acceleration of the underwater vehicle causes a higher growth rate of the wake magnetic field. This paper provides a theoretical reference for the detection of underwater vehicles based on wake magnetic fields.