97 research outputs found

    Contextual Biasing of Named-Entities with Large Language Models

    This paper studies contextual biasing with Large Language Models (LLMs), where additional contextual information is provided to an LLM during second-pass rescoring to boost Automatic Speech Recognition (ASR) performance. We propose to leverage prompts for an LLM, without fine-tuning, during rescoring; the prompts incorporate a biasing list and few-shot examples that serve as additional information when calculating the score for a hypothesis. In addition to few-shot prompt learning, we propose multi-task training of the LLM to predict both the entity class and the next token. To improve the efficiency of contextual biasing and to avoid exceeding LLMs' maximum sequence lengths, we propose dynamic prompting, where we select the most likely class using the class tag prediction and use only entities in this class as context for next-token prediction. Word Error Rate (WER) evaluation is performed on i) an internal calling, messaging, and dictation dataset, and ii) the SLUE-Voxpopuli dataset. Results indicate that biasing lists and few-shot examples can achieve 17.8% and 9.6% relative improvement compared to first pass ASR, and that multi-task training and dynamic prompting can achieve 20.0% and 11.3% relative WER improvement, respectively. Comment: 5 pages, 4 figures. Conference: ICASSP 202

    Co-contributorship Network and Division of Labor in Individual Scientific Collaborations

    Collaborations are pervasive in current science and have been studied and encouraged in many disciplines. However, little is known about how a team really functions in terms of its detailed internal division of labor. In this research, we investigate the patterns of scientific collaboration and division of labor within individual scholarly articles by analyzing their co-contributorship networks. Co-contributorship networks are constructed by performing the one-mode projection of the author-task bipartite networks obtained from 138,787 papers published in PLoS journals. Given a paper, we define three types of contributors: specialists, team-players, and versatiles. Specialists are those who contribute to all their tasks alone; team-players are those who contribute to every task with other collaborators; and versatiles are those who do both. We find that team-players are the majority and that, as expected, they tend to contribute to the five most common tasks, such as "data analysis" and "performing experiments". Specialists and versatiles are more prevalent than expected under a random-graph null model. Versatiles tend to be senior authors associated with funding and supervision. Specialists are associated with two contrasting roles: the supervising role of team leader, or marginal and specialized contributions. Comment: accepted by JASIS

    Recovering from Privacy-Preserving Masking with Large Language Models

    Model adaptation is crucial to handle the discrepancy between proxy training data and the actual user data received. To perform adaptation effectively, users' textual data is typically stored on servers or on their local devices, where downstream natural language processing (NLP) models can be trained directly on such in-domain data. However, this might raise privacy and security concerns due to the extra risk of exposing user information to adversaries. Replacing identifying information in textual data with a generic marker has recently been explored. In this work, we leverage large language models (LLMs) to suggest substitutes for masked tokens and evaluate their effectiveness on downstream language modeling tasks. Specifically, we propose multiple pre-trained and fine-tuned LLM-based approaches and perform empirical studies on various datasets to compare these methods. Experimental results show that models trained on the obfuscation corpora are able to achieve performance comparable to that of models trained on the original data without privacy-preserving token masking. Comment: Submitted to ICASS

    Breastmilk microbiome changes associated with lactational mastitis and treatment with dandelion extract

    Introduction: Dandelion (Pugongying) is one of the most frequently used Chinese herbs for treating lactational mastitis (LM). Pugongying granules, a patented medication primarily comprised of dandelion extract, have been approved by CFDA for LM treatment in China. The aims of this study were to investigate the etiology of LM and the mechanism by which Pugongying granules decrease LM symptoms, with a particular focus on the microbial communities found in breastmilk. Methods: Participants were recruited from a previously performed randomized controlled trial (Identifier: NCT03756324, ClinicalTrials.gov). Between 2019 and 2020, women diagnosed with unilateral LM at the Beijing University of Chinese Medicine Third Affiliated Hospital were enrolled. In total, 42 paired breastmilk samples from the healthy and affected breasts of the participants were collected. Additionally, 37 paired pre- and post-treatment breastmilk samples from the affected breast were collected from women who received a 3-day course of either Pugongying granules (20 women) or cefdinir (17 women). Clinical outcomes [e.g., body temperature, visual analogue scale (VAS) score for breast pain, the percentage of neutrophils (NE%)] were analyzed pre- and post-treatment, and the breastmilk samples were subjected to 16S rRNA gene sequencing to analyze the alpha and beta diversities and identify significant bacteria. Finally, the relationship between microorganisms and clinical outcomes was analyzed. Results: There was no significant difference in fever and pain between the Pugongying group and cefdinir group. The most prevalent bacterial genera in breastmilk were Streptococcus and Staphylococcus. Compared to healthy breastmilk, microbial diversity was reduced in affected breastmilk, and there was a higher relative abundance of Streptococcus. After Pugongying treatment, there was an increase in microbial diversity with significantly higher abundance of Corynebacterium. A negative correlation was found between Corynebacterium, VAS score, and NE%. Treatment with cefdinir did not affect microbial diversity. Taken together, our results show a correlation between LM and reduced microbial diversity, as well as an increased abundance of Streptococcus in affected breastmilk. Conclusion: Pugongying granules enhanced microbial diversity in breastmilk samples. Given the substantial variation in individual microbiomes, identifying specific species of Streptococcus and Corynebacterium associated with LM may provide additional insight into LM pathogenesis and treatment.

    Single-cell-resolution transcriptome map revealed novel genes involved in testicular germ cell progression and somatic cells specification in Chinese tongue sole with sex reversal

    19 pages, 7 figures, supporting information: https://doi.org/10.1007/s11427-021-2236-4. Data availability: The data reported in this study are available in the CNGB Nucleotide Sequence Archive (CNSA: https://db.cngb.org/cnsa; accession number CNP0002135). Female-to-male sex reversals (pseudomales) are common in lower vertebrates and have been found in natural populations, which is a concern under rapid changes in environmental conditions. Pseudomales can exhibit altered spermatogenesis. However, the regulatory mechanisms underlying pseudomale spermatogenesis remain unclear. Here, we characterized spermatogenesis in Chinese tongue sole (Cynoglossus semilaevis), a species with genetic and environmental sex determination, based on a high-resolution single-cell RNA-seq atlas of cells derived from the testes of genotypic males and pseudomales. We identified five germ cell types and six somatic cell types and obtained a single-cell atlas of dynamic changes in gene expression during spermatogenesis in Chinese tongue sole, including alterations in pseudomales. We detected decreased levels of Ca2+ signaling pathway-related genes in spermatogonia, insufficient meiotic initiation in spermatocytes, and a malfunction of somatic niche cells in pseudomales. However, a cluster of CaSR genes and MAPK signaling factors were upregulated in undifferentiated spermatogonia of pseudomales. Additionally, we revealed that Z chromosome-specific genes, such as piwil2, dhx37, and ehmt1, were important for spermatogenesis. These results improve our understanding of reproduction after female-to-male sex-reversal and provide new insights into the adaptability of reproductive strategies in lower vertebrates. This work was supported by the National Key R&D Program of China (2018YFD0900301), the National Nature Science Foundation of China (31722058, 31802275, 31472269), the AoShan Talents Cultivation Program Supported by Qingdao National Laboratory for Marine Science and Technology (2017ASTCP-ES06), the Taishan Scholar Project Fund of Shandong of China to C.S., the National Ten-Thousands Talents Special Support Program to C.S., the Central Public-interest Scientific Institution Basal Research Fund, CAFS (2020TD19) and the China Agriculture Research System (CARS-47-G03). With the institutional support of the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S). Peer reviewed

    Making Sense of Institutional Change in China: The Cultural Dimension of Economic Growth and Modernization
