59 research outputs found

    Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

    This paper presents a streaming speaker-attributed automatic speech recognition (SA-ASR) model that can recognize "who spoke what" with low latency even when multiple people are speaking simultaneously. Our model is based on token-level serialized output training (t-SOT), which was recently proposed to transcribe multi-talker speech in a streaming fashion. To further recognize speaker identities, we propose an encoder-decoder based speaker embedding extractor that can estimate a speaker representation for each recognized token not only from non-overlapping speech but also from overlapping speech. The proposed speaker embedding, named t-vector, is extracted synchronously with the t-SOT ASR model, enabling joint execution of speaker identification (SID) or speaker diarization (SD) with the multi-talker transcription with low latency. We evaluate the proposed model on a joint task of ASR and SID/SD using the LibriSpeechMix and LibriCSS corpora. The proposed model achieves substantially better accuracy than a prior streaming model and shows comparable or sometimes even superior results to the state-of-the-art offline SA-ASR model. Comment: Submitted to Interspeech 202

    Critical role of the gut microbiota in immune responses and cancer immunotherapy

    The gut microbiota plays a critical role in the progression of human diseases, especially cancer. In recent decades, there has been accumulating evidence of the connections between the gut microbiota and cancer immunotherapy. Therefore, understanding the functional role of the gut microbiota in regulating immune responses to cancer immunotherapy is crucial for developing precision medicine. In this review, we extract insights from state-of-the-art research to decipher the complicated crosstalk among the gut microbiota, the systemic immune system, and immunotherapy in the context of cancer. Additionally, as the gut microbiota can account for immune-related adverse events, we discuss potential interventions to minimize these adverse effects and review the clinical application of five microbiota-targeted strategies that precisely increase the efficacy of cancer immunotherapy. Finally, as the gut microbiota holds promising potential as a target for precision cancer immunotherapeutics, we summarize current challenges and provide a general outlook on future directions in this field.

    Dietary patterns and the risk of tuberculosis-drug-induced liver injury: a cohort study

    Background and purpose: Nutrition is associated with tuberculosis drug-induced liver injury (TBLI). How dietary patterns relate to tuberculosis drug-induced liver injury is still unknown. The objective of this study is to explore the relation between dietary patterns and the risk of tuberculosis drug-induced liver injury. Methods: This cohort study was conducted at two hospitals in Shandong Province, China, between 2011 and 2013. A total of 605 tuberculosis patients were included in the final analysis. The blood aspartate aminotransferase or alanine aminotransferase level was monitored through the 6-month tuberculosis treatment. The semi-quantitative food frequency questionnaires were used to survey dietary intake in the second month of the tuberculosis treatment. The China Healthy Diet Index (CHDI), which was previously validated in the Chinese population, was used as an a priori dietary pattern. A posteriori dietary patterns were extracted by principal component analysis (PCA). Results: The CHDI was negatively associated with the risk of liver injury [adjusted odds ratio (aOR) per standard deviation (SD) (95% CI): 0.61 (0.40–0.94)] and liver dysfunction [aOR per SD (95% CI): 0.47 (0.35–0.64)] in the multivariate logistic model. A positive association between “Organ meat, poultry, and vegetable oil” dietary pattern scores (extracted by PCA) and the risk of liver injury [aOR (95% CI): 3.02 (1.42–6.41)] and liver dysfunction [aOR (95% CI): 1.83 (1.09–3.05)] was observed. Conclusion: In conclusion, a high CHDI score was a protective factor for tuberculosis drug-induced liver injury, while the “Organ meat, poultry, and vegetable oil” dietary pattern, which was rich in organ meat, poultry, and vegetable oil and low in vegetables, was an independent risk factor for tuberculosis drug-induced liver injury.

    WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

    Self-supervised learning (SSL) has achieved great success in speech recognition, while limited exploration has been attempted for other speech processing tasks. Because the speech signal contains multi-faceted information including speaker identity, paralinguistics, spoken content, etc., learning universal representations for all speech tasks is challenging. To tackle the problem, we propose a new pre-trained model, WavLM, to solve full-stack downstream speech tasks. WavLM jointly learns masked speech prediction and denoising in pre-training. In this way, WavLM not only retains its speech content modeling capability through masked speech prediction but also improves its potential for non-ASR tasks through speech denoising. In addition, WavLM employs gated relative position bias for the Transformer structure to better capture the sequence ordering of input speech. We also scale up the training dataset from 60k hours to 94k hours. WavLM Large achieves state-of-the-art performance on the SUPERB benchmark, and brings significant improvements to various speech processing tasks on their representative benchmarks. The code and pre-trained models are available at https://aka.ms/wavlm. Comment: Submitted to the Journal of Selected Topics in Signal Processing (JSTSP)