
    YATO: Yet Another deep learning based Text analysis Open toolkit

    We introduce YATO, an open-source, easy-to-use toolkit for text analysis with deep learning. Unlike existing heavily engineered toolkits and platforms, YATO is lightweight and user-friendly for researchers from cross-disciplinary areas. Designed with a hierarchical structure, YATO supports free combinations of three types of widely used features: 1) traditional neural networks (CNN, RNN, etc.); 2) pre-trained language models (BERT, RoBERTa, ELECTRA, etc.); and 3) user-customized neural features specified via a simple configuration file. Owing to its flexibility and ease of use, YATO can facilitate fast reproduction and refinement of state-of-the-art NLP models and promote cross-disciplinary applications of NLP techniques. The code, examples, and documentation are publicly available at https://github.com/jiesutd/YATO. A demo video is also available at https://youtu.be/tSjjf5BzfQg.
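    As a rough illustration of the kind of feature combination described above (a minimal sketch under assumed names, not YATO's actual interface), the following PyTorch snippet concatenates character-level CNN features with frozen pre-trained transformer embeddings for per-token classification:

        # Minimal sketch (not YATO's actual API): combine character-level CNN
        # features with frozen pre-trained transformer embeddings. The model
        # name, dimensions, and fusion scheme are illustrative assumptions.
        import torch
        import torch.nn as nn
        from transformers import AutoModel

        class CnnPlusBert(nn.Module):
            def __init__(self, n_chars=100, char_dim=30, char_hidden=50, n_labels=10):
                super().__init__()
                self.bert = AutoModel.from_pretrained("bert-base-cased")
                for p in self.bert.parameters():
                    p.requires_grad = False  # keep the pre-trained LM frozen
                self.char_emb = nn.Embedding(n_chars, char_dim)
                self.char_cnn = nn.Conv1d(char_dim, char_hidden, kernel_size=3, padding=1)
                self.classifier = nn.Linear(self.bert.config.hidden_size + char_hidden, n_labels)

            def forward(self, input_ids, attention_mask, char_ids):
                # char_ids: (batch, seq_len, chars_per_token), aligned with input_ids
                tok = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
                b, s, c = char_ids.shape
                ch = self.char_emb(char_ids.view(b * s, c)).transpose(1, 2)  # (b*s, dim, c)
                ch = torch.relu(self.char_cnn(ch)).max(dim=2).values         # pool over characters
                return self.classifier(torch.cat([tok, ch.view(b, s, -1)], dim=-1))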

    Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models

    Large Language Models (LLMs) have achieved remarkable performance on objective tasks such as open-domain question answering and mathematical reasoning, which can often be solved by recalling learned factual knowledge or through chain-of-thought style reasoning. However, we find that LLMs still perform unsatisfactorily on subjective tasks such as metaphor recognition and dark humor detection. Compared to objective tasks, subjective tasks depend more on interpretation and emotional response than on a universally accepted reasoning pathway. Based on these task characteristics and the strong dialogue-generation capabilities of LLMs, we propose RiC (Reasoning in Conversation), a method that solves subjective tasks through dialogue simulation. The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales, thereby surfacing potentially useful knowledge behind the dialogues for giving the final answers. We evaluate both API-based and open-source LLMs, including GPT-4, ChatGPT, and OpenChat, across twelve tasks. Experimental results show that RiC yields significant improvements over various baselines.
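    A minimal sketch of the two-stage idea, assuming a hypothetical complete() wrapper around any LLM API (the prompt wording is illustrative, not the paper's exact prompts):

        # Sketch of dialogue-simulation prompting in the spirit of RiC:
        # first simulate a conversation about the input, then answer with
        # the simulated dialogue as added context. `complete` is a
        # hypothetical stand-in for an LLM client call.
        def complete(prompt: str) -> str:
            raise NotImplementedError("plug in an LLM client here")

        def answer_with_dialogue(task_instruction: str, text: str) -> str:
            dialogue = complete(
                "Simulate a short, natural two-person conversation that "
                f"discusses the following text:\n{text}"
            )
            return complete(
                f"{task_instruction}\n\nText: {text}\n\n"
                f"A conversation about the text:\n{dialogue}\n\n"
                "Using any useful context from the conversation, give the final answer:"
            )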

    Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting

    We explore the ability of large language models (LLMs) to act as speech recognition post-processors that perform rescoring and error correction. Our first focus is on instruction prompting that lets LLMs perform these tasks without fine-tuning, for which we evaluate different prompting schemes, both zero- and few-shot in-context learning, and a novel task-activating prompting method that combines causal instructions and demonstrations to expand the usable context window. Next, we show that rescoring by in-context learning alone with frozen LLMs achieves results competitive with rescoring by domain-tuned LMs, using a pretrained first-pass recognition system and rescoring its output on two out-of-domain tasks (ATIS and WSJ). By combining prompting techniques with fine-tuning, we achieve error rates below the N-best oracle level, showcasing the generalization power of LLMs. (Accepted to IEEE Automatic Speech Recognition and Understanding (ASRU) 2023; 8 pages.)
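    The sketch below illustrates generic few-shot prompting for N-best error correction; the prompt wording is an assumption for illustration, not the paper's exact task-activating prompts:

        # Generic illustration of prompting an LLM to correct an ASR N-best
        # list with in-context demonstrations. The instruction text is a
        # hypothetical example, not taken from the paper.
        def build_correction_prompt(nbest, demonstrations=()):
            parts = ["Instruction: pick or rewrite the best transcript from the "
                     "ASR hypotheses below, fixing recognition errors."]
            for demo_nbest, demo_truth in demonstrations:   # few-shot examples
                parts.append("Hypotheses:\n" + "\n".join(demo_nbest))
                parts.append(f"Corrected: {demo_truth}")
            parts.append("Hypotheses:\n" + "\n".join(nbest))
            parts.append("Corrected:")
            return "\n\n".join(parts)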

    Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks

    In the realm of spoken language understanding (SLU), numerous natural language understanding (NLU) methodologies have been adapted by supplying large language models (LLMs) with transcribed speech instead of conventional written text. In real-world scenarios, prior to input into an LLM, an automatic speech recognition (ASR) system generates an output transcript hypothesis, whose inherent errors can degrade downstream SLU tasks. Here we introduce a method that utilizes the ASR system's lattice output instead of relying solely on the top hypothesis, aiming to encapsulate speech ambiguities and improve SLU outcomes. Our in-context learning experiments, covering spoken question answering and intent classification, show that word confusion networks derived from lattices make the LLM resilient to noisy speech transcripts, bridging the SLU performance gap between using the top ASR hypothesis and an oracle upper bound. Additionally, we examine the LLM's robustness under varying ASR performance conditions and scrutinize which aspects of in-context learning prove most influential. (Accepted to ICASSP 2024.)
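    One plausible way to expose a word confusion network to an LLM is to flatten each slot of (word, posterior) alternatives into prompt text; the encoding below is a hedged illustration, not necessarily the paper's exact format:

        # Sketch of linearizing a word confusion network (WCN) for a prompt.
        # Each slot is a list of (word, posterior) alternatives; keep the
        # top-k per slot and render them inline with their scores.
        def linearize_wcn(wcn, top_k=3):
            slots = []
            for alternatives in wcn:
                best = sorted(alternatives, key=lambda wp: -wp[1])[:top_k]
                slots.append("(" + " | ".join(f"{w}:{p:.2f}" for w, p in best) + ")")
            return " ".join(slots)

        wcn = [[("flights", 0.6), ("lights", 0.3), ("rights", 0.1)],
               [("to", 0.9), ("two", 0.1)],
               [("boston", 0.8), ("austin", 0.2)]]
        print(linearize_wcn(wcn))
        # (flights:0.60 | lights:0.30 | rights:0.10) (to:0.90 | two:0.10) (boston:0.80 | austin:0.20)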

    Psychometric properties of the Chinese version of the preoperative assessment of readiness tool among surgical patients

    Background: The evaluation of patients' surgical readiness plays an important role in clinical care. Preoperative readiness assessment is needed to identify inadequate preparation among surgical patients, which guides interventions to improve patients' preoperative readiness. However, there is a paucity of high-quality tools for evaluating the surgical readiness of patients in China. The purpose of this study was to translate the Preoperative Assessment of Readiness Tool (PART) into Chinese and determine the reliability and validity of the Chinese version among surgical patients. Methods: Using a standard forward- and back-translation method, the original English version of the PART was translated into Chinese. A convenience sample of 210 surgical patients was recruited from 6 hospitals in Zhejiang Province to test the psychometric properties of the scale, including internal consistency, split-half reliability, content validity, structural validity, and floor/ceiling effects. Results: A total of 194 patients (92%) completed the questionnaires. For the full scale, the Chinese version of the PART achieved a Cronbach's alpha of 0.948 and a McDonald's omega of 0.947. The estimated odd-even split-half reliability was 0.959. The scale-level content validity index was 0.867, and the item-level content validity indices ranged from 0.83 to 1.0. Confirmatory factor analysis (CFA) supported a two-factor model (χ2 = 510.96; df = 86; p < 0.001; root mean square error of approximation = 0.08), with no floor/ceiling effect. Conclusion: The Chinese version of the PART demonstrated acceptable reliability and validity among surgical patients. It can be used to evaluate patients' preoperative preparation and help health professionals provide proper preoperative support.
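    For readers unfamiliar with the reliability statistics reported above, the snippet below computes Cronbach's alpha and a Spearman-Brown-corrected odd-even split-half coefficient on a small fabricated response matrix:

        # Worked example of two reliability statistics on fabricated data
        # (rows = respondents, columns = scale items).
        import numpy as np

        def cronbach_alpha(x):
            x = np.asarray(x, dtype=float)
            k = x.shape[1]
            item_vars = x.var(axis=0, ddof=1).sum()
            total_var = x.sum(axis=1).var(ddof=1)
            return k / (k - 1) * (1 - item_vars / total_var)

        def odd_even_split_half(x):
            x = np.asarray(x, dtype=float)
            odd, even = x[:, 0::2].sum(axis=1), x[:, 1::2].sum(axis=1)
            r = np.corrcoef(odd, even)[0, 1]
            return 2 * r / (1 + r)   # Spearman-Brown correction

        responses = np.array([[4, 5, 4, 4], [2, 2, 3, 2], [5, 5, 4, 5], [3, 3, 3, 4]])
        print(cronbach_alpha(responses), odd_even_split_half(responses))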

    Media use degree and depression: A latent profile analysis from Chinese residents

    Background: Previous studies have emphasized the media as an essential channel for understanding information about depression. However, they have not divided groups according to degree of media use to study differences in depression. Therefore, this study aims to explore the influence of media use on depression and the factors influencing depression among people with different degrees of media use. Methods: Based on seven items related to media use, a total of 11,031 respondents were categorized by frequency of media use using latent profile analysis (LPA). Multiple linear regression analyses were then conducted to estimate the association between media use and depression in each group. Finally, factors influencing depression were explored separately for each group. Results: Respondents were classified into three groups: low-frequency media use (9.7%), general media use (67.1%), and high-frequency media use (23.2%). Compared with the general media use group, the low-frequency (β = 0.019, p = 0.044) and high-frequency (β = 0.238, p < 0.001) groups were significantly associated with depression. The factors influencing depression differed among the low-frequency, general, and high-frequency media use groups. Conclusion: The government and relevant departments should develop targeted strategies to improve the overall health status of people with different degrees of media use.
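    Latent profile analysis over continuous indicators is commonly implemented as a Gaussian mixture model; the sketch below (fabricated media-use scores, scikit-learn's GaussianMixture as a stand-in for dedicated LPA software) selects the number of profiles by BIC:

        # LPA-style profiling via a diagonal-covariance Gaussian mixture
        # on fabricated 1-5 scores for seven media-use items.
        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(0)
        items = rng.integers(1, 6, size=(500, 7)).astype(float)

        best = min(
            (GaussianMixture(n_components=k, covariance_type="diag",
                             random_state=0).fit(items) for k in (2, 3, 4)),
            key=lambda m: m.bic(items),   # choose the profile count by BIC
        )
        profiles = best.predict(items)    # profile membership per respondent
        print(best.n_components, np.bincount(profiles))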


    Research progress of 3D printed poly (ether ether ketone) in the reconstruction of craniomaxillofacial bone defects

    Bone defects in the craniomaxillofacial region remain a clinical challenge, as they can cause significant physiological dysfunction and psychological distress, and the complex, unique anatomy of craniomaxillofacial bones complicates their repair. These critical-sized defects require bone grafts or substitutes for effective reconstruction. However, current biomaterials and methods have limitations in meeting clinical demands for structural reinforcement, mechanical support, excellent biological performance, and aesthetically pleasing reconstruction of the facial structure. These drawbacks have created a growing need for novel materials and technologies. 3D printing offers significant advantages in addressing these issues, as demonstrated by the fabrication of patient-specific bioactive constructs with controlled structural design for complex bone defects. Among the materials used, poly (ether ether ketone) (PEEK) is gaining recognition as a feasible option for customized structures that closely resemble natural bone. It has proven to be an excellent, conformable, and 3D-printable material with the potential to replace traditional autografts and titanium implants, although its biological inertness poses certain limitations. This review therefore summarizes the distinctive features of craniomaxillofacial bones and current methods of bone reconstruction, then focuses on the growing application of 3D-printed PEEK constructs in this field and recent modifications that improve their mechanical properties, biological performance, and antibacterial capacity. Exploring the potential of 3D-printed PEEK is expected to lead to more cost-effective, biocompatible, and personalized treatment of craniomaxillofacial bone defects in clinical applications.

    Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition

    We propose a neural language modeling system based on low-rank adaptation (LoRA) for speech recognition output rescoring. Although pretrained language models (LMs) like BERT have shown superior performance in second-pass rescoring, the high computational cost of scaling up the pretraining stage and of adapting the pretrained models to specific domains limits their practical use in rescoring. Here we present a method based on low-rank decomposition to train a rescoring BERT model and adapt it to new domains using only a fraction (0.08%) of the pretrained parameters. The inserted low-rank matrices are optimized through a discriminative training objective along with a correlation-based regularization loss. The proposed low-rank adaptation Rescore-BERT (LoRB) architecture is evaluated on LibriSpeech and internal datasets, with training times reduced by factors of 3.6 to 5.4. (Accepted to IEEE ASRU 2023; 8 pages.)
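    A minimal LoRA layer in PyTorch conveys the core idea (the rank and scaling below are illustrative defaults, not the paper's exact LoRB configuration): the frozen pretrained weight is augmented with a trainable low-rank update, so only a small fraction of parameters is adapted:

        # Minimal LoRA wrapper around a frozen nn.Linear: the output is
        # base(x) plus a scaled low-rank update x @ A^T @ B^T, so only
        # r * (in_features + out_features) parameters train per layer.
        import torch
        import torch.nn as nn

        class LoRALinear(nn.Module):
            def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
                super().__init__()
                self.base = base
                for p in self.base.parameters():
                    p.requires_grad = False   # freeze the pretrained weights
                self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
                self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as identity
                self.scale = alpha / r

            def forward(self, x):
                return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)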