YATO: Yet Another deep learning based Text analysis Open toolkit
We introduce YATO, an open-source, easy-to-use toolkit for text analysis with
deep learning. Unlike existing heavily engineered toolkits and
platforms, YATO is lightweight and user-friendly for researchers from
cross-disciplinary areas. Designed in a hierarchical structure, YATO supports
free combinations of three types of widely used features including 1)
traditional neural networks (CNN, RNN, etc.); 2) pre-trained language models
(BERT, RoBERTa, ELECTRA, etc.); and 3) user-customized neural features via a
simple configurable file. Owing to its flexibility and ease of use, YATO can
facilitate fast reproduction and refinement of state-of-the-art NLP models,
and promote cross-disciplinary applications of
NLP techniques. The code, examples, and documentation are publicly available at
https://github.com/jiesutd/YATO. A demo video is also available at
https://youtu.be/tSjjf5BzfQg
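The idea of freely combining the three feature families through a flat configuration file can be sketched as below. This is illustrative only: the keys and values are hypothetical stand-ins, not YATO's actual configuration format, which is documented in the repository linked above.

```python
# Hypothetical configuration sketch (NOT YATO's real format): a flat mapping
# that mixes the three feature families the toolkit supports.
config = {
    "task": "sequence_labeling",         # hypothetical task key
    "word_feature": "RNN",               # 1) traditional neural network
    "char_feature": "CNN",
    "pretrained_lm": "bert-base-cased",  # 2) pre-trained language model
    "custom_features": ["pos_tag"],      # 3) user-customized neural features
}

def describe(cfg):
    """Summarize which of the three feature families a config enables."""
    families = []
    if cfg.get("word_feature") or cfg.get("char_feature"):
        families.append("traditional")
    if cfg.get("pretrained_lm"):
        families.append("pretrained-LM")
    if cfg.get("custom_features"):
        families.append("custom")
    return families

print(describe(config))  # -> ['traditional', 'pretrained-LM', 'custom']
```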
Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models
Large Language Models (LLMs) have achieved remarkable performance in
objective tasks such as open-domain question answering and mathematical
reasoning, which can often be solved through recalling learned factual
knowledge or chain-of-thought style reasoning. However, we find that LLMs
still perform unsatisfactorily on subjective tasks such as metaphor
recognition and dark humor detection. Compared to objective tasks,
subjective tasks focus more on interpretation or emotional response rather than
a universally accepted reasoning pathway. Based on the characteristics of the
tasks and the strong dialogue-generation capabilities of LLMs, we propose RiC
(Reasoning in Conversation), a method that focuses on solving subjective tasks
through dialogue simulation. The motivation of RiC is to mine useful contextual
information by simulating dialogues instead of supplying chain-of-thought style
rationales, thereby offering potentially useful knowledge behind dialogues for
giving the final answers. We evaluate both API-based and open-source LLMs
including GPT-4, ChatGPT, and OpenChat across twelve tasks. Experimental
results show that RiC can yield significant improvement compared with various
baselines.
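The two-stage procedure described above (first simulate a dialogue about the input, then answer conditioned on it) can be sketched as follows. `call_llm` is a hypothetical stand-in for any chat-completion API, and the prompt wording is invented for illustration, not the paper's actual prompts.

```python
# Sketch of the RiC (Reasoning in Conversation) idea: mine contextual
# information by simulating a dialogue, then answer using that dialogue
# as context instead of a chain-of-thought rationale.
def call_llm(prompt: str) -> str:
    # Stub: a real system would call GPT-4, ChatGPT, OpenChat, etc.
    return f"[model output for: {prompt[:40]}...]"

def reasoning_in_conversation(task_input: str, n_turns: int = 3) -> str:
    # Stage 1: simulate a short dialogue that surfaces background knowledge.
    dialogue = call_llm(
        f"Simulate a {n_turns}-turn dialogue between two speakers "
        f"discussing the following text: {task_input}"
    )
    # Stage 2: answer the subjective task conditioned on the dialogue.
    return call_llm(
        f"Dialogue:\n{dialogue}\n\n"
        f"Based on the dialogue above, is the following a metaphor? {task_input}"
    )

answer = reasoning_in_conversation("Time is a thief.")
```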
Generative Speech Recognition Error Correction with Large Language Models and Task-Activating Prompting
We explore the ability of large language models (LLMs) to act as speech
recognition post-processors that perform rescoring and error correction. Our
first focus is on instruction prompting to let LLMs perform these tasks without
fine-tuning, for which we evaluate different prompting schemes, both zero- and
few-shot in-context learning, and a novel task-activating prompting method that
combines causal instructions and demonstrations to increase its context windows.
Next, we show that rescoring only by in-context learning with frozen LLMs
achieves results that are competitive with rescoring by domain-tuned LMs, using
a pretrained first-pass recognition system and rescoring output on two
out-of-domain tasks (ATIS and WSJ). By combining prompting techniques with
fine-tuning we achieve error rates below the N-best oracle level, showcasing
the generalization power of the LLMs.
Comment: Accepted to IEEE Automatic Speech Recognition and Understanding
(ASRU) 2023. 8 pages. 2nd version revised from Sep 29th's version.
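Rescoring by in-context learning amounts to formatting the recognizer's N-best list into a prompt, optionally with few-shot demonstrations, and letting a frozen LLM pick or correct the transcript. The sketch below shows one such prompt builder; the template, hypotheses, and demonstrations are invented for illustration and are not taken from the paper.

```python
# Sketch: build a few-shot prompt for N-best rescoring / error correction
# with a frozen LLM (no fine-tuning). Template is a hypothetical example.
def build_rescoring_prompt(nbest, examples=()):
    lines = ["Choose or correct the most likely transcript from the "
             "hypotheses below."]
    # Few-shot demonstrations: (hypothesis list, reference transcript) pairs.
    for hyps, gold in examples:
        lines.append("Hypotheses: " + " | ".join(hyps))
        lines.append("Answer: " + gold)
    # The query: the current utterance's N-best list, answer left blank.
    lines.append("Hypotheses: " + " | ".join(nbest))
    lines.append("Answer:")
    return "\n".join(lines)

prompt = build_rescoring_prompt(
    nbest=["flights to boston", "fights to boston", "flights to bostun"],
    examples=[(["show me fares", "show me fairs"], "show me fares")],
)
print(prompt)
```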
Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks
In the realm of spoken language understanding (SLU), numerous natural
language understanding (NLU) methodologies have been adapted by supplying large
language models (LLMs) with transcribed speech instead of conventional written
text. In real-world scenarios, prior to input into an LLM, an automated speech
recognition (ASR) system generates an output transcript hypothesis, where
inherent errors can degrade subsequent SLU tasks. Here we introduce a method
that utilizes the ASR system's lattice output instead of relying solely on the
top hypothesis, aiming to encapsulate speech ambiguities and enhance SLU
outcomes. Our in-context learning experiments, covering spoken question
answering and intent classification, underline the LLM's resilience to noisy
speech transcripts with the help of word confusion networks from lattices,
bridging the SLU performance gap between using the top ASR hypothesis and an
oracle upper bound. Additionally, we delve into the LLM's robustness to varying
ASR performance conditions and scrutinize the aspects of in-context learning
which prove the most influential.
Comment: Accepted to ICASSP 202
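A word confusion network compresses the ASR lattice into a sequence of slots, each holding weighted word alternatives; it can then be serialized into prompt text so the LLM sees the speech ambiguities rather than only the 1-best hypothesis. The serialization format below is an assumption for illustration, not the paper's exact scheme.

```python
# Toy word confusion network: one slot per position, each a list of
# (word, posterior probability) alternatives from the ASR lattice.
wcn = [
    [("i", 0.9), ("eye", 0.1)],
    [("want", 0.7), ("went", 0.3)],
    [("coffee", 0.6), ("copy", 0.4)],
]

def serialize_wcn(wcn):
    # Linearize slots as "(word:prob/word:prob)" tokens for the prompt.
    slots = []
    for alternatives in wcn:
        slots.append("(" + "/".join(f"{w}:{p:.1f}" for w, p in alternatives) + ")")
    return " ".join(slots)

def one_best(wcn):
    # The top ASR hypothesis: highest-probability word in each slot.
    return " ".join(max(alts, key=lambda wp: wp[1])[0] for alts in wcn)

print(one_best(wcn))       # -> i want coffee
print(serialize_wcn(wcn))  # -> (i:0.9/eye:0.1) (want:0.7/went:0.3) (coffee:0.6/copy:0.4)
```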
Psychometric properties of the Chinese version of the preoperative assessment of readiness tool among surgical patients
Background: The evaluation of patients' surgical readiness plays an important role in clinical care. Preoperative readiness assessment is needed to identify inadequacies among surgical patients and to guide interventions that improve patients' preoperative readiness. However, there is a paucity of high-quality tools that evaluate the surgical readiness of patients in China. The purpose of this study is to translate the Preoperative Assessment of Readiness Tool (PART) into Chinese and determine the reliability and validity of the Chinese version among surgical patients. Methods: Using a standard forward-backward translation method, the original English version of PART was translated into Chinese. A convenience sample of 210 surgical patients was recruited from 6 hospitals in Zhejiang Province to test the psychometric properties of the scale, including internal consistency, split-half reliability, content validity, structural validity, and floor/ceiling effects. Results: A total of 194 patients (92%) completed the questionnaires. The Chinese version of PART achieved a Cronbach's alpha of 0.948 and a McDonald's omega coefficient of 0.947 for the full scale. The estimated odd-even split-half reliability was 0.959. The scale-level content validity index was 0.867, and the item-level content validity indices ranged from 0.83 to 1.0. Confirmatory factor analysis (CFA) supported a two-factor model (χ2 = 510.96; df = 86; p < 0.001; root mean square error of approximation = 0.08), with no floor/ceiling effect. Conclusion: The Chinese version of PART demonstrated acceptable reliability and validity among surgical patients. It can be used to evaluate patients' preoperative preparation and help health professionals provide appropriate preoperative support.
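The internal-consistency statistic reported above is Cronbach's alpha: k/(k−1) times one minus the ratio of summed item variances to the variance of the total score. A minimal sketch of its computation on invented toy item scores (not the study's data):

```python
# Cronbach's alpha for internal consistency, computed from scratch.
def cronbach_alpha(items):
    # items: one list of scores per item, all of equal length
    # (one score per respondent).
    k = len(items)
    n = len(items[0])

    def var(xs):
        # Sample variance (n - 1 denominator).
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Toy data: three items, four respondents, items perfectly correlated,
# so alpha should be 1.0.
items = [[1, 2, 3, 4], [2, 3, 4, 5], [1, 2, 3, 4]]
print(round(cronbach_alpha(items), 3))  # -> 1.0
```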
Media use degree and depression: A latent profile analysis from Chinese residents
Background: Previous studies have emphasized the media as an essential channel for understanding information about depression. However, they have not divided groups according to the degree of media use to study differences in depression. Therefore, this study aims to explore the influence of media use on depression and the factors influencing depression among people with different degrees of media use. Methods: Based on seven items related to media use, a total of 11,031 respondents were categorized by frequency of media use using latent profile analysis (LPA). Next, multiple linear regression analyses were conducted to examine depression among people with different degrees of media use. Finally, the factors influencing depression in each media use group were explored separately. Results: All respondents were classified into three groups: media use low-frequency (9.7%), media use general (67.1%), and media use high-frequency (23.2%). Compared with the media use general group, the media use low-frequency (β = 0.019, p = 0.044) and media use high-frequency (β = 0.238, p < 0.001) groups were significantly associated with depression. The factors influencing depression differed among the media use low-frequency, general, and high-frequency groups. Conclusion: The government and relevant departments should develop targeted strategies to improve the overall health status of people with different degrees of media use.
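Latent profile analysis groups respondents by fitting a finite mixture model to their item responses. As a rough, dependency-free stand-in, the sketch below groups invented media-use frequency scores into three profiles with 1-D k-means; real LPA estimates a Gaussian mixture and compares fit indices across candidate profile counts.

```python
# 1-D k-means as a simplified stand-in for latent profile analysis:
# partition respondents' media-use scores into k profiles.
def kmeans_1d(xs, k=3, iters=50):
    # Spread initial centers across the sorted data.
    centers = sorted(xs)[:: max(1, len(xs) // k)][:k]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        # Assign each score to its nearest center.
        for x in xs:
            groups[min(range(k), key=lambda i: abs(x - centers[i]))].append(x)
        # Move each center to its group's mean (keep old center if empty).
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return sorted(centers)

# Invented frequency scores: a low, a middle, and a high cluster.
scores = [1, 1, 2, 4, 4, 5, 5, 8, 9, 9]
print(kmeans_1d(scores))
```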
Research progress of 3D printed poly (ether ether ketone) in the reconstruction of craniomaxillofacial bone defects
The clinical challenge of bone defects in the craniomaxillofacial region, which can lead to significant physiological dysfunction and psychological distress, persists due to the complex and unique anatomy of craniomaxillofacial bones. These critical-sized defects require the use of bone grafts or substitutes for effective reconstruction. However, current biomaterials and methods have specific limitations in meeting the clinical demands for structural reinforcement, mechanical support, exceptional biological performance, and aesthetically pleasing reconstruction of the facial structure. These drawbacks have led to a growing need for novel materials and technologies. The continuing development of 3D printing offers significant advantages for addressing these issues, as demonstrated by the fabrication of patient-specific bioactive constructs with controlled structural design for complex bone defects in medical applications. Poly(ether ether ketone) (PEEK), among the materials used, is gaining recognition as a feasible material for fabricating customized structures that closely resemble natural bone. It has proven to be an excellent, conformable, and 3D-printable material with the potential to replace traditional autografts and titanium implants. However, its biological inertness poses certain limitations. Therefore, this review summarizes the distinctive features of craniomaxillofacial bones and current methods for bone reconstruction, and then focuses on the growing application of 3D-printed PEEK constructs in this field, with an update on advanced modifications for improved mechanical properties, biological performance, and antibacterial capacity. Exploring the potential of 3D-printed PEEK is expected to lead to more cost-effective, biocompatible, and personalized treatment of craniomaxillofacial bone defects in clinical applications.
Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition
We propose a neural language modeling system based on low-rank adaptation
(LoRA) for speech recognition output rescoring. Although pretrained language
models (LMs) like BERT have shown superior performance in second-pass
rescoring, the high computational cost of scaling up the pretraining stage and
adapting the pretrained models to specific domains limits their practical use in
rescoring. Here we present a method based on low-rank decomposition to train a
rescoring BERT model and adapt it to new domains using only a fraction (0.08%)
of the pretrained parameters. These inserted matrices are optimized through a
discriminative training objective along with a correlation-based regularization
loss. The proposed low-rank adaptation Rescore-BERT (LoRB) architecture is
evaluated on LibriSpeech and internal datasets with decreased training times by
factors between 3.6 and 5.4.
Comment: Accepted to IEEE ASRU 2023. Internal review approved. 2nd version
revised with Andreas and Huck; the first version is from Sep 29th. 8 pages.
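The core of low-rank adaptation is to freeze a pretrained weight matrix W (d_out × d_in) and learn only a rank-r update B·A, shrinking the trainable parameter count per matrix from d_out·d_in to r·(d_out + d_in). The sketch below uses illustrative dimensions; note the 0.08% figure in the abstract refers to the whole model, not a single matrix.

```python
# Fraction of a single weight matrix's parameters that LoRA trains:
# r * (d_out + d_in) trainable vs. d_out * d_in frozen.
def lora_param_fraction(d_out, d_in, r):
    return r * (d_out + d_in) / (d_out * d_in)

def apply_lora(W, A, B, alpha=1.0):
    # Effective weight W + alpha * (B @ A), with plain nested lists.
    # W: d_out x d_in (frozen); A: r x d_in and B: d_out x r (trainable).
    d_out, d_in, r = len(W), len(W[0]), len(A)
    return [[W[i][j] + alpha * sum(B[i][k] * A[k][j] for k in range(r))
             for j in range(d_in)] for i in range(d_out)]

# A BERT-base-sized projection (768 x 768) at rank 8, for illustration.
frac = lora_param_fraction(768, 768, 8)
print(f"{frac:.4%} of the full matrix's parameters")  # -> 2.0833% ...
```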