Suffix Retrieval-Augmented Language Modeling
Causal language modeling (LM) uses word history to predict the next word.
BERT, on the other hand, makes use of bi-directional word information in a
sentence to predict words at masked positions. While BERT is effective in
sequence encoding, it is non-causal by nature and is not designed for sequence
generation. In this paper, we propose a novel language model, SUffix
REtrieval-Augmented LM (SUREALM), that simulates a bi-directional contextual
effect in an autoregressive manner. SUREALM employs an embedding retriever to
search for training sentences in a data store that share similar word history
during sequence generation. In particular, the suffix portions of the retrieved
sentences mimic the "future" context. We evaluated our proposed model on the
DSTC9 spoken dialogue corpus and showed promising word perplexity reductions
on the validation and test sets compared to competitive baselines.
Comment: 5 pages, 1 figure. Submitted to ICASSP 202
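The retrieval step described above can be sketched as follows. This is a toy illustration, not the paper's implementation: it substitutes bag-of-words cosine similarity for SUREALM's learned embedding retriever, and the data store and sentences are made up.

```python
from collections import Counter
from math import sqrt

def embed(tokens):
    # Toy bag-of-words "embedding"; SUREALM uses a learned dense
    # sentence-embedding retriever, which this Counter stands in for.
    return Counter(tokens)

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_suffix(prefix, data_store):
    """Find the stored training sentence whose own prefix best matches
    the current word history, and return its suffix as pseudo-'future'
    context for the next generation step."""
    q = embed(prefix)
    best, best_score = None, -1.0
    for sent in data_store:
        score = cosine(q, embed(sent[:len(prefix)]))
        if score > best_score:
            best, best_score = sent, score
    return best[len(prefix):]  # the suffix mimicking future words

store = [
    "the cat sat on the mat".split(),
    "the dog chased the ball".split(),
]
print(retrieve_suffix("the cat".split(), store))  # -> ['sat', 'on', 'the', 'mat']
```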
Exploring an LM to generate Prolog Predicates from Mathematics Questions
Recently, there has been a surge in interest in NLP driven by ChatGPT.
ChatGPT, a transformer-based generative language model of substantial scale,
exhibits versatility in performing various tasks based on natural language.
Nevertheless, large language models often exhibit poor performance in solving
mathematics questions that require reasoning. Prior research has demonstrated
the effectiveness of chain-of-thought prompting in enhancing reasoning
capabilities. We now investigate whether fine-tuning a model to generate
Prolog code, a logic programming language, and subsequently passing this
code to a compiler can further improve accuracy. Consequently, we employ
chain-of-thought to fine-tune LLaMA7B as a baseline model and develop other
fine-tuned LLaMA7B models for the generation of Prolog code, Prolog code +
chain-of-thought, and chain-of-thought + Prolog code, respectively. The results
reveal that the Prolog generation model surpasses the baseline in performance,
while the combination generation models do not yield significant improvements.
The Prolog corpus based on GSM8K and the correspondingly fine-tuned Prolog
generation model based on LLaMA7B are released to the research community.
Comment: 6 pages, 3 figure
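The three fine-tuning target formats compared above can be sketched in Python. Everything here is illustrative: the field names, the example question, and the hand-written Prolog clause are not taken from the released corpus, and the arithmetic-only `check_with_compiler_stub` merely stands in for running generated code through a real Prolog compiler.

```python
import re

def build_targets(question, cot, prolog):
    """Construct the three fine-tuning target formats compared in the
    paper: Prolog only, Prolog + chain-of-thought, and
    chain-of-thought + Prolog (field names are illustrative)."""
    return {
        "prolog": prolog,
        "prolog_cot": prolog + "\n% reasoning:\n" + cot,
        "cot_prolog": cot + "\n" + prolog,
    }

def check_with_compiler_stub(prolog_clause, gold):
    """Stand-in for the compiler step: extract the arithmetic body of a
    clause of the form `answer(X) :- X is <expr>.` and evaluate it
    (arithmetic-only subset, for illustration)."""
    m = re.search(r"X is (.+)\.", prolog_clause)
    return m is not None and eval(m.group(1)) == gold

# Illustrative GSM8K-style item.
question = "Tom has 3 apples and buys 2 more. How many apples now?"
cot = "Tom starts with 3 apples, buys 2 more, so 3 + 2 = 5."
prolog = "answer(X) :- X is 3 + 2."

targets = build_targets(question, cot, prolog)
```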
Robust Unstructured Knowledge Access in Conversational Dialogue with ASR Errors
Performance of spoken language understanding (SLU) can be degraded with
automatic speech recognition (ASR) errors. We propose a novel approach to
improve SLU robustness by randomly corrupting clean training text with an ASR
error simulator, followed by self-correcting the errors and minimizing the
target classification loss in a joint manner. In the proposed error simulator,
we leverage confusion networks generated from an ASR decoder without human
transcriptions to generate a variety of error patterns for model training. We
evaluate our approach on the DSTC10 challenge targeted for knowledge-grounded
task-oriented conversational dialogues with ASR errors. Experimental results
show the effectiveness of our proposed approach, boosting the knowledge-seeking
turn detection (KTD) F1 significantly from 0.9433 to 0.9904. Knowledge cluster
classification is boosted from 0.7924 to 0.9333 in Recall@1. After knowledge
document re-ranking, our approach shows significant improvement in all
knowledge selection metrics, from 0.7358 to 0.7806 in Recall@1, from 0.8301 to
0.9333 in Recall@5, and from 0.7798 to 0.8460 in MRR@5 on the test set. In the
recent DSTC10 evaluation, our approach demonstrates significant improvement in
knowledge selection, boosting Recall@1 from 0.495 to 0.7144 compared to the
official baseline. Our source code is released on GitHub:
https://github.com/yctam/dstc10_track2_task2.git
Comment: 7 pages, 2 figures. Accepted at ICASSP 202
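The error-simulation idea above can be sketched as follows. This is a minimal illustration: the confusion alternatives and their probabilities are made up, whereas the paper derives them from confusion networks produced by a real ASR decoder without human transcriptions.

```python
import random

# Toy confusion network: for each word, ASR-decoder alternatives with
# posterior-like probabilities (values here are invented for illustration).
CONFUSIONS = {
    "book":  [("book", 0.7), ("look", 0.2), ("cook", 0.1)],
    "table": [("table", 0.8), ("cable", 0.2)],
}

def corrupt(tokens, rng):
    """Randomly corrupt clean training text with simulated ASR errors by
    sampling each word from its confusion-network alternatives; words
    with no entry are kept unchanged."""
    out = []
    for tok in tokens:
        alts = CONFUSIONS.get(tok, [(tok, 1.0)])
        words, probs = zip(*alts)
        out.append(rng.choices(words, weights=probs, k=1)[0])
    return out

rng = random.Random(0)
print(corrupt("book a table".split(), rng))
```

In the paper this corruption is applied on the fly during training, jointly with a self-correction objective and the target classification loss; the sketch covers only the sampling step.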
Bioavailable testosterone predicts a lower risk of Alzheimer’s disease in older men: a 1-year cohort study
Oral presentation, published or final version. The 15th Annual Research Conference of the Department of Medicine, The University of Hong Kong, Hong Kong, 16 January 2010. In Hong Kong Medical Journal, 2010, v. 16, suppl. 1, p. 16, abstract no. 1
PLASER: Pronunciation Learning via Automatic Speech Recognition
PLASER is a multimedia tool with instant feedback designed to teach English pronunciation to high-school students in Hong Kong whose mother tongue is Cantonese Chinese. The objective is to teach correct pronunciation, not to assess a student's overall pronunciation quality. Major challenges related to speech recognition technology include: allowance for non-native accents, reliable and corrective feedback, and visualization of errors
Dynamic Language Model Adaptation using Variational Bayes Inference
We propose an unsupervised dynamic language model (LM) adaptation framework using long-distance latent topic mixtures. The framework employs Latent Dirichlet Allocation (LDA), which models the latent topics of a document collection in an unsupervised and Bayesian fashion. In the LDA model, each word is modeled as a mixture of latent topics. Varying topics within a context can be modeled by re-sampling the mixture weights of the latent topics from a prior Dirichlet distribution. The model can be trained using the variational Bayes Expectation Maximization algorithm. During decoding, mixture weights of the latent topics are adapted dynamically using the hypotheses of previously decoded utterances. In our work, the LDA model is combined with the trigram language model using linear interpolation. We evaluated the approach on the CCTV episode of the RT04 Mandarin Broadcast News test set. Results show that the proposed approach reduces perplexity by up to 15.4% relative and the character error rate by 4.9% relative, depending on the size and setup of the training set
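The interpolation and the dynamic weight update can be sketched with toy numbers. All topics, vocabularies, and probabilities below are invented, and the simple count-based re-estimation of the mixture weights is a crude stand-in for the variational Bayes adaptation used in the paper.

```python
P_TOPIC = [  # p(w | topic k) for two toy topics: finance, sports
    {"stock": 0.5, "market": 0.4, "goal": 0.05, "match": 0.05},
    {"stock": 0.05, "market": 0.05, "goal": 0.5, "match": 0.4},
]

def interpolate(p_trigram, theta, word, lam=0.5):
    """Linear interpolation of the trigram LM with the LDA topic
    mixture: p(w|h) = lam * p_tri(w|h) + (1-lam) * sum_k theta_k p(w|k)."""
    p_lda = sum(t * P_TOPIC[k][word] for k, t in enumerate(theta))
    return lam * p_trigram + (1 - lam) * p_lda

def adapt_theta(decoded_words, alpha=1.0):
    """Re-estimate topic mixture weights from previously decoded
    hypotheses using Dirichlet-smoothed responsibility counts (a crude
    stand-in for the variational Bayes update)."""
    counts = [alpha] * len(P_TOPIC)
    for w in decoded_words:
        probs = [P_TOPIC[k].get(w, 1e-6) for k in range(len(P_TOPIC))]
        z = sum(probs)
        for k in range(len(P_TOPIC)):
            counts[k] += probs[k] / z
    total = sum(counts)
    return [c / total for c in counts]

# Decoded words from earlier utterances skew the mixture toward finance.
theta = adapt_theta(["stock", "market", "stock"])
p = interpolate(0.01, theta, "market")
```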