Generative Input: Towards Next-Generation Input Methods Paradigm
Since the release of ChatGPT, generative models have achieved tremendous
success and become the de facto approach for various NLP tasks. However, their
application in the field of input methods remains under-explored. Many neural
network approaches have been applied to the construction of Chinese input
method engines (IMEs). Previous research often assumed that the input pinyin was
correct and focused on the Pinyin-to-Character (P2C) task, which significantly falls
short of meeting users' demands. Moreover, previous research could not leverage
user feedback to optimize the model and provide personalized results. In this
study, we propose a novel Generative Input paradigm named GeneInput. It uses
prompts to handle all input scenarios and other intelligent auxiliary input
functions, optimizing the model with user feedback to deliver personalized
results. The results demonstrate that we have achieved state-of-the-art
performance for the first time in the Full-mode Key-sequence to
Characters (FK2C) task. We propose a novel reward model training method that
eliminates the need for additional manual annotations, and its performance
surpasses GPT-4 in tasks involving intelligent association and conversational
assistance. Compared to traditional paradigms, GeneInput not only demonstrates
superior performance but also exhibits enhanced robustness, scalability, and
online learning capabilities.
Bipartite Flat-Graph Network for Nested Named Entity Recognition
In this paper, we propose a novel bipartite flat-graph network (BiFlaG) for
nested named entity recognition (NER), which contains two subgraph modules: a
flat NER module for outermost entities and a graph module for all the entities
located in inner layers. Bidirectional LSTM (BiLSTM) and graph convolutional
network (GCN) are adopted to jointly learn flat entities and their inner
dependencies. Different from previous models, which only consider the
unidirectional delivery of information from innermost layers to outer ones (or
outside-to-inside), our model effectively captures the bidirectional
interaction between them. We first use the entities recognized by the flat NER
module to construct an entity graph, which is fed to the next graph module. The
richer representation learned from the graph module carries the dependencies of
inner entities and can be exploited to improve outermost entity predictions.
Experimental results on three standard nested NER datasets demonstrate that our
BiFlaG outperforms previous state-of-the-art models. Comment: Accepted by ACL202
Translating science fiction in a CAT tool: machine translation and segmentation settings
There is increasing interest in machine assistance for literary translation, but research on how computer-assisted translation (CAT) tools and machine translation (MT) combine in the translation of literature is still incipient, especially for non-European languages. This article presents two exploratory studies where English-to-Chinese translators used neural MT to translate science fiction short stories in Trados Studio. One of the studies compares post-editing with a ‘no MT’ condition. The other examines two ways of presenting the texts on screen for post-editing, namely by segmenting them into paragraphs or into sentences. We collected the data with the Qualitivity plugin for Trados Studio and describe a method for analysing data collected with this plugin through the translation process research database of the Center for Research in Translation and Translation Technology (CRITT). While post-editing required less technical effort, we did not find MT to be appreciably timesaving. Paragraph segmentation was associated with less post-editing effort on average, though with high translator variability. We discuss the results in the light of broader concepts, such as status-quo bias, and call for more research on the different ways in which MT may assist literary translation, including its use for comparison purposes or, as mentioned by a participant, for ‘inspiration’.
An Efficient tone classifier for speech recognition of Cantonese.
by Cheng Yat Ho.
Thesis (M.Phil.)--Chinese University of Hong Kong, 1991.
Bibliography: leaves 106-108.
Chapter 1: Introduction
Chapter 2: Preliminary Considerations
  2.1 Tone System of Cantonese
  2.2 Tone Classification Systems
  2.3 Design of a Speech Corpus
Chapter 3: Feature Parameters for Tone Classification
  3.1 Methodology
  3.2 Endpoint Detection and Time Alignment
  3.3 Pitch
    3.3.1 Pitch Profile Extraction
    3.3.2 Evaluation of Pitch Profile
    3.3.3 Feature Parameters Derived from Pitch Profile
  3.4 Duration
  3.5 Energy
    3.5.1 Energy Profile Extraction
    3.5.2 Evaluation of Energy Profile
  3.6 Summary
Chapter 4: Implementation of the Tone Classification System
  4.1 Intrinsic Pitch Estimation
  4.2 The Classifier
    4.2.1 Neural Network
    4.2.2 Post-Processing Unit
Chapter 5: Performance Evaluation on the Tone Classification System
  5.1 Single Speaker Tone Classification
  5.2 Multi-Speaker and Speaker Independent Tone Classification
    5.2.1 Classification with no Phonetic Information
    5.2.2 Classification with Known Final Consonants
  5.3 Confidence Improvement of the Recognition Results
  5.4 Summary
Chapter 6: Conclusions and Discussions
References
Appendix A: Vocabulary of the Speech Corpus
Appendix B: Statistics of the Pitch Profiles
Appendix C: Statistics of the Energy Profiles