2,158 research outputs found

From Spelling to Grammar: A New Framework for Chinese Grammatical Error Correction

Authors: Wu Xiuyu; Wu Yunfang
Publication date: 03/11/2022

Chinese Grammatical Error Correction (CGEC) aims to generate a correct sentence from an erroneous sequence in which different kinds of errors are mixed. This paper divides the CGEC task into two steps: spelling error correction and grammatical error correction. Specifically, we propose a novel zero-shot approach for spelling error correction, which is simple but effective and obtains high precision, avoiding the error accumulation of the pipeline structure. To handle grammatical error correction, we design part-of-speech (POS) features and semantic class features to enhance the neural network model, and propose an auxiliary task to predict the POS sequence of the target sentence. Our proposed framework achieves a 42.11 F0.5 score on the CGEC dataset without using any synthetic data or data augmentation methods, outperforming the previous state of the art by a wide margin of 1.30 points. Moreover, our model produces meaningful POS representations that capture different POS words and convey reasonable POS transition rules.

arXiv.org e-Print Archive

Syntactic Frequency and Sentence Processing in Standard Indonesian: Data from agrammatic aphasia and ERP

Author: Jap Bernard
Publication venue: 'University of Groningen Press'
Publication date: 01/01/2020

Aphasia is a language impairment caused by focal brain damage affecting multiple channels of language. Studies have shown that one third of stroke patients show some form of aphasia. One of the key characteristics of aphasia is that, in most types, patients show deficits in sentence processing. So much so that many aphasia assessment tools use sentence comprehension or production tasks to determine aphasia type or severity, or to provide a more detailed profile of the symptoms. Individuals with aphasia are known to have difficulty processing sentences with a derived or non-canonical structure, such as the passive. While numerous studies have discussed the morphosyntactic basis of this deficit, other aspects of sentence processing, such as the frequency of the sentence structures, are often neglected. Syntactic frequency may well affect sentence processing, as a large body of research has shown the impact of word-level frequency on language processing. Could the impairment in processing non-canonical sentences be related to the low frequency of these sentences?

This thesis examines sentence processing in Standard Indonesian, a language in which the passive occurs at a rate comparable to that of active sentences. Individuals with aphasia and controls were tested with sentence comprehension and production tasks, and an event-related potential study of sentence processing in healthy adults was conducted. We found the passive to be unimpaired in aphasic individuals, and we did not find any observable processing differences between the active and the passive in the neuroimaging experiment.

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

A Named Entity Recognition Method Enhanced with Lexicon Information and Text Local Feature

Authors: Gao Chang; Liu He; Liu Yujue; Ma Yuekun; Zhang Dezheng
Publication venue: Faculty of Mechanical Engineering in Slavonski Brod; Faculty of Electrical Engineering, Computer Science and Information Technology Osijek; Faculty of Civil Engineering in Osijek
Publication date: 01/01/2023

At present, Named Entity Recognition (NER) is one of the fundamental tasks for extracting knowledge from traditional Chinese medicine (TCM) texts. The variable length of TCM entities and the linguistic characteristics of TCM texts make TCM entity boundaries ambiguous. In addition, better extraction and exploitation of local text features can improve the accuracy of named entity recognition. In this paper, we propose a TCM NER model enhanced with lexicon information and local text features. In this model, a lexicon is introduced to encode the characters in the text and obtain a context-sensitive global semantic representation of the text. A convolutional neural network (CNN) and a gate joined collaborative attention network form a local feature extraction module that captures the important semantic features of local text. Experiments were conducted on two TCM domain datasets, and the F1 values are 91.13% and 90.21%, respectively.

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Uncovering the myth of learning to read Chinese characters: phonetic, semantic, and orthographic strategies used by Chinese as foreign language learners

Authors: Tong SX; Yip J.
Publication venue: 'Surface Analysis Society of Japan'
Publication date: 01/01/2012

Oral Session - 6A: Lexical modeling: no. 6A.3

Chinese is considered one of the most challenging orthographies for non-native speakers to learn, in particular its characters. The Chinese character is the basic reading unit, in which sound, form, and meaning converge. The predominant type of Chinese character is the semantic-phonetic compound, composed of phonetic and semantic radicals that give clues to the sound and meaning, respectively. Over the last two decades, psycholinguistic research has made significant progress in specifying the roles of phonetic and semantic radicals in character processing among native Chinese speakers …

HKU Scholars Hub

(Dis)connections between specific language impairment and dyslexia in Chinese

Authors: Au TKF; Ho CSH; Kidd JC; Lam CCC; Wong AMY; Yip LPW
Publication venue: 'Surface Analysis Society of Japan'
Publication date: 01/01/2012

Poster Session: no. 26P.40

Specific language impairment (SLI) and dyslexia describe language-learning impairments that occur in the absence of a sensory, cognitive, or psychosocial impairment. SLI is primarily defined by an impairment in oral language, and dyslexia by a deficit in the reading of written words. SLI and dyslexia co-occur in school-age children learning English, with rates ranging from 17% to 75%. For children learning Chinese, SLI and dyslexia also co-occur. Wong et al. (2010) first reported on the presence of dyslexia in a clinical sample of 6- to 11-year-old school-age children with SLI. The study compared the reading-related cognitive skills of children with SLI and dyslexia (SLI-D) with 2 groups of children …

HKU Scholars Hub