DESIGNING WRITING TEST
Language testing is a field of study concerned with assessing a person's proficiency in
a language across four basic competencies: listening, speaking, reading, and writing.
Assessment of these four competencies determines the level of a person's ability in a
specific language.
Writing is one of the language skills regarded as essential in language proficiency
testing. In designing a writing task, one needs to carry out three steps: defining the
task, exploring the expectations for the task, and providing support and explanatory
materials.
UNIVERSAL NASAL ASSIMILATIONS IN MONOMORPHEMIC AND POLYMORPHEMIC WORDS ACROSS LANGUAGES
There are universal aspects in language. Phonology, as the most universal language component, has many universal aspects, including nasal assimilation. Nasal assimilation is the systematic appearance of certain nasals instead of others depending on the context in monomorphemic or polymorphemic words. The nasal /n/ occurs adjacent to alveolar consonants, the nasal /m/ to labials, the nasal /ŋ/ to velars, and the nasal /ɲ/ to palatals. Nasal assimilation is mostly regressive. Regressive nasal assimilation tends to occur in monomorphemic and polymorphemic words, whereas progressive assimilation tends to occur in a specific phrase structure. This phenomenon can be found across languages.
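As a rough illustration of the regressive pattern described in this abstract, the sketch below replaces an underspecified nasal with the nasal that shares the place of articulation of the following consonant. The consonant inventory, the function name `assimilate_nasal`, and the placeholder symbol `N` are assumptions made for the example and are not taken from the paper; the velar nasal is written with the IPA symbol ŋ.

```python
# Hypothetical consonant classes; a real analysis would use a
# language-specific inventory.
PLACE_OF = {
    "p": "labial", "b": "labial",
    "t": "alveolar", "d": "alveolar", "s": "alveolar",
    "k": "velar", "g": "velar",
    "c": "palatal", "ɟ": "palatal",
}

# Nasal that agrees with each place of articulation.
NASAL_FOR = {"labial": "m", "alveolar": "n", "velar": "ŋ", "palatal": "ɲ"}


def assimilate_nasal(segments, placeholder="N"):
    """Regressive assimilation: each placeholder nasal takes the place
    of articulation of the consonant that immediately follows it."""
    out = list(segments)
    for i, seg in enumerate(out):
        if seg == placeholder and i + 1 < len(out):
            place = PLACE_OF.get(out[i + 1])
            if place is not None:
                out[i] = NASAL_FOR[place]
    return out


print("".join(assimilate_nasal(list("aNka"))))  # aŋka
```

A placeholder that is not followed by a listed consonant is left unchanged, so the sketch makes no claim about word-final or prevocalic nasals.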
An improved neural network model for joint POS tagging and dependency parsing
We propose a novel neural network model for joint part-of-speech (POS)
tagging and dependency parsing. Our model extends the well-known BIST
graph-based dependency parser (Kiperwasser and Goldberg, 2016) by incorporating
a BiLSTM-based tagging component to produce automatically predicted POS tags
for the parser. On the benchmark English Penn treebank, our model obtains
strong UAS and LAS scores at 94.51% and 92.87%, respectively, producing 1.5+%
absolute improvements to the BIST graph-based parser, and also obtaining a
state-of-the-art POS tagging accuracy at 97.97%. Furthermore, experimental
results on parsing 61 "big" Universal Dependencies treebanks from raw texts
show that our model outperforms the baseline UDPipe (Straka and Straková,
2017) with 0.8% higher average POS tagging score and 3.6% higher average LAS
score. In addition, with our model, we also obtain state-of-the-art downstream
task scores for biomedical event extraction and opinion analysis applications.
Our code is available together with all pre-trained models at:
https://github.com/datquocnguyen/jPTDP
Comment: 11 pages; In Proceedings of the CoNLL 2018 Shared Task: Multilingual
Parsing from Raw Text to Universal Dependencies, to appea
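For readers unfamiliar with the two attachment scores quoted in this abstract, the sketch below computes them under their usual definitions: UAS is the share of tokens whose predicted head is correct, and LAS additionally requires the correct dependency label. The function name `attachment_scores` and the toy sentence are illustrative assumptions and are unrelated to the jPTDP code; evaluation conventions such as punctuation handling vary between benchmarks and are ignored here.

```python
def attachment_scores(gold, pred):
    """Return (UAS, LAS) for two equal-length lists of (head, label) pairs,
    one pair per token."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    # UAS counts tokens whose head index matches.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS counts tokens whose head index and label both match.
    full_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return head_hits / n, full_hits / n


# Toy four-token sentence (hypothetical annotations).
gold = [(2, "nsubj"), (0, "root"), (4, "det"), (2, "obj")]
pred = [(2, "nsubj"), (0, "root"), (2, "det"), (2, "iobj")]

uas, las = attachment_scores(gold, pred)
print(uas, las)  # 0.75 0.5
```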
THE ADAPTATION PROCESS IN THE BORROWING OF ENGLISH VOCABULARY INTO INDONESIAN: A MORPHOLOGICAL STUDY
The vocabulary of Indonesian has been enriched by loanwords from various foreign
languages, for example English, German, Dutch, French, and Arabic. In this paper the
author discusses a number of loanwords borrowed from English into Indonesian. One way
in which English vocabulary is borrowed into Indonesian is the process of adaptation.
Adaptation occurs when language users take over only the meaning of the foreign word,
here specifically an English word, while its spelling is adjusted to the rules of
Indonesian orthography. The English words thus undergo a change in spelling from the
source language.
The author obtained the data on English loanwords in Indonesian from the adaptation
guidelines, namely Pedoman Penulisan Istilah dan Ejaan Bahasa Indonesia yang
Disempurnakan, issued by Pusat Bahasa, Departemen Pendidikan Nasional. The method used
is a literature study: the author collected data on English loanwords in Indonesian and
then analysed them in terms of the adaptation process. From this analysis, the author
concludes that there is a relationship between the adaptation process and morphology,
in which morphological factors play an important role in adaptation. This can be seen
from the results of adapting the borrowed English vocabulary, which are identified as
phoneme changes, monophthongization, and changes in affixation.
Does Quartile Matter? Investigating syntactic complexity of international publication
With the challenge of international publication in view, this study compares the syntactic complexity of Indonesian scholars' publications. The analysis covers 21 journal articles in two groups: articles from quartile-ranked journals and articles from journals without a quartile ranking. Across 14 syntactic complexity measures, the articles from quartile-ranked journals have higher mean scores than those from non-quartile journals. However, significant differences occur in only three groups of measures: Length of Production Unit, Coordination, and Degree of Phrasal Sophistication. The findings may indicate a performance gap between the groups, as the syntactic constructions in the quartile-ranked articles surpass those in the non-quartile articles.
Label Pre-annotation for Building Non-projective Dependency Treebanks for French
The current interest in accurate dependency parsing makes it necessary to build dependency treebanks for French containing both projective and non-projective dependencies. To lighten the annotator's work, we propose to automatically pre-annotate the sentences with the labels of the dependencies ending on the words. Selecting the dependency labels reduces the ambiguity of the parsing. We show that a maximum entropy Markov model method reaches the label accuracy score of a standard dependency parser (MaltParser). Moreover, this method makes it possible to propose more than one label per word, i.e. the most probable ones, in order to improve the recall score. It improves the quality of the parsing step of the annotation process. Including the method in the annotation process therefore makes the work quicker and more natural for annotators.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
Indonesian and Malay are underrepresented in the development of natural language processing (NLP) technologies, and available resources are difficult to find. A clear picture of existing work can invigorate and inform how researchers conceptualise worthwhile projects. Using an education sector project to motivate the study, we conducted a wide-ranging overview of Indonesian and Malay human language technologies and corpus work. We charted 657 included studies according to Hirschberg and Manning's 2015 description of NLP, concluding that the field was dominated by exploratory corpus work, machine reading of text gathered from the Internet, and sentiment analysis. In this paper, we identify the most published authors and research hubs, and make a number of recommendations to encourage future collaboration and efficiency within NLP in Indonesian and Malay.