

    Language testing is a field of study concerned with assessing a person's proficiency in a language across four basic competencies: listening, speaking, reading, and writing. Assessment of these four competencies determines the level of a person's ability in a specific language. Writing is one of the language skills considered necessary in language proficiency testing. Designing a writing task involves three steps: defining the task, exploring the expectations for the task, and providing support and explanatory materials.


    Language has universal aspects. Phonology, as the most universal language component, has many universal aspects, including nasal assimilation. Nasal assimilation is the systematic appearance of certain nasals instead of other nasals, depending on the context, in monomorphemic or polymorphemic words. The nasal /n/ occurs in sequence with alveolar consonants, the nasal /m/ with labials, the nasal /ŋ/ with velars, and the nasal /ɲ/ with palatals. Nasal assimilation is mostly regressive. Regressive nasal assimilation tends to occur in monomorphemic and polymorphemic words, whereas progressive assimilation tends to occur in a specific phrase structure. This phenomenon can occur across languages.
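The place-of-articulation pairing described in this abstract can be illustrated with a minimal Python sketch of regressive assimilation, in which a nasal takes the place of the consonant that follows it. The consonant inventory, the function name `assimilate_regressive`, and the example segment strings are illustrative assumptions and do not come from the study; the velar nasal is written /ŋ/ here.

```python
# Illustrative consonant inventory (an assumption, not the study's data).
PLACE = {
    "p": "labial", "b": "labial",
    "t": "alveolar", "d": "alveolar",
    "c": "palatal", "j": "palatal",
    "k": "velar", "g": "velar",
}

# Nasal that matches each place of articulation.
NASAL_FOR_PLACE = {
    "labial": "m",
    "alveolar": "n",
    "palatal": "ɲ",
    "velar": "ŋ",
}
NASALS = set(NASAL_FOR_PLACE.values())


def assimilate_regressive(segments):
    """Return a copy of `segments` in which every nasal directly followed
    by a listed consonant takes that consonant's place of articulation."""
    out = list(segments)
    for i in range(len(out) - 1):
        following = out[i + 1]
        if out[i] in NASALS and following in PLACE:
            out[i] = NASAL_FOR_PLACE[PLACE[following]]
    return out
```

For example, `assimilate_regressive(list("anpa"))` changes the /n/ to /m/ before the labial /p/, while a nasal before a vowel is left unchanged.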

    An improved neural network model for joint POS tagging and dependency parsing

    We propose a novel neural network model for joint part-of-speech (POS) tagging and dependency parsing. Our model extends the well-known BIST graph-based dependency parser (Kiperwasser and Goldberg, 2016) by incorporating a BiLSTM-based tagging component to produce automatically predicted POS tags for the parser. On the benchmark English Penn treebank, our model obtains strong UAS and LAS scores of 94.51% and 92.87%, respectively, producing 1.5+% absolute improvements over the BIST graph-based parser, and also obtaining a state-of-the-art POS tagging accuracy of 97.97%. Furthermore, experimental results on parsing 61 "big" Universal Dependencies treebanks from raw texts show that our model outperforms the baseline UDPipe (Straka and Straková, 2017) with a 0.8% higher average POS tagging score and a 3.6% higher average LAS score. In addition, with our model, we also obtain state-of-the-art downstream task scores for biomedical event extraction and opinion analysis applications. Our code is available together with all pre-trained models at: https://github.com/datquocnguyen/jPTDP. Comment: 11 pages; In Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, to appear
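For readers unfamiliar with the two parsing metrics quoted in this abstract, the following is a minimal Python sketch of how unlabeled and labeled attachment scores (UAS and LAS) are commonly computed. The function name `attachment_scores` and the `(head, label)` token representation are assumptions made for illustration; this is not the authors' evaluation script, and it does not apply any punctuation filtering that a benchmark setup may use.

```python
def attachment_scores(gold, pred):
    """Compute (UAS, LAS) as percentages.

    `gold` and `pred` are equal-length lists with one (head_index, label)
    pair per token. UAS counts tokens whose head is correct; LAS counts
    tokens whose head and dependency label are both correct.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must cover the same tokens")
    if not gold:
        raise ValueError("at least one token is required")

    n = len(gold)
    correct_heads = sum(g[0] == p[0] for g, p in zip(gold, pred))
    correct_both = sum(g == p for g, p in zip(gold, pred))
    return 100.0 * correct_heads / n, 100.0 * correct_both / n


# Hypothetical four-token sentence: one wrong label, one wrong head.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punct")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (3, "punct")]
uas, las = attachment_scores(gold, pred)
```

In this toy sentence three of four heads are correct and two of four tokens have both head and label correct, so LAS can never exceed UAS.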


    The vocabulary of Indonesian is enriched by loanwords from various foreign languages, for example English, German, Dutch, French, and Arabic. In this paper, the author discusses several loanwords borrowed from English into Indonesian. One way in which English vocabulary is absorbed into Indonesian is the process of adaptation. Adaptation occurs when language users take only the meaning of the foreign word, in particular an English word, and adjust its spelling or written form to the spelling rules of Indonesian. The English words thus undergo changes in spelling relative to the source language. The author obtained the data on English loanwords in Indonesian from the adaptation guideline Pedoman Penulisan Istilah dan Ejaan Bahasa Indonesia yang Disempurnakan, issued by Pusat Bahasa, Departemen Pendidikan Nasional. The method used is a literature study: the author collected data on English loanwords in Indonesian and then analysed them in terms of the adaptation process. From this analysis, the author concludes that there is a relationship between the adaptation process and morphology, in which morphological factors play an important role in adaptation. This is evident from the adapted English loanwords, which are classified into phoneme changes, monophthongization, and changes in affixation.

    Does Quartile Matter? Investigating syntactic complexity of international publication

    Given the challenge of international publication, this study compares the syntactic complexity of Indonesian scholars' publications. The analysis covers 21 journal articles in two groups: articles from journals with a quartile ranking and articles from journals without one. Using 14 syntactic complexity measures, the results show that the articles from quartile-ranked journals have higher mean scores on the syntactic complexity measures than those from non-quartile journals. However, significant differences occur in only three groups of measures: Length of Production Unit, Coordination, and Degree of Phrasal Sophistication. The findings may indicate a performance gap between the groups, as the syntactic constructions of the quartile-ranked articles surpass those of the non-quartile articles.

    Label Pre-annotation for Building Non-projective Dependency Treebanks for French

    The current interest in accurate dependency parsing makes it necessary to build dependency treebanks for French containing both projective and non-projective dependencies. To lighten the annotator's work, we propose to automatically pre-annotate the sentences with the labels of the dependencies ending on the words. Selecting the dependency labels reduces the ambiguity of the parsing. We show that a maximum entropy Markov model method reaches the label accuracy score of a standard dependency parser (MaltParser). Moreover, this method makes it possible to find more than one label per word, i.e. the most probable ones, in order to improve the recall score. It improves the quality of the parsing step of the annotation process. Therefore, including the method in the annotation process makes the work quicker and more natural for annotators.
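The idea of keeping several probable labels per word to raise recall can be sketched in Python as follows. This is only an illustration under stated assumptions: the per-word probability distributions, the label names, and the functions `top_k_labels` and `label_recall` are hypothetical, and the sketch does not reproduce the maximum entropy Markov model described in the abstract.

```python
def top_k_labels(distribution, k):
    """Return the k most probable labels from a {label: probability} dict."""
    ranked = sorted(distribution.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


def label_recall(distributions, gold_labels, k):
    """Fraction of words whose gold label is among the k kept labels."""
    hits = sum(
        gold in top_k_labels(dist, k)
        for dist, gold in zip(distributions, gold_labels)
    )
    return hits / len(gold_labels)


# Hypothetical label distributions for three words (invented values).
distributions = [
    {"suj": 0.6, "obj": 0.3, "mod": 0.1},
    {"obj": 0.45, "mod": 0.4, "suj": 0.15},
    {"mod": 0.7, "det": 0.2, "suj": 0.1},
]
gold_labels = ["suj", "mod", "det"]

recall_top1 = label_recall(distributions, gold_labels, k=1)
recall_top2 = label_recall(distributions, gold_labels, k=2)
```

With these invented numbers, keeping only the best label recovers one gold label out of three, while keeping the two best labels recovers all three, at the cost of leaving more candidates for the annotator or parser to choose from.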

    Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop

    Indonesian and Malay are underrepresented in the development of natural language processing (NLP) technologies, and available resources are difficult to find. A clear picture of existing work can invigorate and inform how researchers conceptualise worthwhile projects. Using an education sector project to motivate the study, we conducted a wide-ranging overview of Indonesian and Malay human language technologies and corpus work. We charted 657 included studies according to Hirschberg and Manning's 2015 description of NLP, concluding that the field was dominated by exploratory corpus work, machine reading of text gathered from the Internet, and sentiment analysis. In this paper, we identify the most published authors and research hubs, and make a number of recommendations to encourage future collaboration and efficiency within NLP in Indonesian and Malay.
