Towards Computational Linguistics in Minangkabau Language: Studies on Sentiment Analysis and Machine Translation
Although some linguists (Rusmali et al., 1985; Crouch, 2009) have made fair
attempts to define the morphology and syntax of Minangkabau, information
processing for this language is still absent due to the scarcity of
annotated resources. In this work, we release two Minangkabau corpora, one for
sentiment analysis and one for machine translation, harvested and constructed from
Twitter and Wikipedia. We conduct the first computational linguistics study of the
Minangkabau language, employing classic machine learning and
sequence-to-sequence models such as LSTM and Transformer. Our first experiments
show that classification performance on Minangkabau text drops significantly
when tested with a model trained on Indonesian. In the machine
translation experiment, by contrast, a simple word-to-word translation using a bilingual
dictionary outperforms the LSTM and Transformer models in terms of BLEU score.
Comment: Accepted at PACLIC 2020 - The 34th Pacific Asia Conference on
Language, Information and Computation
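
The word-to-word dictionary baseline mentioned in the abstract can be pictured with a minimal sketch. It assumes whitespace tokenization, lowercasing, and copying unknown words through unchanged; none of these details are stated in the abstract, and the function name and the four dictionary entries are illustrative only, not taken from the released corpora.

```python
def translate_word_by_word(sentence, dictionary):
    """Replace each token with its dictionary entry; copy unknown tokens through."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokens = sentence.lower().split()
    return " ".join(dictionary.get(token, token) for token in tokens)


# Toy Minangkabau -> Indonesian entries, for illustration only.
toy_min_to_id = {"ambo": "saya", "pai": "pergi", "ka": "ke", "pasa": "pasar"}

print(translate_word_by_word("Ambo pai ka pasa", toy_min_to_id))
```

A lookup like this keeps word order fixed, so it is likely to work best when the two languages share most of their syntax and differ mainly in vocabulary.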