113 research outputs found

    Creating the Open Wordnet Bahasa

    An Attempt to Create an Automatic Scoring Tool of Short Text Answer in Bahasa Indonesia

    Closed questions offer little information about a student's ability to manage and apply knowledge. Open questions, by contrast, can reveal students' conceptual maturity and communication ability. However, scoring answers to open questions is non-trivial and time-consuming, so an automatic scoring tool becomes necessary. This work attempts to create a scoring tool for open, short-text answers in Bahasa Indonesia that resembles the way school teachers score. A student answer is scored automatically based on its similarity to predefined key answers. The scores produced by the proposed tool show a degree of correlation with human scoring, so the model may be used to predict teacher scoring.
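
The similarity-based scoring described in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure or score scale the authors used, so the bag-of-words cosine similarity, the `max_score` parameter, and the function names below are all assumptions made for illustration, not the paper's method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the term-count vectors of two texts."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def score_answer(answer, key_answers, max_score=10.0):
    """Score an answer by its best match among the key answers.

    Assumption: the score is the highest similarity scaled to max_score.
    """
    best = max((cosine_similarity(answer, k) for k in key_answers), default=0.0)
    return best * max_score
```

An answer identical to one of the key answers receives approximately `max_score`, and an answer sharing no tokens with any key answer receives 0. A real system would likely add stemming and stop-word handling for Bahasa Indonesia.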

    Generating a Malay sentiment lexicon based on wordnet

    A sentiment lexicon is a vocabulary list of positive and negative words. In opinion mining, the sentiment lexicon is an important resource for the text polarity classification task in sentiment analysis models. Studies of Malay sentiment analysis are increasing as the volume of sentiment data on social media grows, so demand for a Malay sentiment lexicon is high. However, developing one is difficult because Malay language resources are scarce, and various approaches and techniques are therefore used to generate sentiment lexicons. The objective of this paper is to develop a WordNet-based algorithm for generating a Malay sentiment lexicon. The method maps WordNet Bahasa to the English WordNet to obtain the offset values of a seed set of sentiment words. The seed set is then used to generate further words through the synonym and antonym semantic relations in the English WordNet. The best result achieves 86.58% agreement with human annotators and a 91.31% F1-measure in word polarity classification, which shows the effectiveness of the proposed algorithm for generating a Malay sentiment lexicon based on WordNet.

    Automatically generating a sentiment lexicon for the Malay language

    This paper proposes an automated sentiment lexicon generation model specifically designed for the Malay language. Lexicon-based Sentiment Analysis (SA) models rely on a sentiment lexicon, a linguistic resource that contains a priori information about the sentiment properties of words. A sentiment lexicon is an indispensable resource for SA tasks, as is evident from the large volume of research on sentiment lexicon generation algorithms. This is not the case for low-resource languages such as Malay, for which research in this area is lacking. This gap motivates the proposal of a sentiment lexicon generation algorithm for Malay. WordNet Bahasa was first mapped onto the English WordNet to construct a multilingual word network. A seed set of prototypical positive and negative terms was then automatically expanded by recursively adding terms linked via WordNet’s synonymy and antonymy semantic relations. The underlying intuition is that the sentiment properties of terms newly added via these relations are preserved. A supervised classifier was employed for the word-polarity tagging task, with textual representations of the expanded seed set as features. Evaluation of the model against the General Inquirer lexicon as a benchmark demonstrates that it performs with reasonable accuracy. The paper aims to provide a foundation for further research on the Malay language in this area.
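
The recursive seed expansion described in this abstract can be sketched as a breadth-first walk over synonym and antonym links. The sketch below uses a tiny hand-made word graph rather than WordNet Bahasa itself. The dictionaries `SYNONYMS` and `ANTONYMS`, the `max_depth` cut-off, and the rule that antonyms receive the opposite polarity are assumptions for illustration. The supervised classification step is not shown.

```python
from collections import deque

# Hypothetical toy relations; a real system would read these from WordNet.
SYNONYMS = {"baik": ["bagus", "elok"], "buruk": ["teruk"]}
ANTONYMS = {"baik": ["buruk"], "bagus": ["teruk"]}


def expand_seeds(seeds, synonyms, antonyms, max_depth=2):
    """Expand a {word: polarity} seed set through synonym/antonym links.

    Polarity is +1 or -1. Synonyms inherit the polarity of the source word;
    antonyms get the opposite polarity (an assumption of this sketch).
    The first label assigned to a word is kept.
    """
    polarity = dict(seeds)
    queue = deque((word, 0) for word in seeds)
    while queue:
        word, depth = queue.popleft()
        if depth >= max_depth:
            continue
        neighbours = [(s, polarity[word]) for s in synonyms.get(word, ())]
        neighbours += [(a, -polarity[word]) for a in antonyms.get(word, ())]
        for term, label in neighbours:
            if term not in polarity:
                polarity[term] = label
                queue.append((term, depth + 1))
    return polarity


lexicon = expand_seeds({"baik": 1}, SYNONYMS, ANTONYMS)
```

Limiting the depth is one simple way to curb the drift in meaning that tends to appear as expansion moves further from the seeds.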

    Automatic Text Summarization Based on Semantic Networks and Corpus Statistics

    One simple automatic text summarization method that can minimize redundancy in a summary is the Maximum Marginal Relevance (MMR) method. A disadvantage of MMR is that the resulting summary can contain parts that are separate from one another and not semantically connected. Therefore, this study compares the summaries produced by semantic-based and non-semantic-based MMR. The semantic-based MMR method uses WordNet Bahasa and a corpus to process text summaries, while the non-semantic MMR method is based on TF-IDF. The study also compressed summaries to 30%, 20%, and 10%. The research data are 50 online news texts, and the summaries were evaluated using the ROUGE toolkit. The best f-score of the semantic-based MMR method is 0.561, while the best f-score of the non-semantic MMR method is 0.598. This result was obtained with stemming added as a preprocessing step and with 30% summary compression. The difference is attributed to the incompleteness of WordNet Bahasa and to several words in the news titles that do not conform to EYD (KBBI).
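
The MMR selection step mentioned in this abstract can be sketched as a greedy loop that trades relevance to a query against redundancy with sentences already chosen. This is a simplified illustration, not the study's implementation: it uses plain term-count cosine similarity in place of TF-IDF or WordNet-based similarity, and the `lam` weight of 0.7 and the function names are assumptions.

```python
import math
import re
from collections import Counter


def bow(text):
    """Bag-of-words term counts for a text."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def mmr_summary(sentences, query, k=2, lam=0.7):
    """Greedily pick up to k sentences by Maximum Marginal Relevance.

    Each step maximises lam * sim(sentence, query)
    minus (1 - lam) * max sim(sentence, already selected).
    Selected sentences are returned in their original order.
    """
    vectors = [bow(s) for s in sentences]
    q = bow(query)
    selected = []
    candidates = list(range(len(sentences)))

    def mmr_score(i):
        relevance = cosine(vectors[i], q)
        redundancy = max(
            (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return [sentences[i] for i in sorted(selected)]
```

With two identical sentences and one distinct sentence, the redundancy term steers the second pick away from the duplicate, which is the behaviour MMR is designed to produce.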

    Lexical Diversity in Kinship Across Languages and Dialects

    Languages are known to describe the world in diverse ways. Across lexicons, diversity is pervasive, appearing through phenomena such as lexical gaps and untranslatability. However, in computational resources, such as multilingual lexical databases, diversity is hardly ever represented. In this paper, we introduce a method to enrich computational lexicons with content relating to linguistic diversity. The method is verified through two large-scale case studies on kinship terminology, a domain known to be diverse across languages and cultures: one case study deals with seven Arabic dialects, while the other one with three Indonesian languages. Our results, made available as browseable and downloadable computational resources, extend prior linguistics research on kinship terminology, and provide insight into the extent of diversity even within linguistically and culturally close communities

    Semantic-based Ontology for Malay Qur'an Reader

    The Qur’an has been translated into various languages around the world by Muslim experts, Malay among them. Numerous applications have been built to facilitate the retrieval of knowledge from the Malay Qur’an. However, few resources and tools are available or accessible for research on the Malay Qur’an. Furthermore, several issues need to be considered when dealing with Malay Qur’an translation, such as ambiguous words, the lack of equivalent words between Malay and English or Malay and Arabic, and the different structures of words, sentences, and discourse in these languages. Therefore, this research summarizes the search techniques used in existing research on the Qur’an. It also reviews previous research on Qur’an Semantic Search and Qur’an Ontology-Based Search, focusing on the Malay Qur’an. This review helps the research address the general problems and limitations of the Malay Qur’an that influence its accessibility. The research proposes a framework for a new semantic-based ontology for the Malay Qur’an. The final outcome will be an accessible tool that can help a Malay reader understand the Qur’an better.