456 research outputs found

    25 years development of knowledge graph theory: the results and the challenge

    The knowledge graph theory project began in 1982. In its initial stage, the goal was to use graphs to represent knowledge in the form of an expert system. By the end of the 1980s, expert systems in medical and social science had been developed successfully using knowledge graph theory. In the following stage, the goal of the project was broadened to representing natural language by knowledge graphs. Since then, the theory can be considered one of the methods for natural language processing. Knowledge graph representation has now been proven to be language independent: the theory can be applied to represent almost any characteristic feature of various languages. The objective of the paper is to summarize the results of 25 years of development of knowledge graph theory and to point out some challenges to be addressed in the next stage of its development. The paper also highlights the differences between this theory and others, such as the conceptual graphs presented by Sowa in 1984, formal concept analysis by Wille, and semantic networks

    ANNOTATION MODEL FOR LOANWORDS IN INDONESIAN CORPUS: A LOCAL GRAMMAR FRAMEWORK

    Indonesian contains a considerable number of loanwords, as it has been, and continues to be, in contact with other languages. This contact takes place through different media, one of which is the machine-readable medium. As information in different languages can now be obtained with a mouse click, the contact is becoming more and more intense. This paper proposes an annotation model and a lexical resource for loanwords in Indonesian. The lexical resource is applied to a corpus using corpus processing software called UNITEX, which works under the local grammar framework

    Chatbot in Bahasa Indonesia using NLP to Provide Banking Information

    Companies mostly provide FAQs on their websites to inform customers about their services and products. However, FAQs are usually not very interactive and present too much information to be practical. A chatbot can be used as an alternative way of providing FAQs. In this study, a chatbot was developed for BTPN to provide information about its product Jenius. The chatbot uses natural language processing so that the system can understand user queries expressed in natural language. The cosine similarity algorithm is used to measure the similarity between queries and patterns in the knowledge base; the pattern with the highest cosine value is considered the most similar to the user query. However, this algorithm does not take sentence structure into account, so a sentence-structure check using the parse tree is added to weight the patterns. The chatbot application was tested by 10 users, and the suitability of the answers to user input was found to be 84%. The chatbot can therefore be used by BTPN to provide Jenius product information to consumers more interactively and practically
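
The cosine-similarity matching step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes plain whitespace tokenization and raw term counts, the function names `cosine_similarity` and `best_pattern` and the sample patterns are hypothetical, and the parse-tree weighting mentioned in the abstract is left out.

```python
import math
from collections import Counter


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between the bag-of-words count vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    # Dot product over the tokens the two strings share.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def best_pattern(query: str, patterns: list[str]) -> str:
    """Return the knowledge-base pattern with the highest cosine value."""
    return max(patterns, key=lambda p: cosine_similarity(query, p))


# Hypothetical knowledge-base patterns, for illustration only.
patterns = ["how to open an account", "how to block a card"]
print(best_pattern("how do I block my card", patterns))
```

Because the vectors only count words, two sentences with the same words in a different order score identically, which is the limitation the abstract addresses with the parse-tree check.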

    Wh-copying, phases, and successive cyclicity


    ANNOTATED DISJUNCT FOR MACHINE TRANSLATION

    Most information on the Internet is available in English, yet most people in the world are not English speakers. A reliable Machine Translation (MT) tool would therefore be of great benefit to them. There are many approaches to developing MT systems, including direct, rule-based/transfer, interlingua, and statistical approaches. This thesis focuses on developing MT for less-resourced languages, i.e. languages that lack an available grammar formalism, parser, and corpus, such as some languages in South East Asia. The nonexistence of bilingual corpora motivates the use of direct or transfer approaches. Moreover, the unavailability of a grammar formalism and parser for the target languages motivates a hybrid between the direct and transfer approaches, referred to as a hybrid transfer approach. This approach uses the Annotated Disjunct (ADJ) method. Based on the Link Grammar (LG) formalism, the method can theoretically handle one-to-one, many-to-one, and many-to-many word translations. It consists of a transfer rules module that maps source words in a source sentence (SS) to target words in the correct position in a target sentence (TS). The developed transfer rules are demonstrated on English → Indonesian translation tasks. An experimental evaluation compares the performance of the developed system with available English-Indonesian MT systems. The developed ADJ-based MT system translated simple, compound, and complex English sentences in present, present continuous, present perfect, past, past perfect, and future tenses with better precision than other systems, with an accuracy of 71.17% in the Subjective Sentence Error Rate metric

    Parsing the semantic structure of Indonesian math word problems using a recursive neural network (Parsing struktur semantik soal cerita matematika berbahasa indonesia menggunakan recursive neural network)

    Math word problems play an important role in the development of artificial intelligence, because solving them involves developing a system that can understand natural language. Designing a system for solving math word problems requires a mechanism for decomposing the problem text into text segments to be translated into math operations. The segments are determined by parsing the semantic structure of the word problem, so that the meaning of each segment refers to a math operation. A number of currently proposed methods are suited to English math word problems and have not been applied to Indonesian math word problems. As a result, the segments produced do not necessarily yield a sequence of operations that matches the meaning of the story. This study proposes the use of a Recursive Neural Network (RNN) as a parser of the semantic structure of Indonesian math word problems. The parser was tested on math word problems taken from the elementary school (SD) Electronic School Books (Buku Sekolah Elektronik, BSE) published by the Book Center of the Ministry of Education and Culture. The tests showed a final accuracy of 86.4%

    Systematic Literature Review on Ontology-based Indonesian Question Answering System

    Question-Answering (QA) systems, at the intersection of natural language processing, information retrieval, and knowledge representation, aim to provide efficient responses to natural language queries. These systems have seen extensive development in English, while languages like Indonesian present unique challenges and opportunities. This literature review paper delves into the state of ontology-based Indonesian QA systems, highlighting critical challenges. The first challenge lies in sentence understanding, variations, and complexity. Most systems rely on syntactic analysis and struggle to grasp sentence semantics. Complex sentences, especially in Indonesian, pose difficulties in parsing, semantic interpretation, and knowledge extraction. Addressing these linguistic intricacies is pivotal for accurate responses. Secondly, template-based SPARQL query construction, commonly used in Indonesian QA systems, suffers from semantic gaps and inflexibility. Advanced techniques like semantic matching algorithms and dynamic template generation can bridge these gaps and adapt to evolving ontologies. Thirdly, lexical gaps and ambiguity hinder QA systems. Bridging vocabulary mismatches between user queries and ontology labels remains a challenge. Strategies like synonym expansion, word embedding, and ontology enrichment must be explored further to overcome these challenges. Lastly, the review discusses the potential of developing multi-domain ontologies to broaden the knowledge coverage of QA systems. While this presents complex linguistic and ontological challenges, it offers the advantage of responding to a range of user queries across domains. This literature review identifies crucial challenges in developing ontology-based Indonesian QA systems and suggests innovative approaches to address these challenges
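
To make the template-based SPARQL construction mentioned in this abstract concrete, here is a minimal sketch. It is not taken from any reviewed system: the single question pattern, the `ex:` prefix, the use of the matched word as a property name, and the function name `build_query` are all assumptions for illustration, and prefix declarations and string escaping are omitted.

```python
import re
from typing import Optional

# One hypothetical template: "Siapa <property> <entity>?" ("Who is the <property> of <entity>?").
TEMPLATES = [
    (
        re.compile(r"^siapa (?P<prop>\w+) (?P<entity>.+?)\??$", re.IGNORECASE),
        'SELECT ?x WHERE {{ ?s rdfs:label "{entity}" . ?s ex:{prop} ?x . }}',
    ),
]


def build_query(question: str) -> Optional[str]:
    """Fill the first matching template; return None when no template fits."""
    for pattern, template in TEMPLATES:
        match = pattern.match(question.strip())
        if match:
            return template.format(**match.groupdict())
    return None


print(build_query("Siapa penulis Laskar Pelangi?"))
print(build_query("Kapan Laskar Pelangi terbit?"))  # no template fits
```

The second call returns `None`, and the first only works if the ontology happens to name the property exactly as the user did; these are small instances of the inflexibility and lexical gap the review describes.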

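    Comparison of Pre-trained Word Embedding and Embedding Layer for Indonesian Named-Entity Recognition (Perbandingan Pre-trained Word Embedding dan Embedding Layer untuk Named-Entity Recognition Bahasa Indonesia)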

    Named-Entity Recognition (NER) is used to extract information from text by identifying entities such as person names, organizations, locations, times, and other entities. Recently, machine learning approaches, particularly deep learning, have been widely used to recognize patterns of entities in sentences. Embedding, a process that converts text data into a number or vector of numbers, translates high-dimensional vectors into a relatively low-dimensional space. Embeddings make it easier to do machine learning on large inputs like sparse vectors representing words. The embedding process can be performed using a supervised learning method, which requires a large number of labeled data sets, or an unsupervised learning approach. This study compares two embedding methods: a trainable embedding layer (supervised learning) and pre-trained word embedding (unsupervised learning). The trainable embedding layer uses the embedding layer provided by the Keras library, while pre-trained word embedding uses word2vec, GloVe, and fastText, to build NER using the BiLSTM architecture. The results show that GloVe performed better than the other embedding techniques, with a micro-average F1 score of 76.48
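
For readers unfamiliar with the metric reported above, a micro-average F1 score pools true positives, false positives, and false negatives over all entity types before computing precision and recall. The sketch below shows that calculation; the function name `micro_f1` and the counts are made up for illustration and are not the study's data.

```python
def micro_f1(counts: dict[str, tuple[int, int, int]]) -> float:
    """Micro-average F1 from per-type (true positive, false positive, false negative) counts."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative counts per entity type (not from the paper).
example = {"PER": (8, 2, 1), "LOC": (2, 0, 3)}
print(round(micro_f1(example), 4))
```

Pooling the counts means frequent entity types dominate the score, unlike a macro average, which weights every type equally.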

    ERRORS BY AUTO-MORPHOLOGICAL ANALYSIS IN A CHILDREN STORY CORPUS: AN EVALUATION OF MORPHIND PROGRAM

    The Indonesian morphological tool Morphind is meant to produce a proper morphological analysis before further automatic language processing. Morphind is applied to enrich raw Indonesian text with morphological information, as the preprocessing stage of an Indonesian corpus. In this study, the data are obtained from children's stories on the website ceritaanak.org by taking 500 types out of a total of 2,101 types. The purpose of this study is to identify and classify the types of errors present in data processed with the Morphind program. In the analysis, I use the introspective method and the Indonesian dictionary (KBBI) to validate the results. The findings suggest that many aspects of Morphind can still be improved. The recommendations are to fix the database, especially for OOV (out-of-vocabulary) words and dictionary accuracy, to improve the display of allomorphs, and to improve the algorithm for morpheme segmentation