
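    Analisis Topik dan Author Publikasi pada Repositori Open Access Journal of Information Systems dengan Menggunakan Metode Author-Topic Models (Topic and Author Analysis of Publications in the Open Access Journal of Information Systems Repository Using the Author-Topic Models Method)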

    Open Access Journal of Information Systems (OAJIS) is an open repository provided by the Information Systems Department of Institut Teknologi Sepuluh Nopember (ITS) for information systems researchers worldwide. Reviewers for incoming publications are still assigned manually, by comparing each reviewer's own publications with the submission to be evaluated. This manual process sometimes produces a poor match, which affects the evaluation of research submitted to OAJIS. Author-Topic Models is a method that extends LDA by integrating information about a document's authors into topic modeling. It can be used to track topic trends over time, to identify a document's topics based on its authors, and to predict topics and authors for new documents that are not in the existing collection. Given the situation at OAJIS, we analyzed the topics and authors of publications in the repository using Author-Topic Models, to make reviewer assignment easier for the repository's managers. The topic modeling experiments showed that data processed under the stemming scenario produced the best model. To relate the model to reviewer assignment, it was tested using probability vector similarity. The tests showed that the model built with the stemming scenario and 50 topics gave more accurate predictions, with an average perplexity of 128.71 and a topic coherence of -1.528.
    Keywords: Author-Topic Models, OAJIS, Topic Modeling, Probabilistic Vector Similarity
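    The abstract does not say which similarity measure was applied to the probability vectors. As a minimal sketch only, the example below assumes a similarity defined as one minus the Hellinger distance, and ranks candidate reviewers by how closely their topic distributions match that of a new document. The function names, reviewer names, and three-topic vectors are hypothetical illustrations, not values from the study.

```python
import math


def hellinger_similarity(p, q):
    """Return 1 - Hellinger distance between two probability vectors.

    Assumption: the source only says "probability vector similarity";
    Hellinger distance is one common choice, not necessarily the one used.
    """
    if len(p) != len(q):
        raise ValueError("vectors must have the same number of topics")
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return 1.0 - math.sqrt(squared / 2.0)


def rank_reviewers(doc_topics, reviewer_topics):
    """Sort reviewers from most to least similar to the document."""
    scores = {
        name: hellinger_similarity(doc_topics, vec)
        for name, vec in reviewer_topics.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical topic distributions over three topics (each sums to 1).
new_document = [0.7, 0.2, 0.1]
candidates = {
    "reviewer_a": [0.6, 0.3, 0.1],
    "reviewer_b": [0.1, 0.1, 0.8],
}

for name, score in rank_reviewers(new_document, candidates):
    print(f"{name}: {score:.3f}")
```

    In this toy data, reviewer_a ranks first because that topic distribution lies closer to the document's. With a trained model, the vectors would come from the model's per-author and per-document topic distributions rather than being written by hand.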

    From Masked Language Modeling to Translation: Non-English Auxiliary Tasks Improve Zero-shot Spoken Language Understanding

    The lack of publicly available evaluation data for low-resource languages limits progress in Spoken Language Understanding (SLU). As key tasks like intent classification and slot filling require abundant training data, it is desirable to reuse existing data in high-resource languages to develop models for low-resource scenarios. We introduce xSID, a new benchmark for cross-lingual (x) Slot and Intent Detection in 13 languages from 6 language families, including a very low-resource dialect. To tackle the challenge, we propose a joint learning approach, with English SLU training data and non-English auxiliary tasks from raw text, syntax and translation for transfer. We study two setups which differ by type and language coverage of the pre-trained embeddings. Our results show that jointly learning the main tasks with masked language modeling is effective for slots, while machine translation transfer works best for intent classification.