Topic Extraction from a Dataset Using Topic Modeling Techniques

Abstract

This research addresses the limited understanding of the main topics, and how they change, in speeches and media publications related to President Joko Widodo. The study aims to identify, analyze, and predict changes in key topics within speeches, statements, and media publications related to the President using Latent Dirichlet Allocation (LDA) topic modeling. The approach is quantitative. Documents were scraped from the official website of the Republic of Indonesia's Secretariat, yielding 5,988 speech transcripts dated from October 20, 2014, to March 2, 2024. Text preprocessing involved tokenization, stopword removal, and stemming/lemmatization, followed by construction of a term dictionary. Coherence was highest for the model with k=16 topics (0.554), while the best perplexity was obtained at k=21 (-13.130). The main topics identified include Nationalism and National Values, Regional Government, and Education and Children. Topic visualization with pyLDAvis aids the exploration and identification of topics, providing insights for decision-making and policy development. To better understand how topics change, a trend analysis of the key topics over time is recommended. Such an analysis would help show how President Joko Widodo's priorities shift in response to new issues, and monitoring these trends could provide deeper insight into the evolution of policies and the President's focus.
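The preprocessing and dictionary-building steps described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the stopword list, the two example sentences, and the helper names (`preprocess`, `build_dictionary`, `to_bow`) are hypothetical, and stemming/lemmatization is omitted because it would require an external Indonesian language library.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list for illustration only; a real pipeline
# would use a full Indonesian stopword list.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu", "untuk"}

def preprocess(text):
    """Lowercase, tokenize on alphabetic runs, and drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]

def build_dictionary(tokenized_docs):
    """Map each distinct term to an integer id, in order of first appearance."""
    term_to_id = {}
    for doc in tokenized_docs:
        for term in doc:
            term_to_id.setdefault(term, len(term_to_id))
    return term_to_id

def to_bow(tokens, term_to_id):
    """Convert a token list to sorted (term_id, count) pairs."""
    counts = Counter(term_to_id[t] for t in tokens if t in term_to_id)
    return sorted(counts.items())

# Two made-up sentences standing in for speech transcripts (assumption).
docs = [
    "Pendidikan untuk anak dan pendidikan untuk guru",
    "Pemerintah daerah dan pendidikan",
]
tokenized = [preprocess(d) for d in docs]
dictionary = build_dictionary(tokenized)
corpus = [to_bow(doc, dictionary) for doc in tokenized]
```

The resulting `corpus` of (term id, count) pairs is the kind of bag-of-words input that an LDA implementation typically consumes. Candidate values of k would then be compared by fitting one model per k and scoring each for coherence and perplexity.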
