
    Multi-document Summarization Based on Sentence Clustering Improved Using Topic Words

    Information in the form of news text has become one of the most important commodities of the information era. Many news items are produced every day, but they often present the same contextual content in different narratives. A method is therefore needed to gather this information into a simple summary. The subtasks involved in multi-document summarization include sentence extraction, topic detection, and the extraction of representative sentences. In this paper, we propose a new method for representing sentences based on the keywords of the text's topics, obtained with Latent Dirichlet Allocation (LDA). The method consists of three basic steps. First, we cluster the sentences in the document set using Similarity Histogram Clustering (SHC). Next, the clusters are ranked by cluster importance. Finally, representative sentences are selected using the topics identified by LDA. The proposed method was evaluated on the DUC2004 dataset. The results show average scores of 0.3419 and 0.0766 for ROUGE-1 and ROUGE-2, respectively. In addition, from the reader's perspective, the proposed method presents the representative sentences in a coherent and well-ordered arrangement, which makes the summary easier to understand and reduces the time needed to read it.
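The ROUGE-1 and ROUGE-2 figures reported above measure n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch of the recall form of ROUGE-N for a single reference, assuming whitespace tokenization and lowercasing. The function names `ngrams` and `rouge_n_recall` are illustrative, and the sketch omits the stemming, stopword handling, and multiple-reference averaging that standard evaluation tooling may apply, so it will not reproduce the paper's scores.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n):
    """Fraction of reference n-grams that also occur in the candidate.

    Matches are clipped: an n-gram counts at most as often as it
    appears in the candidate. Tokenization is a plain lowercase split
    (an assumption made for this sketch).
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(rouge_n_recall(candidate, reference, 1))  # 5 of 6 unigrams match
    print(rouge_n_recall(candidate, reference, 2))  # 3 of 5 bigrams match
```

ROUGE-2 is typically much lower than ROUGE-1, as in the reported results, because matching a bigram requires two consecutive words to agree.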

    Text Summarization Techniques: A Brief Survey

    In recent years, there has been an explosion in the amount of text data from a variety of sources. This volume of text is an invaluable source of information and knowledge which needs to be effectively summarized to be useful. In this review, the main approaches to automatic text summarization are described. We review the different processes for summarization and describe the effectiveness and shortcomings of the different methods. Comment: The format of some references has been updated.

    Query-Based Summarization using Rhetorical Structure Theory

    Research on Question Answering is focused mainly on classifying the question type and finding the answer. Presenting the answer in a way that suits the user’s needs has received little attention. This paper shows how existing question answering systems, which aim at finding precise answers to questions, can be improved by exploiting summarization techniques to extract more than just the answer from the document in which the answer resides. This is done using a graph search algorithm which searches for relevant sentences in the discourse structure, which is represented as a graph. Rhetorical Structure Theory (RST) is used to create a graph representation of a text document. The output is an extensive answer, which not only answers the question, but also gives the user an opportunity to assess the accuracy of the answer (is this what I am looking for?), and to find additional information that is related to the question and which may satisfy an information need. This has been implemented in a working multimodal question answering system where it operates with two independently developed question answering modules.
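The abstract does not specify which graph search algorithm is used, so the following is only a sketch of the general idea: starting from the sentence that contains the answer, collect the sentences that lie within a few links of it in a discourse graph. It assumes an unweighted graph stored as an adjacency dictionary and uses breadth-first search with a hop limit. The names `related_sentences`, `graph`, and `max_hops` are hypothetical and do not come from the paper.

```python
from collections import deque


def related_sentences(graph, answer, max_hops):
    """Return sentence ids within max_hops links of the answer sentence.

    graph maps a sentence id to the ids it is linked to in the
    discourse structure (an assumed, simplified representation).
    Results are ordered by distance from the answer, then by id.
    """
    distance = {answer: 0}
    queue = deque([answer])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return sorted(distance, key=lambda s: (distance[s], s))


if __name__ == "__main__":
    # Toy discourse graph: a chain s1 - s2 - s3 - s4.
    graph = {
        "s1": ["s2"],
        "s2": ["s1", "s3"],
        "s3": ["s2", "s4"],
        "s4": ["s3"],
    }
    print(related_sentences(graph, "s2", 1))  # ['s2', 's1', 's3']
```

A system built on RST would likely weight links by relation type rather than treat them all equally; the hop limit here simply stands in for whatever relevance criterion such a system applies.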