251 research outputs found

    MILDSum: A Novel Benchmark Dataset for Multilingual Summarization of Indian Legal Case Judgments

    Automatic summarization of legal case judgments is a practically important problem that has attracted substantial research efforts in many countries. In the context of the Indian judiciary, there is an additional complexity -- Indian legal case judgments are mostly written in complex English, but a significant portion of India's population lacks command of the English language. Hence, it is crucial to summarize legal documents in Indian languages to ensure equitable access to justice. While prior research primarily focuses on summarizing legal case judgments in their source languages, this study presents a pioneering effort toward cross-lingual summarization of English legal documents into Hindi, the most frequently spoken Indian language. We construct the first high-quality legal corpus comprising 3,122 case judgments from prominent Indian courts in English, along with their summaries in both English and Hindi, drafted by legal practitioners. We benchmark the performance of several diverse summarization approaches on our corpus and demonstrate the need for further research in cross-lingual summarization in the legal domain. Comment: Accepted at EMNLP 2023 (Main Conference).
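    For illustration, a minimal summarize-then-translate baseline of the kind such a cross-lingual benchmark might compare against; the pipeline design and the checkpoint names (facebook/bart-large-cnn, Helsinki-NLP/opus-mt-en-hi) are assumptions for the sketch, not the systems evaluated in the paper.

```python
from transformers import pipeline

# Placeholder checkpoints for illustration, not the systems benchmarked on MILDSum.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
translator = pipeline("translation", model="Helsinki-NLP/opus-mt-en-hi")

def summarize_to_hindi(judgment_text: str) -> str:
    # Summarize in English first, then translate the summary into Hindi.
    english_summary = summarizer(
        judgment_text, max_length=150, min_length=40, truncation=True
    )[0]["summary_text"]
    return translator(english_summary)[0]["translation_text"]
```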

    ํŠธ๋žœ์Šคํฌ๋จธ๋ฅผ ์ด์šฉํ•œ ํ‚ค์›Œ๋“œ ๋ฐ˜์˜ ์ถ”์ถœ ์š”์•ฝ์— ๋Œ€ํ•œ ์—ฐ๊ตฌ

    ํ•™์œ„๋…ผ๋ฌธ(์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ํ˜‘๋™๊ณผ์ • ๋ฐ”์ด์˜ค์—”์ง€๋‹ˆ์–ด๋ง์ „๊ณต, 2023. 2. ์ตœ์ง„์šฑ.Text summarization is well-known as a representative task in natural language processing. Text summarization methods generate brief written summaries of documents such as journal articles. In recent years, the performance of text summarization methods has improved significantly with the development of pretrained language models based on Transformer architectures such as BERT and GPT 3. Recently, the development of language models designed to generate controllable output based on user preferences has attracted considerable attention as a topic of active research. Controllable summarization methods such as query-focused or aspect-oriented summarization techniques have also emerged as promising approaches. In particular, aspect-oriented summarization generates a summary in terms of specific aspects provided as user input. In this study, we propose a method to improve the performance of an aspect-oriented extractive summarization model presented in a previous work. The proposed method helps the model to generate aspect-oriented summaries by reflecting the relevance between sentence features and keyword features representing the aspect. To evaluate the performance of the proposed method, we constructed a new dataset consisting of articles on COVID-19 labeled in terms of two aspects: Trend and Action. The results showed that our proposed method outperformed a baseline model on the new dataset. The proposed method exhibited higher performance than the baseline by roughly 3.6โ€“4.3% in terms of Trend, and showed a relatively low impact with an improvement of less than 1% in terms of Action. However, in both aspects, we observed that even incorrect sentences included in a generated summary tended to be related to the defined aspect. Thus, we demonstrate that the proposed method generated more aspect-oriented summaries with content relevant to the defined aspect.ํ…์ŠคํŠธ ์š”์•ฝ(Text Summarization)์€ ์ž์—ฐ์–ด ์ฒ˜๋ฆฌ ๋ถ„์•ผ์˜ ๋Œ€ํ‘œ์ ์ธ ์ž‘์—… ์ค‘ ํ•˜๋‚˜์ด๋‹ค. ํ…์ŠคํŠธ ์š”์•ฝ์˜ ๋ชฉ์ ์€ ์‹ ๋ฌธ ๊ธฐ์‚ฌ์™€ ๊ฐ™์€ ๋ฌธ์„œ๋ฅผ ๊ฐ„๊ฒฐํ•˜์ง€๋งŒ, ํ•ต์‹ฌ์ ์ธ ๋‚ด์šฉ์„ ์ค‘์‹ฌ์œผ๋กœ ์š”์•ฝํ•˜๋Š” ๊ฒƒ์ด๋‹ค. BERT, GPT-3์™€ ๊ฐ™์€ ํŠธ๋žœ์Šคํฌ๋จธ ๊ธฐ๋ฐ˜์˜ ์‚ฌ์ „ํ•™์Šต ๋ชจ๋ธ๋“ค์ด ๊ฐœ๋ฐœ๋จ์— ๋”ฐ๋ผ, ์š”์•ฝ ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ์ด ํฌ๊ฒŒ ํ–ฅ์ƒ๋˜์—ˆ๋‹ค. ์ตœ๊ทผ์—๋Š” ์‚ฌ์šฉ์ž์˜ ๋ชฉ์  ํ˜น์€ ์„ ํ˜ธ๋„๋ฅผ ๋ฐ˜์˜ํ•˜์—ฌ ์ถœ๋ ฅ์„ ์ƒ์„ฑํ•˜๋Š” ์–ธ์–ด ๋ชจ๋ธ์„ ๊ฐœ๋ฐœํ•˜๊ธฐ ์œ„ํ•ด ๋งŽ์€ ์—ฐ๊ตฌ๋“ค์ด ์ง„ํ–‰๋˜๊ณ  ์žˆ๋‹ค. ํ…์ŠคํŠธ ์š”์•ฝ ๋ถ„์•ผ์—์„œ๋„ ์ด๋Ÿฌํ•œ ํ๋ฆ„์— ๋”ฐ๋ผ ์ฟผ๋ฆฌ ์ค‘์‹ฌ(Query focused) ํ˜น์€ ์ธก๋ฉด ์ค‘์‹ฌ(Aspect oriented) ์š”์•ฝ๊ณผ ๊ฐ™์ด ์ œ์–ด ๊ฐ€๋Šฅํ•œ ์š”์•ฝ๋ฌธ ์ƒ์„ฑ์— ๋Œ€ํ•œ ์—ฐ๊ตฌ๋“ค์ด ๋“ฑ์žฅํ•˜๊ณ  ์žˆ๋‹ค. ์ธก๋ฉด ์ค‘์‹ฌ ์š”์•ฝ(Aspect oriented)์€ ์‚ฌ์šฉ์ž๊ฐ€ ์•Œ๊ณ  ์‹ถ์€ ํŠน์ • ์ธก๋ฉด์— ๋Œ€ํ•ด์„œ ์š”์•ฝ๋ฌธ์„ ์ƒ์„ฑํ•˜๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•œ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ์„ ํ–‰ ์—ฐ๊ตฌ์—์„œ ์ œ์•ˆํ•œ ์ธก๋ฉด ์ค‘์‹ฌ ์š”์•ฝ ๋ชจ๋ธ์˜ ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ์œ„ํ•œ ๋ฐฉ๋ฒ•์„ ์ œ์•ˆํ•œ๋‹ค. ์ œ์•ˆ๋œ ๋ฐฉ๋ฒ•์€ ๋ฌธ์žฅ์˜ ํ‘œํ˜„ ๋ฒกํ„ฐ์™€ ์ธก๋ฉด์„ ๋Œ€ํ‘œํ•˜๋Š” ํ‚ค์›Œ๋“œ ํ‘œํ˜„ ๋ฒกํ„ฐ๋“ค ์‚ฌ์ด์˜ ์—ฐ๊ด€์„ฑ์„ ๊ธฐ์กด์˜ ๋ฌธ์žฅ ํ‘œํ˜„ ๋ฒกํ„ฐ์— ๋ฐ˜์˜ํ•จ์œผ๋กœ์จ ๋ชจ๋ธ์ด ์ธก๋ฉด๊ณผ ๊ด€๋ จ๋œ ์š”์•ฝ๋ฌธ์„ ์ƒ์„ฑํ•˜๋„๋ก ํ–ˆ๋‹ค. ํ‰๊ฐ€๋ฅผ ์œ„ํ•ด์„œ, ๋ฐœ์ƒ ํ˜„ํ™ฉ๊ณผ ๊ด€๋ จ ๋Œ€์‘์ด๋ผ๋Š” ๋‘ ๊ฐ€์ง€ ์ธก๋ฉด์„ ๊ฐ€์ง€๋Š” COVID-19 ๊ด€๋ จ ๊ธฐ์‚ฌ๋กœ ๊ตฌ์„ฑ๋œ ์ƒˆ๋กœ์šด ๋ฐ์ดํ„ฐ์…‹์„ ๊ตฌ์ถ•ํ•˜์˜€๋‹ค. 
์ œ์•ˆ๋œ ๋ฐฉ๋ฒ•๋“ค์€ ์ƒˆ๋กœ์šด ๋ฐ์ดํ„ฐ์…‹์— ๋Œ€ํ•˜์—ฌ ๊ธฐ์กด ๋ชจ๋ธ๋ณด๋‹ค ๋” ์ข‹์€ ์„ฑ๋Šฅ์„ ๋ณด์—ฌ์ฃผ์—ˆ๋‹ค. ์ œ์•ˆ๋œ ๋ฐฉ๋ฒ•์€ ๋ฐœ์ƒ ํ˜„ํ™ฉ ์ธก๋ฉด์—์„œ๋Š” 3.6~4.3%๋กœ ๋†’์€ ์„ฑ๋Šฅ ํ–ฅ์ƒ์„ ๊ฐ€์ ธ์™”์œผ๋ฉฐ, ๊ด€๋ จ ๋Œ€์‘ ์ธก๋ฉด์—์„œ๋Š” 1%๋ฏธ๋งŒ์˜ ํ–ฅ์ƒ์œผ๋กœ, ๋น„๊ต์  ๋‚ฎ์€ ํšจ๊ณผ๋ฅผ ๋ณด์—ฌ์ฃผ์—ˆ๋‹ค. ํ•˜์ง€๋งŒ ๋‘ ์ธก๋ฉด ๋ชจ๋‘์—์„œ ์˜ค๋‹ต์ด๋ผ ํ•˜๋”๋ผ๋„ ์ธก๋ฉด๊ณผ ๊ด€๋ จ๋œ ๋ฌธ์žฅ์„ ์„ ํƒํ•˜๋Š” ๊ฒƒ์„ ๊ด€์ฐฐํ–ˆ๋‹ค. ์ด๋ฅผ ํ†ตํ•ด, ์ œ์•ˆ๋œ ๋ฐฉ๋ฒ•์ด ๋ชจ๋ธ์˜ ์ธก๋ฉด ์ง€ํ–ฅ ์š”์•ฝ์— ๋„์›€์„ ์ฃผ์—ˆ์Œ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ์—ˆ๋‹ค.1. Introduction 1 1.1. Background 1 1.2. Task Description 3 1.2.1. Text Summarization 3 1.2.2. Aspect Oriented Summarization 4 2. Related Works 7 2.1. Extractive Summarization 7 2.2. Aspect Oriented Summarization 10 2.3. AOSUMM 11 3. Materials and Method 13 3.1. Dataset for Training 13 3.2. Dataset for Evaluation 13 3.2.1. Aspect Definition 14 3.2.2. Annotation 15 3.3. Evaluation Metric 17 3.4. Keyword Selection 19 3.5. Method 21 3.5.1. Extraction of keywords feature 23 3.5.2. Relevance Score 25 3.5.3. Proposed Method 27 4. Results and Discussion 29 4.1. Experiment Settings 29 4.2. Results 30 4.2.1. Automatic Evaluation 30 4.2.2. Qualitative Evaluation 35 4.3. Discussion 38 5. Conclusion 41 Bibliography 43 Appendix 45 Abstract in Korean 51์„

    GreekT5: A Series of Greek Sequence-to-Sequence Models for News Summarization

    Text summarization (TS) is a natural language processing (NLP) subtask pertaining to the automatic formulation of a concise and coherent summary that covers the major concepts and topics of one or multiple documents. Recent advancements in deep learning have led to the development of abstractive, transformer-based summarization models, which outperform classical approaches. However, research in this field focuses on high-resource languages such as English, while the corresponding work for low-resource languages remains underdeveloped. Taking the above into account, this paper proposes a series of novel TS models for Greek news articles. The proposed models were thoroughly evaluated on the same dataset against GreekBART, the state-of-the-art model in Greek abstractive news summarization. Our evaluation results reveal that most of the proposed models significantly outperform GreekBART on various evaluation metrics. We make our evaluation code public, aiming to increase the reproducibility of this work and facilitate future research in the field. Comment: 26 pages, 0 figures.
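    For illustration, a minimal inference sketch for a fine-tuned T5-style news summarizer using the transformers library; the checkpoint name is a hypothetical placeholder, not necessarily one of the released GreekT5 models, and the generation settings are assumptions.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "your-org/greek-news-t5"  # hypothetical checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def summarize(article: str) -> str:
    # Encode the article, generate with beam search, and decode the summary.
    inputs = tokenizer(article, return_tensors="pt",
                       truncation=True, max_length=1024)
    output_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```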

    How Ready are Pre-trained Abstractive Models and LLMs for Legal Case Judgement Summarization?

    Automatic summarization of legal case judgements has traditionally been attempted using extractive summarization methods. However, in recent years, abstractive summarization models have gained popularity since they can generate more natural and coherent summaries. Legal domain-specific pre-trained abstractive summarization models are now available. Moreover, general-domain pre-trained Large Language Models (LLMs), such as ChatGPT, are known to generate high-quality text and have the capacity for text summarization. Hence, it is natural to ask whether these models are ready for off-the-shelf application to automatically generate abstractive summaries for case judgements. To explore this question, we apply several state-of-the-art domain-specific abstractive summarization models and general-domain LLMs to Indian court case judgements and check the quality of the generated summaries. In addition to standard metrics for summary quality, we check for inconsistencies and hallucinations in the summaries. We see that abstractive summarization models generally achieve slightly higher scores than extractive models in terms of standard summary evaluation metrics such as ROUGE and BLEU. However, we often find inconsistent or hallucinated information in the generated abstractive summaries. Overall, our investigation indicates that the pre-trained abstractive summarization models and LLMs are not yet ready for fully automatic deployment for case judgement summarization; rather, a human-in-the-loop approach including manual checks for inconsistencies is more suitable at present. Comment: Accepted at the 3rd Workshop on Artificial Intelligence and Intelligent Assistance for Legal Professionals in the Digital Workplace (LegalAIIA 2023), in conjunction with the ICAIL 2023 conference.
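    For illustration, a minimal sketch of the automatic part of such an evaluation (ROUGE and BLEU between a reference and a generated summary); the rouge-score and nltk packages are assumed tooling, not necessarily what the study used, and the manual inconsistency and hallucination checks are not automated here.

```python
from rouge_score import rouge_scorer
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def score_summary(reference: str, generated: str) -> dict:
    # ROUGE-1/2/L F1 between the reference and generated summaries.
    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                      use_stemmer=True)
    scores = {name: s.fmeasure
              for name, s in scorer.score(reference, generated).items()}
    # Smoothed sentence-level BLEU on whitespace tokens.
    scores["bleu"] = sentence_bleu(
        [reference.split()], generated.split(),
        smoothing_function=SmoothingFunction().method1)
    return scores
```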

    Socratic Pretraining: Question-Driven Pretraining for Controllable Summarization

    In long-document controllable summarization, where labeled data is scarce, pretrained models struggle to adapt to the task and effectively respond to user queries. In this paper, we introduce Socratic pretraining, a question-driven, unsupervised pretraining objective specifically designed to improve controllability in summarization tasks. By training a model to generate and answer relevant questions in a given context, Socratic pretraining enables the model to more effectively adhere to user-provided queries and identify relevant content to be summarized. We demonstrate the effectiveness of this approach through extensive experimentation on two summarization domains, short stories and dialogue, and multiple control strategies: keywords, questions, and factoid QA pairs. Our pretraining method relies only on unlabeled documents and a question generation system, and outperforms pre-finetuning approaches that use additional supervised data. Furthermore, our results show that Socratic pretraining cuts task-specific labeled data requirements in half, is more faithful to user-provided queries, and achieves state-of-the-art performance on QMSum and SQuALITY. Comment: To appear at ACL 2023.
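    For illustration, a loose sketch of how question-augmented pretraining instances might be assembled in the spirit of this objective; generate_questions stands in for an off-the-shelf question-generation system, and the source/target layout is an assumption for the sketch, not the paper's exact pretraining format.

```python
from typing import Callable, Dict, List

def build_socratic_examples(
    document: str,
    pseudo_summary: str,
    generate_questions: Callable[[str], List[str]],
) -> List[Dict[str, str]]:
    """Pair generated questions with the document so a seq2seq model learns to
    ask relevant questions and produce the content that answers them."""
    examples = []
    for question in generate_questions(pseudo_summary):
        examples.append({
            "source": f"ask & answer: {document}",
            "target": f"{question} {pseudo_summary}",
        })
    return examples
```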

    Scientific Opinion Summarization: Meta-review Generation with Checklist-guided Iterative Introspection

    Opinions in the scientific domain can be divergent, leading to controversy or consensus among reviewers. However, current opinion summarization datasets mostly focus on product review domains, which do not account for this variability under the assumption that the input opinions are non-controversial. To address this gap, we propose the task of scientific opinion summarization, where research paper reviews are synthesized into meta-reviews. To facilitate this task, we introduce ORSUM, a new dataset covering 10,989 paper meta-reviews and 40,903 paper reviews from 39 conferences. Furthermore, we propose the Checklist-guided Iterative Introspection (CGI²) approach, which breaks the task down into several stages and iteratively refines the summary under the guidance of questions from a checklist. We conclude that (1) human-written summaries are not always reliable, since many do not follow the guidelines, and (2) the combination of task decomposition and iterative self-refinement shows promising discussion-involvement ability and can be applied to other complex text generation tasks using black-box LLMs.
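    For illustration, a loose sketch of checklist-guided iterative refinement with a black-box LLM; llm is a hypothetical text-in/text-out completion function and the prompts, round count, and control flow are assumptions for the sketch, not the CGI² procedure itself.

```python
from typing import Callable, List

def checklist_guided_meta_review(reviews: List[str],
                                 checklist: List[str],
                                 llm: Callable[[str], str],
                                 num_rounds: int = 2) -> str:
    joined = "\n---\n".join(reviews)
    # Initial draft synthesized from all reviews.
    draft = llm("Write a meta-review synthesizing these reviews:\n" + joined)
    # Iteratively critique the draft against each checklist question, then revise.
    for _ in range(num_rounds):
        for question in checklist:
            critique = llm(f"Checklist question: {question}\n"
                           f"Draft meta-review:\n{draft}\n"
                           "List anything missing or unsupported.")
            draft = llm("Revise the meta-review to address this critique:\n"
                        f"{critique}\nCurrent draft:\n{draft}\nReviews:\n{joined}")
    return draft
```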

    SEQ^3: Differentiable Sequence-to-Sequence-to-Sequence Autoencoder for Unsupervised Abstractive Sentence Compression

    Neural sequence-to-sequence models are currently the dominant approach in several natural language processing tasks, but require large parallel corpora. We present a sequence-to-sequence-to-sequence autoencoder (SEQ^3), consisting of two chained encoder-decoder pairs, with words used as a sequence of discrete latent variables. We apply the proposed model to unsupervised abstractive sentence compression, where the first and last sequences are the input and reconstructed sentences, respectively, while the middle sequence is the compressed sentence. Constraining the length of the latent word sequences forces the model to distill important information from the input. A pretrained language model, acting as a prior over the latent sequences, encourages the compressed sentences to be human-readable. Continuous relaxations enable us to sample from categorical distributions, allowing gradient-based optimization, unlike alternatives that rely on reinforcement learning. The proposed model does not require parallel text-summary pairs, achieving promising results in unsupervised sentence compression on benchmark datasets. Comment: Accepted to NAACL 2019.
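    For illustration, a minimal sketch of the kind of continuous relaxation that lets a model sample discrete latent words while remaining differentiable (a straight-through Gumbel-softmax); this is illustrative of the technique the abstract describes, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def relaxed_word_embeddings(logits: torch.Tensor,
                            embedding: torch.nn.Embedding,
                            tau: float = 1.0) -> torch.Tensor:
    """logits: (batch, vocab) scores over the vocabulary for one latent word.

    Straight-through Gumbel-softmax: the forward pass uses a hard one-hot
    sample, while gradients flow through the soft relaxation, keeping the
    chained encoder-decoder pairs end-to-end differentiable.
    """
    one_hot = F.gumbel_softmax(logits, tau=tau, hard=True)
    # Mix embedding rows by the sampled one-hot vector to get word embeddings.
    return one_hot @ embedding.weight
```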