Automatic Text Summarization of Legal Cases: A Hybrid Approach
Manual summarization of large bodies of text involves a lot of human effort
and time, especially in the legal domain. Lawyers spend a lot of time preparing
legal briefs of their clients' case files. Automatic text summarization is a
constantly evolving field of Natural Language Processing (NLP), which is a
subdiscipline of Artificial Intelligence. In this paper, a hybrid
method for automatic text summarization of legal cases using the k-means clustering
technique and a tf-idf (term frequency-inverse document frequency) word vectorizer
is proposed. The summary generated by the proposed method is compared, using
ROUGE evaluation parameters, with the case summary as prepared by the lawyer for
appeal in court. Further, suggestions for improving the proposed method are
also presented.
Comment: Part of the 5th International Conference on Natural Language Processing
(NATP 2019) Proceedings
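The abstract above names tf-idf vectorization and k-means clustering but gives no implementation details, so the following is only a minimal sketch of how such a hybrid extractive summarizer might be assembled, not the authors' method. Everything specific here is an assumption made for illustration: the regex tokenizer, the smoothed idf formula, the evenly spaced centroid initialisation, the choice of one sentence per cluster (the one closest to its centroid), and all function names.

```python
import math
import re
from collections import Counter


def tfidf_vectors(sentences):
    """Return one L2-normalised sparse tf-idf vector (term -> weight) per sentence."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    n = len(sentences)
    # Document frequency: in how many sentences each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Assumed weighting: relative term frequency times a smoothed idf.
        vec = {
            t: (c / len(tokens)) * (math.log((1 + n) / (1 + doc_freq[t])) + 1.0)
            for t, c in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(t, 0.0) - b.get(t, 0.0)) ** 2 for t in set(a) | set(b))


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}


def kmeans(vectors, k, iterations=20):
    """Plain k-means with a deterministic, evenly spaced initialisation (assumed)."""
    step = len(vectors) / k
    centroids = [dict(vectors[int(i * step)]) for i in range(k)]
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c])) for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged: no sentence changed cluster
        assignment = new_assignment
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = mean_vector(members)
    return assignment, centroids


def summarize(sentences, k=2):
    """Pick, per cluster, the sentence nearest the centroid; keep document order."""
    k = min(k, len(sentences))
    if k == 0:
        return []
    vectors = tfidf_vectors(sentences)
    assignment, centroids = kmeans(vectors, k)
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            chosen.append(min(members, key=lambda i: sq_dist(vectors[i], centroids[c])))
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(list_of_sentences, k=3)` would return at most three sentences from the input, in their original order. A production system would more likely rely on an established library for vectorization and clustering; the pure-Python version is used here only to keep the sketch self-contained.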
How Ready are Pre-trained Abstractive Models and LLMs for Legal Case Judgement Summarization?
Automatic summarization of legal case judgements has traditionally been
attempted by using extractive summarization methods. However, in recent years,
abstractive summarization models have been gaining popularity since they can generate
more natural and coherent summaries. Legal domain-specific pre-trained
abstractive summarization models are now available. Moreover, general-domain
pre-trained Large Language Models (LLMs), such as ChatGPT, are known to
generate high-quality text and have the capacity for text summarization. Hence
it is natural to ask if these models are ready for off-the-shelf application to
automatically generate abstractive summaries for case judgements. To explore
this question, we apply several state-of-the-art domain-specific abstractive
summarization models and general-domain LLMs on Indian court case judgements,
and check the quality of the generated summaries. In addition to standard
metrics for summary quality, we check for inconsistencies and hallucinations in
the summaries. We see that abstractive summarization models generally achieve
slightly higher scores than extractive models in terms of standard summary
evaluation metrics such as ROUGE and BLEU. However, we often find inconsistent
or hallucinated information in the generated abstractive summaries. Overall,
our investigation indicates that the pre-trained abstractive summarization
models and LLMs are not yet ready for fully automatic deployment for case
judgement summarization; rather a human-in-the-loop approach including manual
checks for inconsistencies is more suitable at present.
Comment: Accepted at the 3rd Workshop on Artificial Intelligence and
Intelligent Assistance for Legal Professionals in the Digital Workplace
(LegalAIIA 2023), in conjunction with the ICAIL 2023 conference
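Since the abstract above relies on ROUGE as a standard summary quality metric, a small worked example of the simplest variant, ROUGE-1 (unigram overlap), may help. This is a simplified sketch and not the evaluation code used in the paper: the function name `rouge_1` and the lowercase whitespace tokenization are assumptions, and the official ROUGE toolkit offers options such as stemming that are omitted here.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Return (precision, recall, f1) based on clipped unigram overlap."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection clips each word's count to the smaller of the two.
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1
```

A high score from a function like this only shows word overlap with the reference summary; it cannot detect the inconsistencies or hallucinations that the abstract reports, which is why the authors check for those separately.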
Generating indicative-informative summaries with SumUM
We present and evaluate SumUM, a text summarization system that takes a raw technical text as input and produces an indicative-informative summary. The indicative part of the summary identifies the topics of the document, and the informative part elaborates on some of these topics according to the reader's interest. SumUM motivates the topics, describes entities, and defines concepts. It is a first step for exploring the issue of dynamic summarization. This is accomplished through a process of shallow syntactic and semantic analysis, concept identification, and text regeneration. Our method was developed through the study of a corpus of abstracts written by professional abstractors. Relying on human judgment, we have evaluated indicativeness, informativeness, and text acceptability of the automatic summaries. The results thus far indicate good performance when compared with other summarization technologies.
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.