577 research outputs found

    Multi-language transfer learning for low-resource legal case summarization

    Analyzing and evaluating legal case reports are labor-intensive tasks for judges and lawyers, who usually base their decisions on report abstracts, legal principles, and commonsense reasoning. Summarizing legal documents is therefore time-consuming and requires considerable human expertise. Moreover, public legal corpora in specific languages are almost unavailable. This paper proposes a transfer learning approach with extractive and abstractive techniques to cope with the lack of labeled legal summarization datasets, namely a low-resource scenario. In particular, we conducted extensive multi- and cross-language experiments. The proposed work outperforms the state-of-the-art results of extractive summarization on the Australian Legal Case Reports dataset and sets a new baseline for abstractive summarization. Finally, assessments with syntactic and semantic metrics have been carried out to evaluate the accuracy and the factual consistency of the machine-generated legal summaries.

    An overview of information extraction techniques for legal document analysis and processing

    In the Indian legal system, different courts publish their legal proceedings every month for future reference by legal experts and the general public. Extensive manual labor and time are required to analyze and process the information stored in these lengthy, complex legal documents. Automatic legal document processing can overcome the drawbacks of manual processing and will help the general public better understand the legal domain. In this paper, we explore recent advances in the field of legal text processing and provide a comparative analysis of the approaches used for it. We divide the approaches into three classes: NLP-based, deep learning-based, and KBP-based approaches. We put special emphasis on the KBP approach, as we strongly believe that it can handle the complexities of the legal domain well. Finally, we discuss some possible future research directions for legal document analysis and processing.

    Automated Attribute Extraction from Legal Proceedings

    The escalating number of pending cases is a growing concern worldwide. Recent advancements in digitization have opened up possibilities for leveraging artificial intelligence (AI) tools in the processing of legal documents. Adopting a structured representation for legal documents, as opposed to a mere bag-of-words flat text representation, can significantly enhance processing capabilities. To this end, we put forward a set of diverse attributes for criminal case proceedings. We use a state-of-the-art sequence labeling framework to automatically extract attributes from the legal documents. Moreover, we demonstrate the efficacy of the extracted attributes in a downstream task, namely legal judgment prediction. Comment: Presented in Mining and Learning in the Legal Domain (MLLD) workshop 202

    Building Legal Case Retrieval Systems with Lexical Matching and Summarization using A Pre-Trained Phrase Scoring Model

    We present our method for tackling the legal case retrieval task of the Competition on Legal Information Extraction/Entailment 2019. Our approach is based on the idea that summarization is important for retrieval. On the one hand, we adopt a summarization-based model called encoded summarization, which encodes a given document into a continuous vector space that embeds the summary properties of the document. We train the document representation model on the COLIEE 2018 resources. On the other hand, we extract lexical features from different parts of a given query and its candidates. We observe that comparing different parts of the query and its candidates yields better performance. Furthermore, combining the lexical features with the latent features from the summarization-based method achieves even better performance. We have achieved the state-of-the-art result for the task on the benchmark of the competition.

    U-CREAT: Unsupervised Case Retrieval using Events extrAcTion

    The task of Prior Case Retrieval (PCR) in the legal domain is about automatically citing relevant (based on facts and precedence) prior legal cases in a given query case. To further promote research in PCR, in this paper we propose a new large benchmark (in English) for the PCR task: the IL-PCR (Indian Legal Prior Case Retrieval) corpus. Given the complex nature of case relevance and the length of legal documents, BM25 remains a strong baseline for ranking the cited prior documents. In this work, we explore the role of events in legal case retrieval and propose an unsupervised retrieval pipeline, U-CREAT (Unsupervised Case Retrieval using Events Extraction). We find that the proposed unsupervised retrieval method significantly increases performance compared to BM25 and makes retrieval faster by a considerable margin, making it applicable to real-time case retrieval systems. Our proposed system is generic: we show that it generalizes across two different legal systems (Indian and Canadian), and it shows state-of-the-art performance on the benchmarks for both legal systems (the IL-PCR and COLIEE corpora). Comment: Accepted at ACL 2023, 15 pages (12 main + 3 Appendix)
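
    The BM25 baseline named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes pre-tokenized documents, the common parameter values k1 = 1.5 and b = 0.75, and the smoothed IDF variant log(1 + (N - df + 0.5) / (df + 0.5)). The function name `bm25_scores` and the toy corpus are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, corpus_tokens, k1=1.5, b=0.75):
    """Score every document in corpus_tokens against query_tokens with Okapi BM25.

    k1 and b are assumed defaults, not values taken from the paper.
    """
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for doc in corpus_tokens:
        df.update(set(doc))

    scores = []
    for doc in corpus_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy, invented "prior cases" (already tokenized) and a query case.
corpus = [
    ["appeal", "dismissed", "contract", "breach"],
    ["contract", "breach", "damages", "awarded"],
    ["criminal", "appeal", "sentence", "reduced"],
]
query = ["contract", "breach", "damages"]

scores = bm25_scores(query, corpus)
# Candidate indices ordered from most to least relevant.
ranking = sorted(range(len(scores)), key=lambda i: -scores[i])
print(ranking)
```

    In this toy run the second document matches all three query terms and ranks first, while the third shares no terms with the query and scores zero. An event-based method such as the one described would change what is matched (extracted events rather than raw tokens); that part is not sketched here.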

    Review of Semantic Importance and Role of using Ontologies in Web Information Retrieval Techniques

    The Web contains an enormous amount of information, which is accumulated, searched, and regularly used by many users. The Web is multilingual and growing very fast, with diverse data that includes unstructured or semi-structured content such as websites, texts, journals, and files. Obtaining critical, relevant data from such a vast and diverse collection has been a monotonous and challenging task. Simple keyword-based retrieval systems rely heavily on statistics, which leads to a word mismatch problem caused by a word's unavoidable semantic and contextual variants. As a result, there is an urgent need to arrange such colossal data systematically, so that relevant information can be found, quickly analyzed, and used to fulfill users' needs in the relevant context. Over the years, ontologies have been widely used in the Semantic Web to hold unorganized information in a systematic and structured manner. They have also significantly enhanced the efficiency of various information retrieval approaches. Ontology-based retrieval systems retrieve documents based on the semantic relation between the search request and the searchable information. This paper examines contemporary ontology-based information extraction techniques for texts, interactive media, and multilingual data types. Moreover, the study compares and classifies the most significant developments in search and retrieval techniques, along with their major disadvantages and benefits.