9,037 research outputs found
CDJUR-BR -- A Golden Collection of Legal Document from Brazilian Justice with Fine-Grained Named Entities
A basic task for most Legal Artificial Intelligence (Legal AI) applications
is Named Entity Recognition (NER). However, texts produced in the context of
legal practice make references to entities that are not trivially recognized by
the currently available NERs. There is a lack of categorization of legislation,
jurisprudence, evidence, penalties, the roles of people in a legal process
(judge, lawyer, victim, defendant, witness), types of locations (crime
location, defendant's address), etc. In this sense, there is still a need for a
robust golden collection, annotated with fine-grained entities of the legal
domain, and which covers various documents of a legal process, such as
petitions, inquiries, complaints, decisions and sentences. In this article, we
describe the development of the Golden Collection of the Brazilian Judiciary
(CDJUR-BR) contemplating a set of fine-grained named entities that have been
annotated by experts in legal documents. The creation of CDJUR-BR followed its
own methodology that aimed to attribute a character of comprehensiveness and
robustness. Together with the CDJUR-BR repository we provided a NER based on
the BERT model and trained with the CDJUR-BR, whose results indicated the
prevalence of the CDJUR-BR.Comment: 15 pages, in Portuguese language, 3 figures, 5 table
Annotation of Fine-Grained Geographical Entities in German Texts
We work on the creation of a corpus, crawled from the internet, on the Berlin district of Moabit, primarily meant for training NER systems in German and English. Typical NER corpora and corresponding systems distinguish persons, organisations and locations, but do not distinguish different types of location entities. For our tourism-inspired use case, we need fine-grained annotations for toponyms. In this paper, we outline the fine-grained classification of geographical entities, the resulting annotations and we present preliminary results on automatically tagging toponyms in a small, bootstrapped gold corpus
Comparative study on Judgment Text Classification for Transformer Based Models
This work involves the usage of various NLP models to predict the winner of a
particular judgment by the means of text extraction and summarization from a
judgment document. These documents are useful when it comes to legal
proceedings. One such advantage is that these can be used for citations and
precedence reference in Lawsuits and cases which makes a strong argument for
their case by the ones using it. When it comes to precedence, it is necessary
to refer to an ample number of documents in order to collect legal points with
respect to the case. However, reviewing these documents takes a long time to
analyze due to the complex word structure and the size of the document. This
work involves the comparative study of 6 different self-attention-based
transformer models and how they perform when they are being tweaked in 4
different activation functions. These models which are trained with 200
judgement contexts and their results are being judged based on different
benchmark parameters. These models finally have a confidence level up to 99%
while predicting the judgment. This can be used to get a particular judgment
document without spending too much time searching relevant cases and reading
them completely.Comment: 28 pages with 9 figure
Named Entity Recognition in Indian court judgments
Identification of named entities from legal texts is an essential building
block for developing other legal Artificial Intelligence applications. Named
Entities in legal texts are slightly different and more fine-grained than
commonly used named entities like Person, Organization, Location etc. In this
paper, we introduce a new corpus of 46545 annotated legal named entities mapped
to 14 legal entity types. The Baseline model for extracting legal named
entities from judgment text is also developed.Comment: to be published in NLLP 2022 Workshop at EMNL
PoliToHFI at SemEval-2023 Task 6: Leveraging Entity-Aware and Hierarchical Transformers For Legal Entity Recognition and Court Judgment Prediction
The use of Natural Language Processing techniques in the legal domain has become established for supporting attorneys and domain experts in content retrieval and decision-making. However, understanding the legal text poses relevant challenges in the recognition of domain-specific entities and the adaptation and explanation of predictive models. This paper addresses the Legal Entity Name Recognition (L-NER) and Court judgment Prediction (CPJ) and Explanation (CJPE) tasks. The L-NER solution explores the use of various transformer-based models, including an entity-aware method attending domain-specific entities. The CJPE proposed method relies on hierarchical BERT-based classifiers combined with local input attribution explainers. We propose a broad comparison of eXplainable AI methodologies along with a novel approach based on NER. For the LNER task, the experimental results remark on the importance of domain-specific pre-training. For CJP our lightweight solution shows performance in line with existing approaches, and our NER-boosted explanations show promising CJPE results in terms of the conciseness of the prediction explanations
- …