Semantic legal metadata provides information that helps with understanding
and interpreting legal provisions. Such metadata is therefore important for the
systematic analysis of legal requirements. However, manually enhancing a large
legal corpus with semantic metadata is prohibitively expensive. Our work is
motivated by two observations: (1) the existing requirements engineering (RE)
literature does not provide a harmonized view on the semantic metadata types
that are useful for legal requirements analysis; (2) automated support for the
extraction of semantic legal metadata is scarce, and it does not exploit the
full potential of artificial intelligence technologies, notably natural
language processing (NLP) and machine learning (ML). Our objective is to take
steps toward overcoming these limitations. To do so, we review and reconcile
the semantic legal metadata types proposed in the RE literature. Subsequently,
we devise an automated extraction approach for the identified metadata types
using NLP and ML. We evaluate our approach through two case studies on
Luxembourgish legislation. Our results indicate high accuracy in the
generation of metadata annotations. In particular, in the two case studies, we
obtained precision scores of 97.2% and 82.4%, and recall scores of
94.9% and 92.4%.
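
For readers unfamiliar with the two reported measures, the following is a minimal sketch of how precision and recall are conventionally computed for extracted annotations. The function name and the counts are hypothetical and purely illustrative; they are not taken from the case studies, and the authors' actual evaluation procedure may differ in how matches are counted.

```python
def precision_recall(true_positives: int, false_positives: int,
                     false_negatives: int) -> tuple[float, float]:
    """Return (precision, recall) for a set of predicted annotations.

    precision = correct predictions / all predictions
    recall    = correct predictions / all annotations in the gold standard
    """
    predicted = true_positives + false_positives
    expected = true_positives + false_negatives
    # Guard against division by zero when nothing was predicted or expected.
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / expected if expected else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts for illustration only (NOT results from the paper).
    p, r = precision_recall(true_positives=90, false_positives=10,
                            false_negatives=30)
    print(f"precision={p:.1%}, recall={r:.1%}")
```

In these terms, a high precision means that few of the generated annotations are wrong, while a high recall means that few of the expected annotations are missed.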

Briand, Lionel

Ceci, Marcello

Dann, John

Sabetzadeh, Mehrdad

Sannier, Nicolas

Sleimi, Amin

English

arXiv

peer reviewed

SCARLE

Open Repository and Bibliography - Luxembourg

An Automated Framework for the Extraction of Semantic Legal Metadata from Legal Texts

https://orbilu.uni.lu/bitstream/10993/46243/1/SSSBCD-EMSE2020Rev.3.pdf
