Semantic legal metadata provides information that helps with understanding
and interpreting legal provisions. Such metadata is therefore important for the
systematic analysis of legal requirements. However, manually enhancing a large
legal corpus with semantic metadata is prohibitively expensive. Our work is
motivated by two observations: (1) the existing requirements engineering (RE)
literature does not provide a harmonized view on the semantic metadata types
that are useful for legal requirements analysis; (2) automated support for the
extraction of semantic legal metadata is scarce, and it does not exploit the
full potential of artificial intelligence technologies, notably natural
language processing (NLP) and machine learning (ML). Our objective is to take
steps toward overcoming these limitations. To do so, we review and reconcile
the semantic legal metadata types proposed in the RE literature. Subsequently,
we devise an automated extraction approach for the identified metadata types
using NLP and ML. We evaluate our approach through two case studies on
Luxembourgish legislation. Our results indicate high accuracy in the
generation of metadata annotations. In particular, in the two case studies, we
obtained precision scores of 97.2% and 82.4% and recall scores of
94.9% and 92.4%.