
    Unsupervised and supervised text similarity systems for automated identification of national implementing measures of European directives

    The automated identification of national implementing measures (NIMs) of European directives using text similarity techniques has shown promising preliminary results. Previous work proposed and applied unsupervised lexical and semantic similarity techniques based on vector space models, latent semantic analysis and topic models; however, these techniques were only evaluated on a small multilingual corpus of directives and NIMs. In this paper, we use word and paragraph embedding models learned by shallow neural networks from a multilingual legal corpus of European directives and national legislation (from Ireland, Luxembourg and Italy) to develop unsupervised semantic similarity systems for identifying transpositions. We evaluate these models and compare their results with the previous unsupervised methods on a multilingual test corpus of 43 directives and their corresponding NIMs. We also develop supervised machine learning models to identify transpositions and compare their performance across different feature sets.
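    The embedding-based similarity idea described above can be sketched as follows: average the word vectors of a directive article and of a candidate provision, then rank candidates by cosine similarity. This is a minimal illustration using tiny hand-made toy vectors, not the paper's actual models (which learn embeddings from a legal corpus with shallow neural networks):

```python
import math

# Toy 3-dimensional word vectors (hypothetical values for illustration;
# real embedding models use hundreds of dimensions learned from text).
EMBEDDINGS = {
    "member":    [0.9, 0.1, 0.0],
    "state":     [0.8, 0.2, 0.1],
    "transpose": [0.1, 0.9, 0.2],
    "implement": [0.2, 0.8, 0.3],
    "directive": [0.3, 0.3, 0.9],
}

def embed(text):
    """Average the word vectors of known tokens to get a text vector."""
    vecs = [EMBEDDINGS[t] for t in text.lower().split() if t in EMBEDDINGS]
    if not vecs:
        return [0.0] * 3
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

article   = "member state transpose directive"
provision = "member state implement directive"   # paraphrase of the article
unrelated = "transpose"

sim_match = cosine(embed(article), embed(provision))
sim_other = cosine(embed(article), embed(unrelated))
```

Because averaging embeddings captures semantic overlap rather than exact word overlap, the paraphrased provision scores higher than the unrelated fragment even though it shares no verb with the article.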

    Automated Identification of National Implementations of European Union Directives With Multilingual Information Retrieval Based On Semantic Textual Similarity

    The effective transposition of European Union (EU) directives by the Member States is important to achieve the policy goals defined in the Treaties and secondary legislation. National Implementing Measures (NIMs) are the legal texts officially adopted by the Member States to transpose the provisions of an EU directive. The measures undertaken by the Commission to monitor NIMs are time-consuming and expensive, as they rely on manual conformity-checking studies and legal analysis. In this thesis, we developed a legal information retrieval system that uses semantic textual similarity techniques to automatically identify the transposition of EU directives into national law at a fine-grained provision level. We modeled and developed various text similarity approaches, including lexical, semantic, knowledge-based, embeddings-based and concept-based methods. The text similarity systems utilized both textual features (tokens, N-grams, topic models, word and paragraph embeddings) and semantic knowledge from external knowledge bases (EuroVoc, IATE and Babelfy) to identify transpositions. This thesis work also involved the development of a multilingual corpus of 43 directives and their corresponding NIMs from Ireland (English legislation), Italy (Italian legislation) and Luxembourg (French legislation) to validate the text similarity based information retrieval system. A gold standard mapping between directive articles and NIM provisions, prepared by two legal researchers, was used to evaluate the various text similarity models. The results show that the lexical and semantic text similarity techniques were more effective at identifying transpositions than the embeddings-based techniques. We also observed that the unsupervised text similarity techniques performed best on the Luxembourg Directive-NIM corpus.
    We also developed a concept recognition system based on conditional random fields (CRFs) to identify concepts in European directives and national legislation. The results indicate that the concept recognition system improved over the dictionary lookup program by tagging concepts that the dictionary lookup missed. The concept recognition system was extended into a concept-based text similarity system using word-sense disambiguation and dictionary concepts. The performance of the concept-based text similarity measure was competitive with the best-performing text similarity measure. The labeled corpus of 43 directives and their corresponding NIMs was used to develop supervised text similarity systems based on machine learning classifiers. We modeled three machine learning classifiers with different textual features to identify transpositions. The results show that support vector machines (SVMs) with term frequency-inverse document frequency (TF-IDF) features had the best overall performance on the multilingual corpus. Among the unsupervised models, the best performance was achieved by the TF-IDF cosine similarity model, with macro-average F-scores of 0.8817, 0.7771 and 0.6997 for the Luxembourg, Italian and Irish corpora respectively. These results demonstrate that the system was able to identify transpositions in different national jurisdictions with good performance, and thus has the potential to serve as a support tool for legal practitioners and Commission officials involved in the transposition monitoring process.
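    A TF-IDF cosine similarity matcher of the kind the thesis found best among the unsupervised models can be sketched in a few lines. The directive article and candidate NIM provisions below are invented examples, and this stdlib-only sketch omits the preprocessing (tokenisation, lemmatisation, stop-word handling, cross-lingual alignment) a real multilingual system would need:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) for a small document collection."""
    tokenised = [d.lower().split() for d in docs]
    n = len(tokenised)
    df = Counter(t for doc in tokenised for t in set(doc))   # document frequency
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}         # smoothed IDF
    vectors = []
    for doc in tokenised:
        tf = Counter(doc)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical directive article and candidate NIM provisions.
article = "each member state shall designate a supervisory authority"
provisions = [
    "the minister shall designate a national supervisory authority",
    "penalties for infringement shall be effective and dissuasive",
]

vecs = tfidf_vectors([article] + provisions)
scores = [cosine(vecs[0], v) for v in vecs[1:]]
best = max(range(len(scores)), key=scores.__getitem__)  # index of likely transposition
```

Ranking every NIM provision against every directive article this way yields candidate transposition pairs, which is the provision-level retrieval task the thesis evaluates against the gold standard mapping.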

    An overview of information extraction techniques for legal document analysis and processing

    In the Indian legal system, different courts publish their legal proceedings every month for future reference by legal experts and the general public. Extensive manual labor and time are required to analyze and process the information stored in these lengthy, complex legal documents. Automatic legal document processing can overcome the drawbacks of manual processing and help non-experts better understand the legal domain. In this paper, we explore recent advances in the field of legal text processing and provide a comparative analysis of the approaches used. We divide the approaches into three classes: NLP-based, deep learning-based and KBP-based approaches. We put special emphasis on the KBP approach, as we strongly believe it can handle the complexities of the legal domain well. We conclude by discussing possible future research directions for legal document analysis and processing.

    Enhancing access to EU law: why bother?

    In recent years, access to EU law has been significantly enhanced via services such as EUR-Lex. This development not only allows for easy retrieval of individual legal acts but also for collecting information about the evolution of EU law in the aggregate. This contribution argues that by charting and analysing the evolution of the body of EU law over time, we can better understand the nature and development of the EU as a political system. The text examines the legislative productivity of the EU over the past 15 years as an illustration. Further, it showcases recent examples of the use of novel data-analytic techniques to analyse the body of EU law for the purposes of understanding the EU legal system, the institutions, and the polity that produced the legal acts. The contribution concludes by arguing that it is important to transmit basic facts and insights about the evolution of EU law and law-making to the general public as well, in order to counter the threat of Euroscepticism and perceptions of a democratic deficit in the EU.

    Can Machine Learning, as a RegTech Compliance Tool, lighten the Regulatory Burden for Charitable Organisations in the United Kingdom?

    Purpose: The purpose of this article is to explore the extent to which machine learning can be used as a solution to lighten the compliance and regulatory burden on charitable organisations in the United Kingdom. Design/methodology/approach: The subject is approached through the analysis of data, literature, and domestic and international regulation. The first part of the article summarises the extent of the current regulatory obligations faced by charities; in the second part, these are set against the potential technological solutions provided by machine learning as at July 2021. Findings: It is suggested that charities can utilise machine learning as a smart technological solution to ease the regulatory burden they face in a growing and impactful sector. Originality: The work is original because it is the first to specifically explore how machine learning, as a technological advance, can assist charities in meeting the regulatory compliance challenge.

    Discrimination-aware data analysis for criminal intelligence

    The growing use of Machine Learning (ML) algorithms in application domains such as healthcare, business, education and criminal justice has brought great promise as well as challenges. ML can proficiently analyse large amounts of data quickly and effectively, identifying patterns and providing insights that would be impossible for a human to produce at this scale. However, the use of ML algorithms in sensitive domains such as Criminal Intelligence Analysis (CIA) demands extremely careful deployment. Data has an important impact on the ML process. To understand the ethical and privacy issues related to data and ML, the VALCRI (Visual Analytics for sense-making in CRiminal Intelligence analysis) system was used. VALCRI is a CIA system that integrates machine-learning techniques to improve the effectiveness of crime data analysis. At the most basic level, our research found that the lack of a harmonised interpretation of different privacy principles, trade-offs between competing ethical principles, and algorithmic opacity are among the most concerning ethical and privacy issues. This research aims to alleviate these issues by investigating awareness of the ethical and privacy issues related to data and ML. Document analysis and interviews were conducted to examine how different privacy principles are understood in selected EU countries. The study takes a qualitative and quantitative research approach and is guided by various methods of analysis, including interviews, observation, case studies, experiments and legal document analysis. The findings indicate that a lack of ethical awareness about data has an impact on ML outcomes. Also, due to the opaque nature of ML systems, they are difficult to scrutinise, which leads to a lack of clarity about how certain decisions were made. This thesis provides some novel solutions that can be used to tackle these issues.

    Algorithmic Enforcement of Copyright Online and the Global Example of EU

    Today algorithms are present in virtually all aspects of life, and protecting online copyrighted content against infringement is no exception. Through a legal dogmatic and critical analysis, this work studies how algorithms are used on online platforms to protect authors' rights. The central legislative element in this work is the EU Directive on Copyright in the Digital Single Market. The study assesses the impact of the directive's Article 17 on the algorithmic enforcement of copyright by online content-sharing platforms. This includes understanding how already-in-use algorithmic technologies and the voluntary ex ante enforcement of copyright, such as YouTube's Content ID, are affected. Human rights are also a primary concern of this study; hence, the work considers the associated challenges and the ways they have been or should be remedied by the EU lawmaker. This work also provides the reader with a factual understanding of the techniques and technologies used in the algorithmic enforcement of copyright. The findings of this work are that, despite much criticism, the directive is able to address various problems, including those concerning human rights. Nevertheless, it is found that the directive also falls short on certain points, such as achieving an all-works-licensed ideal. Furthermore, it is observed that the practices of online content-sharing platforms do not fully comply with EU law. This study concludes by suggesting increased algorithmic transparency and accountability, ongoing dialogue and collaboration, expanded participation of the public and NGOs, and encouraging platforms to use more sophisticated AI technologies, such as NLP, in enforcing copyright.
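    Content ID and similar systems rely on proprietary perceptual fingerprinting, so their internals cannot be reproduced here. As a purely illustrative sketch of the underlying matching idea, one can compare character n-gram "fingerprints" of an upload against a protected work and flag matches above a threshold (the texts and the 0.6 threshold below are hypothetical):

```python
def shingles(text, k=5):
    """Set of overlapping character k-grams acting as a crude fingerprint."""
    text = " ".join(text.lower().split())  # normalise whitespace and case
    return {text[i:i + k] for i in range(len(text) - k + 1)}

def jaccard(a, b):
    """Overlap between two shingle sets (0.0 = disjoint, 1.0 = identical)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def flag_upload(upload, reference, threshold=0.6):
    """Flag an upload whose fingerprint overlap with a protected work
    exceeds a (hypothetical) similarity threshold."""
    return jaccard(shingles(upload), shingles(reference)) >= threshold

protected = "it was the best of times it was the worst of times"
verbatim  = "it was the best of times it was the worst of times"
original  = "completely different user generated commentary"
```

A fixed threshold like this also illustrates the over-blocking concern raised around Article 17: a quotation or parody may share enough shingles with the protected work to be flagged despite being lawful use.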