    Spanish Legislation as Linked Data

    Proceedings of the 2nd Workshop on Technologies for Regulatory Compliance, co-located with the 31st International Conference on Legal Knowledge and Information Systems, Groningen, NL, 12th of December 2018.

    Legislation is officially published in Spain as HTML, PDF and XML. In the next few months, metadata will also be published as RDF, following the guidelines of the European Legislation Identifier (ELI) and using metadata records supported by the ELI ontology. The work presented here is an independent effort to publish Spanish consolidated legislation strongly linked to other external resources. In the published dataset, text is structured in articles, key terms are related to external terminological databases, named entities are identified, and links between internal and external documents have been automatically identified. The dataset is publicly available through a SPARQL endpoint.
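As a rough illustration of how ELI-style metadata might be queried over a SPARQL endpoint, the sketch below builds a query and a GET request URL. The endpoint address `https://example.org/sparql` is a placeholder, and the choice of ELI properties (`eli:is_realized_by`, `eli:title`, `eli:date_document`) is an assumption about how the dataset is modelled; the abstract does not specify either.

```python
from urllib.parse import urlencode

# Namespace of the ELI ontology.
ELI_NS = "http://data.europa.eu/eli/ontology#"


def build_eli_query(limit: int = 10) -> str:
    """Return a SPARQL query listing legal resources with a title and a date.

    The graph shape (resource -> expression -> title) is an assumption
    made for illustration, not the documented layout of the dataset.
    """
    return f"""
PREFIX eli: <{ELI_NS}>
SELECT ?resource ?title ?date WHERE {{
  ?resource a eli:LegalResource ;
            eli:is_realized_by ?expression ;
            eli:date_document ?date .
  ?expression eli:title ?title .
}}
ORDER BY DESC(?date)
LIMIT {int(limit)}
""".strip()


def build_request_url(endpoint: str, query: str) -> str:
    """Encode the query as a GET request URL using the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


if __name__ == "__main__":
    # Hypothetical endpoint; replace with the real address of the dataset.
    print(build_request_url("https://example.org/sparql", build_eli_query(5)))
```

The sketch only constructs the request; sending it and parsing the results is left out because the response format depends on the endpoint.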


    In today’s globalized world, companies are faced with numerous and continuously changing legal requirements. To ensure that these companies are compliant with legal regulations, law and consulting firms use open legal data published by governments worldwide. With this data pool growing rapidly, the complexity of legal research is strongly increasing. Despite this fact, only a few research papers consider the application of information systems in the legal domain. Against this backdrop, we propose a knowledge management (KM) system that aims at supporting legal research processes. To this end, we leverage the potential of text mining techniques to extract valuable information from legal documents. This information is stored in a graph database, which enables us to capture the relationships between these documents and users of the system. These relationships and the information from the documents are then fed into a recommendation system which aims at facilitating knowledge transfer within companies. The prototypical implementation of the proposed KM system is based on 20,000 legal documents and is currently being evaluated in cooperation with a Big 4 accounting company.
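The idea of feeding user-document relationships from a graph into a recommender can be sketched with a small in-memory structure. This is not the system described in the abstract: the class `LegalGraph`, its methods, and the co-reading score are hypothetical stand-ins for a real graph database and recommendation model.

```python
from collections import Counter, defaultdict


class LegalGraph:
    """Toy bipartite graph of users and the documents they have read."""

    def __init__(self):
        self.docs_by_user = defaultdict(set)
        self.users_by_doc = defaultdict(set)

    def add_read(self, user: str, doc: str) -> None:
        """Record an edge between a user and a document."""
        self.docs_by_user[user].add(doc)
        self.users_by_doc[doc].add(user)

    def recommend(self, user: str, k: int = 3) -> list:
        """Suggest unseen documents read by users who share documents with `user`.

        Each candidate scores one point per (shared document, other reader)
        pair that leads to it; ties are broken alphabetically.
        """
        seen = self.docs_by_user.get(user, set())
        scores = Counter()
        for doc in seen:
            for other in self.users_by_doc[doc]:
                if other == user:
                    continue
                for candidate in self.docs_by_user[other] - seen:
                    scores[candidate] += 1
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return [doc for doc, _ in ranked[:k]]
```

A production system would typically also use the extracted document content, which this co-reading sketch ignores.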

    Machine learning in predictive analytics on judicial decision-making

    Legal professionals globally are under pressure to provide ‘more for less’, which is not an easy challenge in the era of big data, increasingly complex regulatory and legislative frameworks, and volatile financial markets. Although largely limited to information retrieval and extraction, Machine Learning applications targeted at the legal domain have to some extent become mainstream. The startup market is rife with legal technology providers, and many major law firms encourage research and development through formal legal technology incubator programs. Experienced legal professionals are expected to become technologically astute as part of their response to the ‘more for less’ challenge, while those on track to enter the legal services industry are encouraged to broaden their skill sets beyond a traditional law degree. Predictive analytics applied to judicial decision-making raises interesting discussions around potential benefits to the general public, over-burdened judicial systems, and legal professionals respectively. It is also associated with limitations and challenges around the manual input required (in the absence of automatic extraction and prediction) and domain-specific application. While there is no ‘one size fits all’ solution when considering predictive analytics across legal domains or different countries' legal systems, this dissertation aims to provide an overview of Machine Learning techniques which could be applied in further research, to start unlocking the benefits associated with predictive analytics on a greater (and hopefully local) scale.

    Possibilities of Citation Analysis in the Czech Republic (Možnosti citační analýzy v České republice)

    This paper is the first output of long-term research into the theoretical and practical possibilities of citation analysis of case law in the Czech Republic. Developing such an analysis is the first step towards practical tools that can accurately map the relationships between individual decisions and thereby evaluate their importance. The first part of the paper offers a theoretical summary of current findings in the area, a review of the relevant literature, and a brief evaluation of the applicability of citation analysis in the Czech environment. The second part summarizes the current state of work with case law in the Czech Republic. It first reviews the publicly accessible databases of the highest courts (Supreme Court, Supreme Administrative Court, and Constitutional Court), examining what these database systems offer, i.e. whether and how their metadata specify relationships to cited decisions. The same approach is then applied to three commercial legal information systems: ASPI, Beck-online, and CODEXIS. The third part summarizes the initial difficulties that stand in the way of a comprehensive citation analysis that would meet usability requirements. It describes the first step, an attempt to capture citations of court decisions with a regular expression in a sample of 46 decisions of the Constitutional Court. The paper closes by evaluating the results and describing the next steps in researching the feasibility of case law citation analysis in the Czech Republic.
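The abstract does not give the regular expression used in the study. As a minimal sketch of the approach, the pattern below assumes that Constitutional Court decisions are cited by file number in the common form "senate, ÚS, number/year" (for example `Pl. ÚS 19/08`); the pattern, the function name, and the sample sentence are illustrative only.

```python
import re

# Assumed citation shape: senate ("Pl" for the plenum or a Roman numeral
# I-IV), the court abbreviation "ÚS", then a case number and a year.
US_CITATION = re.compile(
    r"\b(?P<senate>Pl|IV|III|II|I)\.\s*ÚS\s+(?P<number>\d+)/(?P<year>\d{2,4})\b"
)


def find_citations(text: str) -> list:
    """Return normalized file numbers of Constitutional Court decisions cited in `text`."""
    return [
        f"{m.group('senate')}. ÚS {m.group('number')}/{m.group('year')}"
        for m in US_CITATION.finditer(text)
    ]


if __name__ == "__main__":
    sample = "Viz nález sp. zn. Pl. ÚS 19/08 a usnesení sp. zn. III. ÚS 2053/12."
    print(find_citations(sample))
```

A pattern this simple will miss variant forms (other designations, citations by collection number, or typographical deviations), which is one kind of difficulty the paper's third part refers to.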

    Identifying External Cross-references using Natural Language Processing (NLP)

    [Context and motivation] Software engineers build systems that need to be compliant with relevant regulations. These regulations are stated in authoritative documents from which regulatory requirements need to be elicited. Project contracts contain cross-references to these regulatory requirements in external documents. [Problem] Exploring and identifying the regulatory requirements in voluminous textual data is enormously time-consuming, and hence costly and error-prone, in sizable software projects. [Principal idea and novelty] We use Natural Language Processing (NLP), Pattern Recognition and Web Scraping techniques to automatically extract external cross-references from contractual requirements and prepare a map representing the external cross-references related to each contractual requirement. This map is also automatically extended to the world-wide web using previously identified references that are not located in local resources. The novel aspects of our approach involve: (i) a taxonomy of semantic cues for identifying cross-references, (ii) a taxonomy of grammatical structures for supporting various combinations of word roles in a sentence, (iii) APA standards for validating cross-references, and (iv) third party access for unavailable resources. [Research Contribution] The key research contribution is a tool implementing the mentioned techniques for identifying cross-references in contractual documents, related regulatory documents, and the web. The tool produces high-level and detailed views of cross-references amongst documents that can be used by various stakeholders for project management, requirements elicitation, testing, and other purposes. We anticipate that this would save an enormous amount of time and effort needed to do this task manually in contractual projects. [Conclusion] The output cross-references produced by the tool suggest a precision of 99% and a recall of 87% from contractual requirements. Further work is identified.
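The notion of semantic cues for spotting cross-references can be illustrated with a small pattern-based sketch. The cue list, the function name, and the sample requirement (including the document name "Standard ABC-123") are hypothetical; the tool's actual taxonomies are not given in the abstract, and this sketch uses plain regular expressions rather than full NLP.

```python
import re

# Hypothetical cue phrases that often introduce a reference to another document.
CUES = [
    "in accordance with",
    "as specified in",
    "as defined in",
    "pursuant to",
    "in compliance with",
]

# A cue followed by the referenced target, read up to the next , . or ;
CUE_PATTERN = re.compile(
    r"(?P<cue>" + "|".join(re.escape(c) for c in CUES) + r")\s+(?P<target>[^,.;]+)",
    re.IGNORECASE,
)


def extract_cross_references(requirement: str) -> list:
    """Return (cue, target) pairs found in one contractual requirement."""
    return [
        (m.group("cue").lower(), m.group("target").strip())
        for m in CUE_PATTERN.finditer(requirement)
    ]


if __name__ == "__main__":
    text = (
        "The system shall be tested in accordance with Standard ABC-123, "
        "and documented as specified in Section 4 of the Safety Plan."
    )
    for cue, target in extract_cross_references(text):
        print(f"{cue} -> {target}")
```

Stopping the target at the first punctuation mark is a deliberate simplification: references that themselves contain periods or commas would be truncated, which is where grammatical analysis of the sentence becomes useful.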