159 research outputs found

    Hybrid Refining Approach of PrOnto Ontology

    This paper presents a refinement of the PrOnto ontology using a validation test based on legal experts’ annotation of privacy policies combined with an Open Knowledge Extraction (OKE) algorithm. To ensure robustness of the results while preserving an interdisciplinary approach, the integration of legal and technical knowledge has been carried out as follows. The set of privacy policies was first analysed by the legal experts to discover legal concepts and map the text into PrOnto. The mapping was then provided to computer scientists to perform the OKE analysis. Results were validated by the legal experts, who provided feedback and refinements (i.e. new classes and modules) of the ontology according to the MeLOn methodology. Three iterations were performed on a set of (development) policies, followed by a final test using a new set of privacy policies. The results are 75.43% detection of concepts in the policy texts and an increase of roughly 33% in the accuracy gain on the test set, using the new refined version of PrOnto enriched with SKOS-XL lexicon terms and definitions

    Legal Knowledge Extraction for Knowledge Graph Based Question-Answering

    This paper presents Open Knowledge Extraction (OKE) tools combined with natural language analysis of the sentence in order to enrich the semantics of the legal knowledge extracted from legal text. In particular, the use case is on international private law with specific regard to the Rome I Regulation EC 593/2008, Rome II Regulation EC 864/2007, and Brussels I bis Regulation EU 1215/2012. A Knowledge Graph (KG) is built using OKE and Natural Language Processing (NLP) methods jointly with the main ontology design patterns defined for the legal domain (e.g., event, time, role, agent, right, obligations, jurisdiction). Using critical questions, highlighted by legal experts in the domain, we have built a question answering tool capable of supporting information retrieval and answering these queries. The system should help the legal expert to retrieve the relevant legal information connected with topics, concepts, entities, and normative references in order to integrate his/her searching activities

    The Role of Vocabulary Mediation to Discover and Represent Relevant Information in Privacy Policies

    To date, the effort made by existing vocabularies to provide a shared representation of the data protection domain is not fully exploited. Different natural language processing (NLP) techniques have been applied to the text of privacy policies without, however, taking advantage of existing vocabularies to provide those documents with a shared semantic superstructure. In this paper we show how a recently released domain-specific vocabulary, i.e. the Data Privacy Vocabulary (DPV), can be used to discover, in privacy policies, the information that is relevant with respect to the concepts modelled in the vocabulary itself. We also provide a machine-readable representation of this information to bridge the unstructured textual information to the formal taxonomy modelled in it. This is the first approach to the automatic processing of privacy policies that relies on the DPV, fuelling further investigation on the applicability of existing semantic resources to promote the reuse of information and the interoperability between systems in the data protection domain

    Legal knowledge extraction in the data protection domain based on Ontology Design Patterns

    In the European Union, the entry into force of the General Data Protection Regulation (GDPR) has brought the domain of data protection to the forefront, encouraging research in knowledge representation and natural language processing (NLP). On the one hand, several ontologies adopted Semantic Web standards to provide a formal representation of the data protection framework set by the GDPR. On the other hand, different NLP techniques have been utilised to implement services aimed at individuals, to help them understand privacy policies, which are notoriously difficult to read. Few efforts have been devoted to the mapping of the information extracted from privacy policies to the conceptual representations provided by the existing ontologies modelling the data protection framework. In the first part of the thesis, I propose and put in the context of the Semantic Web a comparative analysis of existing ontologies that have been developed to model different legal fields. In the second part of the thesis, I focus on the data protection domain and I present a methodology that aims to fill the gap between the multitude of ontologies released to model the data protection framework and the disparate approaches proposed to automatically process the text of privacy policies. The methodology relies on the notion of Ontology Design Pattern (ODP), i.e. a modelling solution to solve a recurrent ontology design problem. Implementing a pipeline that exploits existing vocabularies and different NLP techniques, I show how the information disclosed in privacy policies could be extracted and modelled through some existing ODPs. The benefit of such an approach is the provision of a methodology for processing privacy policy texts that overlooks the different ontological models. Instead, it uses ODPs as a semantic middle-layer of processing that different ontological models could refine and extend according to their own ontological commitments

    JURI SAYS: An Automatic Judgement Prediction System for the European Court of Human Rights

    In this paper we present the web platform JURI SAYS that automatically predicts decisions of the European Court of Human Rights based on communicated cases, which are published by the court early in the proceedings and are often available many years before the final decision is made. Our system therefore predicts future judgements of the court. The platform is available at jurisays.com and shows the predictions compared to the actual decisions of the court. It is automatically updated every month by including the prediction for the new cases. Additionally, the system highlights the sentences and paragraphs that are most important for the prediction (i.e. violation vs. no violation of human rights)

    DaPIS: an Ontology-Based Data Protection Icon Set

    Privacy policies are known to be impenetrable and lengthy texts that are hardly read and poorly understood. This is why the General Data Protection Regulation (GDPR) introduces provisions to enhance information transparency including icons as visual means to clarify data practices. However, the research on the creation and evaluation of graphical symbols for the communication of legal concepts, which are generally abstract and unfamiliar to laypeople, is still in its infancy. Moreover, detailed visual representations can support users’ comprehension of the underlying concepts, but at the expense of simplicity and usability. This chapter describes a methodology for the creation and evaluation of DaPIS, a machine-readable Data Protection Icon Set that was designed following human-centered methods drawn from the emerging discipline of Legal Design. Participatory design methods have ensured that the perspectives of legal experts, designers and other relevant stakeholders are combined in a fruitful dialogue, while user studies have empirically determined strengths and weaknesses of the icon set as communicative means for the legal sphere. Inputs from other disciplines were also fundamental: canonical principles drawn from aesthetics, ergonomics and semiotics were included in the methodology. Moreover, DaPIS is modeled on PrOnto, an ontology of the GDPR, thus offering a comprehensive solution for the Semantic Web. In combination with the description of a privacy policy in the legal standard XML Akoma Ntoso, such an approach makes the icons machine-readable and automatically retrievable. Icons can thus serve as information markers in lengthy privacy statements and support an efficient navigation of the document. In this way, different representations of legal information can be mapped and connected to enhance its comprehensibility: the lawyer-readable, the machine-readable, and the human-readable layers

    Image Understanding by Socializing the Semantic Gap

    Several technological developments like the Internet, mobile devices and Social Networks have spurred the sharing of images in unprecedented volumes, making tagging and commenting a common habit. Despite the recent progress in image analysis, the problem of the Semantic Gap still hinders machines from fully understanding the rich semantics of a shared photo. In this book, we tackle this problem by exploiting social network contributions. A comprehensive treatise of three linked problems on image annotation is presented, with a novel experimental protocol used to test eleven state-of-the-art methods. Three novel approaches to annotate, understand the sentiment and predict the popularity of an image are presented. We conclude with the many challenges and opportunities ahead for the multimedia community

    Enterprise Data Warehouse based on Data Vault 2.0 sourced by a Data Lake: A Banking Industry Use Case

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The increasing volume, speed and heterogeneity of data have led companies to invest in Big Data technology, such as Data Lakes, which are central data repositories capable of ingesting structured and unstructured data at large scale. However, Data Lakes are still not suitable for typical Business Intelligence use cases and analyses as data is stored without a defined schema, which is why most companies still want to keep their existing Enterprise Data Warehouses (EDW). Regarding architectures that combine a Data Lake and an EDW, there are no defined best practices for data storage, metadata management, and data loading from the Data Lake into the EDW, particularly into those based on Data Vault 2.0. There is also a need to understand the impact that a Delta Lake layer can have in optimizing said data loading. This dissertation aims to fill these gaps in the literature and provide the scientific community and banking industry with an efficient architecture for a Data Lake that sources an EDW, and an EDW model based on Data Vault 2.0