184 research outputs found

    Analysis of legal networks

    Get PDF
    This report describes the main electronically available sources of law in the three target countries of the Openlaws.eu project: Austria, the Netherlands and the United Kingdom, plus those of the EU. It describes their strengths and weaknesses in terms of available data, formats and licensing. Since the world is dynamic, especially that of electronic data, the document was originally set up as a set of spreadsheets and a web site that is easier to maintain and update. This deliverable contains a snapshot of the status of these documents at the end of December 2014

    PTPARL-D: Annotated Corpus of 44 years of Portuguese Parliament debates

    Full text link
    In a representative democracy, some decide in the name of the rest, and these elected officials are commonly gathered in public assemblies, such as parliaments, where they discuss policies, legislate, and vote on fundamental initiatives. A core aspect of such democratic processes are the plenary debates, where important public discussions take place. Many parliaments around the world are increasingly keeping the transcripts of such debates, and other parliamentary data, in digital formats accessible to the public, increasing transparency and accountability. Furthermore, some parliaments are bringing old paper transcripts to semi-structured digital formats. However, these records are often only provided as raw text or even as images, with little to no annotation, and inconsistent formats, making them difficult to analyze and study, reducing both transparency and public reach. Here, we present PTPARL-D, an annotated corpus of debates in the Portuguese Parliament, from 1976 to 2019, covering the entire period of Portuguese democracy

    The ParlaMint corpora of parliamentary proceedings

    Get PDF
    This paper presents the ParlaMint corpora containing transcriptions of the sessions of the 17 European national parliaments with half a billion words. The corpora are uniformly encoded, contain rich meta-data about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository, and the complete corpora are openly available via the CLARIN.SI repository for download, as well as through the NoSketch Engine and KonText concordancers and the Parlameter interface for on-line exploration and analysis

    The Parla-CLARIN Recommendations for Encoding Corpora of Parliamentary Proceedings

    Get PDF
    Parliamentary proceedings are a rich source of data that can be used by scholars in various humanities and social sciences disciplines. Unlike the sources of most other language corpora, parliamentary proceedings are not subject to copyright or personal privacy protections, and are typically available online, thus making them ideal for compilation into corpora and for open distribution. For these reasons many countries have already produced corpora of parliamentary proceedings, but each typically in their own encoding, limiting their comparability and utilization in a multilingual setting. In this paper we propose an encoding schema which could serve as an interchange format for parliamentary corpora compiled for the purposes of scholarly investigations. The schema, called Parla-CLARIN, was developed within the CLARIN research infrastructure, and is written as a TEI ODD which includes a TEI customization and prose guidelines with examples of use. We discuss the coverage and choices made in designing the recommendations, and give an overview of the guidelines. We also discuss two other standard schemas for encoding parliamentary data, Akoma Ntoso and RDF, and their relation to Parla-CLARIN. We conclude by presenting corpora already encoded in Parla-CLARIN and discussing further work, especially the provision of a set of example documents and of transformation scripts that would make the proposed encoding more usable

    CLARIN

    Get PDF
    The book provides a comprehensive overview of the Common Language Resources and Technology Infrastructure – CLARIN – for the humanities. It covers a broad range of CLARIN language resources and services, its underlying technological infrastructure, the achievements of national consortia, and challenges that CLARIN will tackle in the future. The book is published 10 years after establishing CLARIN as an Europ. Research Infrastructure Consortium
    corecore