3 research outputs found

    Using ontologies to map between research data and policymakers’ presumptions: the experience of the KNOWMAK project

    Understanding knowledge co-creation in key emerging areas of European research is critical for policymakers wishing to analyse impact and make strategic decisions. However, purely data-driven methods for characterising policy topics have limitations relating to the broad nature of such topics and the differences in language and topic structure between political discourse and scientific and technological outputs. In this paper, we discuss the use of ontologies and semantic technologies as a means to bridge the linguistic and conceptual gap between policy questions and data sources for characterising European knowledge production. Our experience suggests that integrating advanced language-processing techniques with expert assessment at critical junctures in the process is key to the success of this endeavour.

    Automating CIRI Ratings of Human Rights Reports Using Gate

    This thesis involves parsing document-based reports from the United States Human Rights Reports and rating the human rights practices of various countries based on the CIRI (Cingranelli-Richards) Human Rights Data Project dataset. The United States Human Rights Reports are annual reports that cover internationally recognized human rights practices regarding individual, civil, political, and worker rights. Students, scholars, policymakers, and analysts used the CIRI data for practical and research purposes. CIRI analyzed the annual reports from 1981 to 2011 and then stopped releasing the dataset for further years; one possible reason is the manual process of scouring the Human Rights Reports and then rating each human rights practice for each country. This manual process provides a solid foundation for creating a new automated process. The automated process uses the rating values provided by CIRI in the 1981-2011 dataset as expected values to evaluate the accuracy of the rating process. To transition to an automated process, the General Architecture for Text Engineering (GATE) application is used. GATE is an open source project for developing text-processing solutions, and it is used in conjunction with the coding schemes provided in the CIRI Coding Manual to create an automated ratings process. The CIRI Coding Manual uses qualitative and quantitative criteria. The original and automated ratings are compared using GATE’s Annotation Diff Tool to obtain the F-measure for every country in the dataset. The evaluation cases range from 1999 to 2011 because those are the only years included in both the CIRI dataset and the Human Rights Reports. The F-measure results are higher when quantitative criteria are used to rate human rights practices. The primary contribution of this thesis is a method for automating each country’s human rights practice ratings so that the purpose of the CIRI project can be continued.

    Large Scale Semantic Annotation, Indexing, and Search at The National Archives

    This paper describes a tool developed to improve access to the enormous volume of data housed at the UK’s National Archives, both for the general public and for specialist researchers. The system we have developed, TNA-Search, enables multi-paradigm search over the entire electronic archive (42TB of data in various formats). The search functionality allows queries that arbitrarily combine full-text, structural, linguistic and semantic constraints. The archive is annotated and indexed with respect to a massive semantic knowledge base containing data from the LOD cloud, data.gov.uk, related TNA projects, and a large geographical database. The semantic annotation component achieves approximately 83% F-measure, which is very reasonable considering the wide range of entities and document types and the open domain. The technologies are being adopted by real users at The National Archives and will form the core of their suite of search tools, with additional in-house interfaces.