1,622 research outputs found
Digitometric Services for Open Archives Environments
We describe “digitometric” services and tools that add value to open-access eprint archives using the Open Archives Initiative (OAI) Protocol for Metadata Harvesting. Celestial is an OAI cache and gateway tool. Citebase Search enhances OAI-harvested metadata with linked references harvested from the full-text to provide a web service for citation navigation and research impact analysis. Digitometrics builds on data harvested using OAI to provide advanced visualisation and hypertext navigation for the research community. Together these services provide a modular, distributed architecture for building a “semantic web” for the research literature.
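As a rough illustration of the harvesting step these services rely on, the sketch below builds an OAI-PMH `ListRecords` request URL. The verb and the `metadataPrefix`, `set`, and `from` arguments come from the OAI-PMH specification; the base URL `https://archive.example.org/oai` and the helper name `list_records_url` are hypothetical and are not taken from Celestial or Citebase.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (helper name is hypothetical)."""
    # verb and metadataPrefix are required for an initial ListRecords request.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # optional selective harvesting by set
    if from_date:
        params["from"] = from_date    # optional lower bound on the datestamp
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used only for illustration.
url = list_records_url("https://archive.example.org/oai", from_date="2004-01-01")
print(url)
```

A harvester would fetch this URL, parse the returned XML, and follow any `resumptionToken` to page through the remaining records.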
Recognizing cited facts and principles in legal judgements
In common law jurisdictions, legal professionals cite facts and legal principles from precedent cases to support their arguments before the court for their intended outcome in a current case. This practice stems from the doctrine of stare decisis, where cases that have similar facts should receive similar decisions with respect to the principles. It is essential for legal professionals to identify such facts and principles in precedent cases, though this is a highly time-intensive task. In this paper, we present studies that demonstrate that human annotators can achieve reasonable agreement on which sentences in legal judgements contain cited facts and principles (respectively, κ=0.65 and κ=0.95 for inter- and intra-annotator agreement). We further demonstrate that it is feasible to automatically annotate sentences containing such legal facts and principles in a supervised machine learning framework based on linguistic features, reporting per category precision and recall figures of between 0.79 and 0.89 for classifying sentences in legal judgements as cited facts, principles or neither using a Bayesian classifier, with an overall κ of 0.72 with the human-annotated gold standard.
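The agreement figures above are kappa statistics. The abstract does not say which variant was used, so the following is only a minimal sketch assuming Cohen's kappa for two annotators, κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance. The label sequences are invented for illustration and are not data from the paper.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented sentence labels (fact / principle / neither), for illustration only.
annotator_1 = ["fact", "fact", "principle", "neither", "neither", "neither"]
annotator_2 = ["fact", "principle", "principle", "neither", "neither", "fact"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 2))  # 0.5
```

Here the annotators agree on 4 of 6 sentences (p_o = 2/3) while chance agreement is p_e = 1/3, which gives κ = 0.5.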
OpenCitations Meta
OpenCitations Meta is a new database that contains bibliographic metadata of
scholarly publications involved in citations indexed by the OpenCitations
infrastructure. It adheres to Open Science principles and provides data under a
CC0 license for maximum reuse. The data can be accessed through a SPARQL
endpoint, REST APIs, and dumps. OpenCitations Meta serves three important
purposes. Firstly, it enables disambiguation of citations between publications
described using different identifiers from various sources. For example, it can
link publications identified by DOIs in Crossref and PMIDs in PubMed. Secondly,
it assigns new globally persistent identifiers (PIDs), known as OpenCitations
Meta Identifiers (OMIDs), to bibliographic resources without existing external
persistent identifiers like DOIs. Lastly, by hosting the bibliographic metadata
internally, OpenCitations Meta improves the speed of metadata retrieval for
citing and cited documents. The database is populated through automated data
curation, including deduplication, error correction, and metadata enrichment.
The data is stored in RDF format following the OpenCitations Data Model, and
changes and provenance information are tracked. OpenCitations Meta and its
production. OpenCitations Meta currently incorporates data from Crossref,
DataCite, and the NIH Open Citation Collection. In terms of semantic publishing
datasets, it is currently the first in data volume. Comment: 26 pages, 7 figures
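The first purpose described above, linking records for the same publication that arrive under different identifiers, can be sketched as grouping records that share at least one identifier. This is an illustrative union-find sketch under stated assumptions, not the OpenCitations Meta curation pipeline; the function name and the identifier values are hypothetical.

```python
def merge_by_shared_identifier(records):
    """Group identifier sets that share at least one identifier (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the group root, halving paths on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # identifier -> index of the first record carrying it
    for idx, ids in enumerate(records):
        for identifier in ids:
            if identifier in first_seen:
                # Same identifier seen before: merge the two groups.
                parent[find(idx)] = find(first_seen[identifier])
            else:
                first_seen[identifier] = idx

    groups = {}
    for idx, ids in enumerate(records):
        groups.setdefault(find(idx), set()).update(ids)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical records: one source supplies a DOI, another a PMID, a third both.
records = [
    {"doi:10.1000/example.1"},
    {"pmid:00000001", "doi:10.1000/example.1"},
    {"pmid:00000001"},
    {"doi:10.1000/example.2"},
]
merged = merge_by_shared_identifier(records)
print(merged)
```

In this toy input the first three records collapse into one group because the middle record carries both the DOI and the PMID, while the fourth stays separate.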
Unlocking the potential of public sector information with Semantic Web technology
Governments often hold very rich data and whilst much of this information is published and available for re-use by others, it is often trapped by poor data structures, locked up in legacy data formats or in fragmented databases. One of the great benefits that Semantic Web (SW) technology offers is facilitating the large-scale integration and sharing of distributed data sources. At the heart of information policy in the UK, the Office of Public Sector Information (OPSI) is the part of the UK government charged with enabling the greater re-use of public sector information. This paper describes the actions, findings, and lessons learnt from a pilot study involving several parts of government and the public sector. The aim was to show government how it can adopt SW technology for the dissemination, sharing and use of its data.