    On Constructing Persistent Identifiers with Persistent Resolution Targets

    Persistent Identifiers (PIDs) are the foundation for referencing digital assets in scientific publications, books, and digital repositories. In their realization, PIDs contain metadata and resolution targets in the form of URLs that point to data sets located on the network. In contrast to PIDs, the target URLs typically change over time; thus, PIDs need continuous maintenance, an effort that is increasing tremendously with the advancement of e-Science and the advent of the Internet of Things (IoT). Nowadays, billions of sensors and data sets are subject to PID assignment. This paper presents a new approach of embedding location-independent targets into PIDs that allows the creation of maintenance-free PIDs using content-centric network technology and overlay networks. To prove the validity of the presented approach, the Handle PID System is used in conjunction with Magnet Link access information encoding, state-of-the-art decentralized data distribution with BitTorrent, and Named Data Networking (NDN) as a location-independent data access technology for networks. In contrast to existing approaches, neither a green-field implementation of PIDs nor major modifications of the Handle System are required to enable location-independent data dissemination with maintenance-free PIDs.
    Comment: Published IEEE paper of the FedCSIS 2016 (SoFAST-WS'16) conference, 11-14 September 2016, Gdansk, Poland. Also available online: http://ieeexplore.ieee.org/document/7733372
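
The idea of storing a Magnet Link as the resolution target can be illustrated with a minimal sketch. It is not the paper's implementation: the record layout in `make_pid_record`, the handle `20.500.12345/example`, and the info-hash are hypothetical, and the info-hash is assumed to be already known as a 40-character hexadecimal SHA-1 value.

```python
from urllib.parse import quote


def build_magnet_uri(info_hash_hex: str, display_name: str) -> str:
    """Build a BitTorrent magnet URI from a 40-character hex SHA-1 info-hash."""
    hex_digits = "0123456789abcdefABCDEF"
    if len(info_hash_hex) != 40 or any(c not in hex_digits for c in info_hash_hex):
        raise ValueError("expected a 40-character hexadecimal info-hash")
    # xt (exact topic) names the content by its hash, not by its location;
    # dn (display name) is only a human-readable hint.
    name = quote(display_name, safe="")
    return f"magnet:?xt=urn:btih:{info_hash_hex.lower()}&dn={name}"


def make_pid_record(handle: str, magnet_uri: str) -> dict:
    """Hypothetical, simplified PID record whose target is a magnet URI."""
    # Assumption: the target is kept in a single URL-typed value.
    return {"handle": handle, "values": [{"index": 1, "type": "URL", "data": magnet_uri}]}


# Example with made-up values.
example_uri = build_magnet_uri("0123456789ABCDEF0123456789abcdef01234567", "sensor data.csv")
example_record = make_pid_record("20.500.12345/example", example_uri)
```

Because the target identifies the content by hash rather than by host and path, the record would not need to be updated when the data set moves, which is the maintenance saving the abstract describes.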

    Machine Understandable Contracts with Deep Learning

    This research investigates the automatic translation of contracts into computer-understandable rules through Natural Language Processing. The most challenging aspect, which is studied throughout this paper, is to understand the meaning of the contract and express it in a structured format. This problem can be reduced to the Named Entity Recognition and Rule Extraction tasks; the latter handles the extraction of terms and conditions. These two problems are difficult, but deep learning models can tackle them. To our knowledge, this paper is the first work to approach Rule Extraction with deep learning. This method is data-hungry, so the research also introduces data sets for these two tasks. Additionally, it contributes to the literature by introducing Law-Bert, a model based on BERT which is pre-trained on unlabelled contracts. The results obtained on Named Entity Recognition and Rule Extraction show that pre-training on contracts has a positive effect on performance for the downstream tasks.
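
To make the "structured format" step concrete, the following sketch groups per-token labels into entity spans. It assumes a BIO tagging scheme and the label names `PARTY` and `DATE`; neither is stated in the abstract, and the tags are supplied by hand here rather than predicted by a model.

```python
def decode_bio(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans."""
    spans, current_type, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != current_type
        )
        if starts_new:
            # Close any open span, then start a new one.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = tag[2:], [token]
        elif tag.startswith("I-"):
            # Continuation of the currently open span.
            current_tokens.append(token)
        else:
            # "O" (outside): close any open span.
            if current_tokens:
                spans.append((current_type, " ".join(current_tokens)))
            current_type, current_tokens = None, []
    if current_tokens:
        spans.append((current_type, " ".join(current_tokens)))
    return spans


# Hand-labelled example sentence (hypothetical labels).
example_tokens = ["Acme", "Corp", "shall", "pay", "Bob", "by", "1", "May"]
example_tags = ["B-PARTY", "I-PARTY", "O", "O", "B-PARTY", "O", "B-DATE", "I-DATE"]
example_spans = decode_bio(example_tokens, example_tags)
```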

    Nordic LifeWatch cooperation, final report: A joint initiative from Denmark, Iceland, Finland, Norway and Sweden

    The main goal of the present report is to outline the possibilities for enhanced cooperation between the Nordic countries within eScience and biodiversity. LifeWatch is one of several ESFRI projects which aim to establish eInfrastructures and databases in the field of biodiversity and ecosystem research. Similarities between Nordic countries are extensive in relation to a number of biodiversity-related issues. Most species in Nordic countries are common, and frequently the same challenges concerning biodiversity and ecosystem services are addressed in the different countries. The present report has been developed by establishing a Nordic LifeWatch network with delegates from each of the Nordic countries. The report has been written jointly by the delegates, and the work was organized by establishing working groups with the following themes: strategic issues, technical development, legal framework and communication. The report was written during two workshops, Skype meetings and email exchanges, and discusses the following main issues:
    * Scientific needs for improved access to biodiversity data and advanced eScience research infrastructure in the Nordic countries.
    * Future challenges and priorities facing the international biodiversity research community.
    * Scientific potential of openly accessible biodiversity and environmental data for individual researchers and institutions.
    * Spin-off effects of open access for the general public.
    * Internationally standardized Nordic metadata inventory.
    * Legal framework and challenges associated with environmental, climate, and biodiversity data sharing, communication, training and scientific needs.
    Finally, some strategic steps towards realizing a Nordic LifeWatch construction and operational phase are discussed. Easy access to open data on biodiversity and the environment is crucial for many researchers and research institutions, as well as environmental administration.
    Easy access to data from different fields of science creates an environment for new scientific ideas to emerge. This potential of generating new, interdisciplinary approaches to pre-existing problems is one of the key features of open-access data platforms that unify diverse data sources. Interdisciplinary elements, access to data over larger gradients, and compatible eSystems and eTools to handle large amounts of data are extremely important and, if further developed, represent significant steps towards analysis of the biological effects of climate change and human impact, and towards the development of operational ecosystem service assessment techniques. It is concluded that significant benefits regarding scientific potential, technical developments and financial investments can be obtained by constructing a common Nordic LifeWatch eInfrastructure. Several steps concerning the organization and funding of a future Nordic LifeWatch are discussed, and an action plan towards 2020 is suggested. Our main conclusion is that a Nordic LifeWatch conference should be arranged as soon as possible to analyze the potential for a future Nordic LifeWatch in detail. This conference should involve Nordic research councils, scientists and relevant stakeholders. The national delegates from the participating countries in the Nordic LifeWatch project are prepared to present details from the report and developments so far as a basis for further development of Nordic LifeWatch. The present work is financed by NordForsk and in-kind contributions from participating institutions.

    Structural templates for comparative protein docking

    Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). Template-based docking relies on the quality and diversity of the template set. We present a carefully curated, non-redundant library of templates containing 4,950 full structures of binary complexes and 5,936 protein-protein interfaces extracted from the full structures at a 12 Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu
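
Extracting an interface at a distance cut-off can be sketched as follows. This is an illustration under stated assumptions, not the Dockground procedure: the abstract does not say which atoms the 12 Å distance is measured between, so the sketch treats a residue as part of the interface if any of its atoms lies within the cut-off of any atom in the other chain. The function name and the input layout (residue id mapped to a list of atom coordinates) are hypothetical.

```python
import math


def interface_residues(chain_a, chain_b, cutoff=12.0):
    """Return (ids_a, ids_b): residues with any atom within `cutoff` of the other chain.

    Each chain maps a residue id to a list of (x, y, z) atom coordinates,
    with coordinates and cutoff in the same unit (angstroms here).
    """
    near_a, near_b = set(), set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            # A residue pair is in contact if any atom pair is close enough.
            if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                near_a.add(res_a)
                near_b.add(res_b)
    return sorted(near_a), sorted(near_b)


# Toy coordinates (made up): only residues 1 and 10 are within 12 angstroms.
toy_a = {1: [(0.0, 0.0, 0.0)], 2: [(30.0, 0.0, 0.0)]}
toy_b = {10: [(5.0, 0.0, 0.0)], 11: [(100.0, 0.0, 0.0)]}
toy_interface = interface_residues(toy_a, toy_b)
```

The all-pairs loop is quadratic in the number of atoms; for full PDB structures a spatial index would be the usual choice, but the brute-force form keeps the cut-off logic visible.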

    COVID19 Disease Map, a computational knowledge repository of virus-host interaction mechanisms.

    Funder: Bundesministerium für Bildung und Forschung (BMBF)
    We need to effectively combine the knowledge from surging literature with complex datasets to propose mechanistic models of SARS-CoV-2 infection, improving data interpretation and predicting key targets of intervention. Here, we describe a large-scale community effort to build an open access, interoperable and computable repository of COVID-19 molecular mechanisms. The COVID-19 Disease Map (C19DMap) is a graphical, interactive representation of disease-relevant molecular mechanisms linking many knowledge sources. Notably, it is a computational resource for graph-based analyses and disease modelling. To this end, we established a framework of tools, platforms and guidelines necessary for a multifaceted community of biocurators, domain experts, bioinformaticians and computational biologists. The diagrams of the C19DMap, curated from the literature, are integrated with relevant interaction and text mining databases. We demonstrate the application of network analysis and modelling approaches with concrete examples to highlight new testable hypotheses. This framework helps to find signatures of SARS-CoV-2 predisposition and treatment response, and to prioritise drug candidates. Such an approach may help deal with new waves of COVID-19 or similar pandemics in the long term.

    DRIVER Technology Watch Report

    This report is part of the Discovery Workpackage (WP4) and is the third of four deliverables. The objective of this report is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. This report consists of two main parts. The first part focuses on interoperability standards for enhanced publications; the second consists of three subchapters, which give a landscape picture of current and surfacing technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.