13 research outputs found

    Connecting Repositories to one Integrated Domain

    Information is the new commodity in the global economy, and trustworthy digital repositories will be key pillars of this new ecosystem. The value of this digital information will only be realised if these repositories can be interacted with in a consistent manner and their data are accessible and understandable globally. Establishing such a data interoperability layer is the goal of the emerging domain of Digital Objects. When considering how to design this interoperability layer, it is important to note that repositories need to be considered from two different perspectives:
    - Repositories are a reflection of the institutions that operate them (quality of service, skilled experts, accessibility over many years, appropriate data management procedures).
    - Repositories are computational services that provide a specific set of functions.
    Complicating the effort to make repositories accessible and interoperable across the globe is the fact that many existing repositories have been developed over the past decades using a wide range of heterogeneous technologies and ways of organising data and functionality. Many of these repositories are data silos and are not interoperable. Much money has been invested in building these repositories, so we cannot expect them to make large changes without strong incentives and funding. This heterogeneity is the core of the challenge in making digital information the new commodity in the emerging global domain of digital objects.
    This paper focuses on the functional aspects of repositories. It proposes the FAIR Digital Object (FDO) model as a core data model for describing digital information, and the Digital Object Interface Protocol (DOIP) as the means to establish interoperable communication with all repositories independently of their respective technical choices. It is the conviction of this paper's authors that integrating the FDO model and DOIP with existing repositories can be done with minimal effort, and we present examples that document this claim. Three examples of existing integrations are presented in this paper:
    - an integration of B2SHARE,
    - a CORDRA repository, and
    - an integration of the DOBES archive.
    B2SHARE is a repository that has assigned Persistent Identifiers (PIDs; in this case Handles) to all of its digital files. It allows users to add metadata according to a unified schema, and user communities can extend this schema. The API lets one specify a Handle, which then gives access to the metadata and/or the bit sequences of the digital object; note that B2SHARE allows a set of bit sequences to be linked with one Handle. The integration consists of building a proxy that provides a DOIP interface to B2SHARE and streamlines access to the data and metadata as a single digital object. Developing the proxy was relatively simple and did not require any changes on the part of the B2SHARE repository; a minimal sketch of such a proxy is shown below.
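    The paper does not include the proxy's code, so the following is only a rough sketch of how a DOIP-to-B2SHARE proxy could translate the standard DOIP retrieve operation into a REST call. The B2SHARE endpoint, the record JSON shape, and the status identifier used for an unsupported operation are assumptions made for illustration.

```python
import requests

B2SHARE_API = "https://b2share.example.org/api/records"  # assumed endpoint

# DOIP v2 operation/status identifiers; using Status.200 for an
# unsupported operation is an assumption for this sketch.
OP_RETRIEVE = "0.DOIP/Op.Retrieve"
STATUS_OK = "0.DOIP/Status.001"
STATUS_UNSUPPORTED = "0.DOIP/Status.200"

def handle_doip_request(request: dict) -> dict:
    """Translate one DOIP request into a call against the B2SHARE REST API."""
    if request.get("operationId") != OP_RETRIEVE:
        return {"status": STATUS_UNSUPPORTED}

    handle = request["targetId"]                      # e.g. "11304/abcd-1234"
    record = requests.get(f"{B2SHARE_API}/{handle}")  # fetch the record
    record.raise_for_status()
    body = record.json()                              # assumed record shape below

    # Wrap metadata and file references as one digital object, so a DOIP
    # client sees a single DO instead of separate REST resources.
    return {
        "status": STATUS_OK,
        "output": {
            "id": handle,
            "attributes": {"content": body["metadata"]},
            "elements": [
                {"id": f["key"], "length": f["size"]}  # bit-sequence references
                for f in body.get("files", [])
            ],
        },
    }
```

    A client then speaks plain DOIP to the proxy and never needs to know B2SHARE's native API.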
    CORDRA is a CNRI repository/registry/registration system that manages digital objects, assigns Handles to all of its DOs and is accessible through DOIP. For all intents and purposes, it implements many of the features of the Digital Object Architecture. The integration of the two repositories enables copying files or moving digital objects. When copying files (metadata and bit sequences) from B2SHARE to CORDRA, for example, all functionality of the CORDRA service, such as searching, becomes available. Importantly, in this case the PID record identifying the digital object in the B2SHARE repository would have to be extended to point to the alternative path, and the B2SHARE API would have to offer the alternative access paths to a client; this latter aspect has not been implemented. Moving a DO from B2SHARE to CORDRA would result in changing the ownership of the PID and adding the updated information about the DO.
    The third example is the DOBES archive. This adaptation has not been done yet, but since this archive has some special functionalities, it is interesting to discuss how it could be approached. In the DOBES archive, each bundle of closely related digital objects is assigned a Handle, and metadata is itself treated as a digital object, i.e., it has a separate Handle. For management reasons, and especially to enable different contributors to maintain control of access rights, a tree structure was developed that allows contributors to organise their data according to specific criteria and users to browse the archive in addition to executing searches on the metadata.
    While accessing archival objects is comparatively simple, the ingest/upload feature is more complex. The archive supports establishing a canonical tree of resources to define scopes for authorisation (defining who has the right to grant access permissions, etc.) and to facilitate lookup by supporting browsing according to understandable criteria. Depositors therefore need to specify where in the tree the new resources should be integrated and which initial rights are associated with them. After the gathered information is uploaded into a workspace, the archive carries out many checks in a micro-workflow: metadata is checked against vocabularies and partly curated, types of bit sequences are checked and aligned with the information in the metadata, etc. An operation called the gatekeeper was developed to ensure a highly consistent archive despite the many (remote) people contributing to its content. The archive thus requires a set of four information units to be specified:
    - the set of bit sequences to be uploaded,
    - the metadata describing the bundle,
    - the node to be used to organise the resources, and
    - the initial rights, where the default would be "open".
    Adapting this archive to DOIP would imply that the proxy provides a set of operations such as "ingest a complex object", "update metadata", "add another bit-sequence to a specific object", "get me the list of operations", "give me the metadata", etc. A client must be developed to handle the front-end interaction with users, allowing them to specify the required information and to choose a suitable operation. The client would then interact with the repository via DOIP, for example by starting the gatekeeper as an external operation; a sketch of such a client call follows.
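    No code for such a client exists yet; the sketch below shows one possible shape of the ingest call, bundling the four information units into a single extended DOIP operation. The operation ID, the DOIP-over-HTTP gateway, and all PIDs and field names are hypothetical.

```python
import requests

GATEWAY = "https://dobes-doip.example.org/doip"   # assumed DOIP-over-HTTP gateway

def doip_call(request: dict) -> dict:
    """POST one DOIP request document to the gateway and return the response.
    Native DOIP transport (TLS with segmented JSON) is elided; an HTTP
    gateway is assumed here purely for illustration."""
    r = requests.post(GATEWAY, json=request, timeout=30)
    r.raise_for_status()
    return r.json()

# The four information units the DOBES archive requires for an ingest:
response = doip_call({
    "operationId": "0.DOIP/Op.Ingest",          # hypothetical extended operation
    "targetId": "service",                      # addressed to the repository service
    "attributes": {
        "metadata": {"title": "Session recordings, village X"},  # unit 2
        "node": "hdl:1839/corpus-node-42",      # unit 3: position in the canonical tree
        "initialRights": "open",                # unit 4: default access rights
    },
    "input": ["recording-1.wav", "notes.pdf"],  # unit 1: bit-sequence references
})
print(response["status"])   # the gatekeeper micro-workflow then validates the bundle
```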

    Interacting FDOs for Secure Processes

    In modern industry, administration and research there are many processes that involve distributed actors needing to securely create, update and manage information. Typical examples of such processes are supply chains in the production industry and treatments in the medical area. Such a process can be characterised by a few key properties:
    - it is driven by discrete events in time that need to be recorded,
    - it allows different authenticated actors to contribute to state information,
    - it must guarantee that existing information cannot be overwritten, and
    - it is characterised by a high degree of automation.
    Not all applications will require that all properties be met; there are, for example, also workflow processes in the research domain. In this paper we discuss a use case where a FAIR Digital Object (FDO) is used as a digital surrogate for a physical product, specifically to act as a Digital Product Pass (DPP): an electronic document that fully describes the properties of a given product and carries its own unique global identifier. Each digital object surrogate can then be represented by rendering its ID as a QR code, which can easily be scanned by a client to access information about the object or to interact with it. To constrain the scope of our example, we only discuss what happens when a product leaves the factory, is put on a truck together with other products and is shipped to a destination. The requirement in our case is to adapt the DPP so that it includes the greenhouse gas emissions incurred by the product during its shipment. This process basically comprises three events:
    - the product is identified and its manufacturing details are specified,
    - the product enters the truck and is detected, and
    - the product leaves the truck.
    In all three events some interactions and information updates need to be executed automatically, i.e., we assume that the product carries a machine-readable identity which can be read by a sensor coupled with an IoT edge device on the truck.
    In the general case, our model describes interactions between FDOs, where any FDO can potentially interact with any other FDO, just as their physical counterparts interact in the physical world. Any FDO that can authenticate itself using a Public Key Infrastructure challenge and has the proper credentials will be able to add to the state of another FDO. Whenever two FDOs interact, each FDO can register the interaction as an event FDO that is recorded at a location specified within each FDO. The ability to register an event can require a different sort of authentication and access control, but a validated digital signature from the creator of the event is a simple yet effective way to control access.
    Our example includes three entities: the factory (F), the truck company (TC) and a third party that acts as a trusted entity (TE) to manage shared information. Each entity is represented as an FDO containing a public key that it can use to authenticate itself, as well as a certificate for that key from a trusted entity. The factory instantiates a Product FDO (FDO-Px) for each product and, based on an agreement with the trusted entity, a DPP for that product (FDO-Dx). The truck company likewise instantiates a Truck FDO (FDO-Ty). Each FDO has a public key and a certificate. This certificate reflects the agreement between the factory and the truck company that authorizes each other to create event FDOs (FDO-Ez), used to record each encounter between their FDOs, and potentially to extend the DPP FDO (FDO-Dx); a sketch of these FDO records is given below.
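    The paper does not prescribe a record layout for these FDOs. Purely as an illustration, the participating records could be structured along the following lines; all field names and PIDs are made up.

```python
from dataclasses import dataclass, field

@dataclass
class FDORecord:
    """Illustrative FDO record: identifier, key material, and pointers to
    the FDO's methods and to the location where its events are recorded."""
    pid: str                     # globally unique resolvable identifier
    public_key_pem: str          # key used for PKI challenges
    certificate_pem: str         # certificate issued by the trusted entity (TE)
    methods: dict[str, str] = field(default_factory=dict)  # name -> method-FDO PID
    event_log: str = ""          # PID of the event-FDO recording location

# The actors of the example (placeholder key material):
fdo_px = FDORecord("hdl:21.F/product-0001", "...", "...",
                   methods={"registerEvent": "hdl:21.M/ev-reg"},
                   event_log="hdl:21.F/events")
fdo_ty = FDORecord("hdl:21.TC/truck-07", "...", "...",
                   methods={"registerEvent": "hdl:21.M/ev-reg",
                            "computeGHG": "hdl:21.M/ghg-calc"},
                   event_log="hdl:21.TC/events")
fdo_dx = FDORecord("hdl:21.TE/dpp-0001", "...", "...")  # DPP held by the trusted entity
```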
    Each FDO also has its own set of methods, which can be executed and which make use of secure communication and of the exchanged public keys. The first interaction is triggered when the product enters the truck and is detected by the truck's edge device. This edge device is configured to cause FDO-Ty to register an event by invoking a pre-determined method and passing the ID of the product it detected. FDO-Ty has a few methods that allow it to inform FDO-Px about the event and will probably have access to create some information in the truck company's database. FDO-Px will have methods to update the appropriate database in the factory so that the factory can trace what happened. FDO-Ty will also be able to create an event FDO (FDO-Ex) using the FDO-Px event method and start a clock to wait for a message from FDO-Px. When both FDOs have informed the event FDO that a specific event type happened, FDO-Ex will use a method to update its event table, and the event is signed by both keys (a sketch of this dual signing follows after the list below).
    The second interaction happens when the product leaves the truck and the truck's edge device sensors notice this action. The same procedure happens again, with one extension: (1) the truck FDO-Ty computes, according to an algorithm instantiated by the truck company, the additional GHG emissions associated with the transport of the product, and (2) this causes the DPP FDO, FDO-Dx, to update a data structure maintained by the trusted party.
    The benefits of this method are as follows:
    - All digital surrogates are FDOs and provide a standardized access method.
    - All structures are encapsulated and can only be manipulated by tested methods embedded in the corresponding FDOs.
    - Methods are extensible and are themselves defined as FDOs.
    - All events are signed by the keys of both parties involved, making them authenticated and traceable.
    - The systematic use of PIDs makes it possible to follow each action with appropriate analysis functions that have the right to read, using methods in the corresponding FDOs.
    - The system can easily be extended to different scenarios and different numbers of actors involved.
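    The paper stays at the conceptual level; as one concrete possibility, the dual signing of an event FDO could be realised with ordinary asymmetric signatures. The sketch below uses Ed25519 keys from Python's `cryptography` package; the event record layout is an assumption.

```python
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Keys that in reality live with FDO-Px (product) and FDO-Ty (truck):
key_px = Ed25519PrivateKey.generate()
key_ty = Ed25519PrivateKey.generate()

# The event FDO's payload (all identifiers illustrative):
event = {
    "eventType": "product-entered-truck",
    "product": "hdl:21.F/product-0001",
    "truck": "hdl:21.TC/truck-07",
    "timestamp": "2024-05-13T09:31:00Z",
}
payload = json.dumps(event, sort_keys=True).encode()  # canonical serialisation

# Both parties confirm the same event type happened and sign the payload;
# only then is the event entered into FDO-Ex's event table.
signatures = {
    "FDO-Px": key_px.sign(payload).hex(),
    "FDO-Ty": key_ty.sign(payload).hex(),
}

# Any party holding the public keys can later verify both signatures
# (verify() raises InvalidSignature on failure):
key_px.public_key().verify(bytes.fromhex(signatures["FDO-Px"]), payload)
key_ty.public_key().verify(bytes.fromhex(signatures["FDO-Ty"]), payload)
```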

    Multi-Stakeholder Global Handle Registry - Enabling Digital Sovereignty

    The DONA Foundation (DONA) was constituted in Geneva in early 2014, in part to administer and maintain the stable operation of the Global Handle Registry (GHR) along with multiple parties around the globe known as the Multi-Primary Administrators (MPAs). Responsibility for the GHR, previously held solely by CNRI in Reston, Virginia, USA, was transferred to the DONA Foundation in May 2014. Since then, five MPAs have been authorized and credentialed by DONA to provide global handle services based on their credentials. New organizations are currently being considered for authorization as future MPAs by the DONA Board of Directors.
    The MPA-based approach to operating the GHR is key to enabling operation and management of the Handle System on a distributed, multi-stakeholder basis. In this session, we will describe the operation and administration of the GHR on a multi-primary basis, under the overall administration of DONA, and will present some of the related standards, operating policies and procedures.

    ADL-R: The First Instance of a CORDRA Registry

    The Advanced Distributed Learning Registry (ADL-R) is a newly operational registration system for distributed e-learning content in the U.S. military. It is the first instance of a registry-based approach to repository federation resulting from the Content Object Repository Discovery and Registration/Resolution Architecture (CORDRA) project. This article will provide a brief overview of CORDRA and detailed information on ADL-R. A subsequent article in this month's issue of D-Lib will describe FeDCOR, which uses the same approach to federate DSpace repositories.

    The Vision of the FAIR Digital Object Machine and Ubiquitous FDO Services

    In addition to the earlier intensive discussion of the "Data Deluge", i.e., the enormous increase in available research data, the 2022 Internet-of-Things conference confirmed that in the near future there will be billions if not trillions of smart IoT devices in a very wide range of applications and locations, many of them with computational capacities. This large number of distributed IoT devices will create continuous streams of data and will require a global framework to facilitate their integration into the Internet and to enable controlled access to their data and services, to name but a few aspects. Such a framework would enable tracking of these IoT devices to measure their resource usage, for instance to globally address the UN Sustainable Development Goals. Additionally, policy makers are committed to defining regulations to break data monopolies and increase sharing. The result will be an increasingly huge domain of accessible digital data, which allows new challenges to be addressed, especially cross-sector ones. A key prerequisite for this is finding the right data across domain boundaries to support a specific task.
    Digitisation is already being called the fourth industrial revolution, and the emerging data and information is the 21st century's new resource. Currently this vision is mostly unrealised, owing to the inability to find, access, interoperate and reuse existing data and digital resources despite the progress in providing thematic catalogs. As a result, the capacity of this new resource is latent and mostly underutilized. No Internet-level infrastructure currently exists to facilitate the process by which all data and digital resources are made consistently and globally accessible; there are only patchworks of localized and limited access to trusted data on the Internet, created by specific communities that have been funded or directed to collaborate.
    To turn digital information into a commodity, the description, access, validation and processing of data need to become part of the Internet infrastructure, which we call the Global Integrated Data Space (GIDS). The main pillars of this approach require that data and services be globally identified and consistently accessed, with predictive descriptions and access control to make them globally findable.
    Currently, researchers partly rely on informal knowledge, such as knowing the right labs and persons, to maximize the chance of accessing trustworthy data, but this method limits the use of suitable data. In the future data scenario, other mechanisms will become possible. In the public information space, Google-like searches using specific key terms have become an accepted solution for finding documents for human consumption. This approach, however, does not work in the GIDS, with its large numbers of data contributors from a wide range of institutions and its millions of IoT devices worldwide, and where a wide range of data types and automatic data processing procedures dominate. Indeed, successful labs that apply complex models describing digital surrogates can automatically leverage data and data processing procedures from other labs. This makes the manual stitching of data and operations, which is still common operational practice, too costly in both time and resources to be a competitive option.
    A researcher looking for specific brain imaging data for a specific study has a few options:
    - rely on a network of colleagues,
    - execute Google-like searches in known registries, looking for appropriate departments and researchers,
    - execute Google-like searches for suitable data, or
    - engage an agent to execute profile matching in suitable sub-spaces.
    We assume that data creators will have the capability and the interest to create detailed metadata of different types, and that researchers looking for specific data will be able to specify precise profiles of the data they are looking for. Two key characteristics of the future data space will be operations that carry out profile matching at ultra-high speeds and mechanisms that lead to various subspaces, organized according to certain facets by self-organizing mechanisms. Of course, this places high requirements on the quality of the metadata being used, and presupposes that creators and potential consumers share knowledge about the semantic space in which they operate and about the available semantic mappings, whether used by brokers or self-provided. Metadata must be highly detailed, and suitable schemas have already been developed in many communities. In addition to the usual metadata, potential users will need to specify their credentials in the form of trusted roles and their usage purposes to indicate access opportunities.
    Changing current metadata practices to yield richer metadata, as prescribed by the FAIR principles, will not be simple, especially since we seem to be far away from formalizing roles and usage purposes in a broadly accepted way, but the pressure to create rich and standardized metadata will increase. It should be noted that for data streams created by IoT sensors, defining proper metadata is an action that is only required once or a few times.
    Why are FDOs special in this automatic profile-matching scenario? FDOs bundle all information required for automatic profile matching in a secure way, i.e., all metadata are available via the globally unique, resolvable and persistent identifier (PID) of the FDO, and the PID security mechanisms form the basis for establishing trust. FDOs will be provided with a secure method capable of computing efficiently coded profiles representing all properties of an FDO that are relevant for profile matching. This would speed up profile matching enormously (a sketch of one possible encoding is given below).
    We will address two major questions characterizing the "FDO Machine" we are envisioning:
    - Which kinds of representations could make profile matching much more efficient?
    - How could FDO-based mechanisms be used to efficiently create sub-spaces that would help the emerging layer of information brokers offer specialized services addressing specialized needs, as for example requested by the UN's Sustainable Development Goals?
    Brokers might want to use specialized agents to create subspaces along many different important facets such as domain, trustworthiness, roles, etc. Such subspaces are ephemeral virtual structures on top of the huge global integrated data space.
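    The paper leaves the profile encoding open; one family of candidates is locality-sensitive fingerprints. The sketch below, purely as an assumption of what an "efficiently coded profile" might look like, hashes an FDO's metadata properties into a 64-bit SimHash so that profile matching reduces to cheap Hamming-distance comparisons.

```python
import hashlib

def simhash(properties: list[str], bits: int = 64) -> int:
    """Fold the hashes of all property strings into one locality-sensitive
    fingerprint: similar property sets yield small Hamming distances."""
    weights = [0] * bits
    for prop in properties:
        h = int.from_bytes(hashlib.sha256(prop.encode()).digest()[:8], "big")
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if weights[i] > 0)

def hamming(a: int, b: int) -> int:
    return bin(a ^ b).count("1")

# Profiles derived from (made-up) FDO metadata and a researcher's query:
fdo_profile = simhash(["domain:neuroimaging", "modality:fMRI", "species:human",
                       "license:CC-BY", "format:NIfTI"])
query_profile = simhash(["domain:neuroimaging", "modality:fMRI", "species:human",
                         "format:NIfTI"])

# Matching a query against millions of fingerprints is then a stream of
# XOR + popcount operations, amenable to the ultra-high speeds envisioned.
print(hamming(fdo_profile, query_profile))  # small distance -> candidate match
```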

    BioHackEU23 report: Enabling FAIR Digital Objects with RO-Crate, Signposting and Bioschemas

    As part of the BioHackathon Europe 2023, we report here on the progress of hackathon project #15: "Enabling FAIR Digital Objects with RO-Crate, Signposting and Bioschemas". We added Signposting to three existing resources and made a Chrome browser extension to show Signposting headers. We added RO-Crate to two existing resources and explored making a hybrid FDO using both a Handle PID Record and a Signposting/RO-Crate approach; a small client-side sketch follows.
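    Signposting conveys FAIR navigation hints as typed HTTP Link headers (relations such as cite-as, describedby, item, type and license). A small sketch of how a client could inspect a resource's Signposting links; the URL is a placeholder.

```python
import requests

# HEAD request: Signposting links travel in HTTP Link headers,
# so no page content needs to be fetched.
resp = requests.head("https://example.org/dataset/42", allow_redirects=True)

# requests parses the Link header into a dict keyed by the rel attribute.
for rel in ("cite-as", "describedby", "item", "type", "license"):
    link = resp.links.get(rel)
    if link:
        print(f"{rel:12} -> {link['url']}")
```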

    FAIR Digital Object Demonstrators 2021

    This paper gives a summary of implementation activities in the realm of FAIR Digital Objects (FDO). It gives an idea of which software components are robust and have been in use for many years, which components are comparatively new and being tested in pilot projects, and which challenges need to be urgently addressed by the FDO community. After basically only one year of advancing the FDO specifications in the FDO Forum, we can recognise an increasing momentum to test and integrate essential FDO components. However, many developments still occur as isolated efforts that present a scattered picture. It is widely agreed that it is now time to combine these different pilots into comprehensive testbeds, to identify the gaps that still exist, and to turn some services into components of a convincing and stable infrastructure. This step is urgently needed to convince even more institutions to invest in FDO technology and thereby to increase the FAIRness of the evolving global data space.
