    Semantic Integration of Identity Data Repositories

    With the continuously growing number of distributed and heterogeneous IT systems, there is a need for structured and efficient identity management (IdM) processes. This implies that new users are created once and the information is then distributed to all applicable software systems, and likewise whenever changes to existing user objects occur. The central issue is that there is no generally accepted standard for handling this information distribution, because each system has its own internal representation of the data. Our approach is to give a semantic definition of the digital user object's attributes in order to ease the mapping of an abstract user object to its concrete instantiation in each software system. To this end, we created an ontology that defines the mapping of user attributes, as well as an architecture that enables the semantic integration of identity data repositories. Our solution has been tested in an implementation case study.
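
    As a rough illustration of such an attribute mapping, the sketch below projects a canonical user object onto per-system schemas via declarative mapping tables. All attribute and system names are hypothetical; the paper's ontology-based mapping is richer and not shown here.

```python
# Illustrative sketch only: project an abstract user object onto the
# attribute schemas of individual target systems. All names here are
# hypothetical, not taken from the paper's ontology.

# Canonical (abstract) user object, keyed by ontology-level attribute names.
abstract_user = {
    "givenName": "Alice",
    "surname": "Smith",
    "mail": "alice.smith@example.org",
}

# Per-system mapping tables: ontology attribute -> system-specific attribute.
system_mappings = {
    "ldap": {"givenName": "givenName", "surname": "sn", "mail": "mail"},
    "crm": {"givenName": "first_name", "surname": "last_name", "mail": "email"},
}

def project(user: dict, mapping: dict) -> dict:
    """Instantiate the abstract user object for one target system."""
    return {target: user[source] for source, target in mapping.items()}

for system, mapping in system_mappings.items():
    print(system, project(abstract_user, mapping))
```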

    Distributed human computation framework for linked data co-reference resolution

    Distributed Human Computation (DHC) is a technique used to solve computational problems by incorporating the collaborative effort of a large number of humans. It is also a solution to AI-complete problems such as natural language processing. The Semantic Web, with its roots in AI, is envisioned to be a decentralised world-wide information space for sharing machine-readable data with minimal integration costs. Many research problems in the Semantic Web are considered AI-complete. An example is co-reference resolution, which involves determining whether different URIs refer to the same entity; this is considered a significant hurdle to overcome in the realisation of large-scale Semantic Web applications. In this paper, we propose a framework for building a DHC system on top of the Linked Data Cloud to solve various computational problems. To demonstrate the concept, we focus on co-reference resolution in the Semantic Web when integrating distributed datasets. The traditional way to solve this problem is to design machine-learning algorithms, but these are often computationally expensive, error-prone, and do not scale. We designed a DHC system named iamResearcher, which solves the author-identity co-reference problem for scientific publications when integrating distributed bibliographic datasets. In our system, we aggregated 6 million bibliographic records from various publication repositories. Users can sign up to the system to audit and align their own publications, thus solving the co-reference problem in a distributed manner. The aggregated results are published to the Linked Data Cloud.
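
    A minimal sketch of the underlying alignment step, under assumptions of mine rather than the paper's design: candidate author records are grouped by a cheap blocking key, and a human judgement (stubbed out below) promotes a candidate pair to an owl:sameAs link. All names and URIs are made up; iamResearcher's actual pipeline is not described in this excerpt.

```python
# Illustrative sketch of human-verified author co-reference resolution.
# All record names and URIs are hypothetical.
from collections import defaultdict

records = [
    {"uri": "http://repo-a.example.org/author/42", "name": "J. A. Smith"},
    {"uri": "http://repo-b.example.org/person/7", "name": "John A. Smith"},
    {"uri": "http://repo-b.example.org/person/9", "name": "Jane Doe"},
]

def block_key(name: str) -> str:
    """Cheap blocking key: lower-cased surname plus first initial."""
    parts = name.replace(".", "").split()
    return (parts[-1] + parts[0][0]).lower()

def human_confirms(a: dict, b: dict) -> bool:
    """Stand-in for the crowdsourced judgement: in a DHC system the
    candidate pair would be shown to the authors themselves."""
    return True  # placeholder decision

# Only records sharing a blocking key become candidate pairs.
blocks = defaultdict(list)
for rec in records:
    blocks[block_key(rec["name"])].append(rec)

same_as = []  # human-confirmed owl:sameAs links
for group in blocks.values():
    for i, a in enumerate(group):
        for b in group[i + 1:]:
            if human_confirms(a, b):
                same_as.append((a["uri"], b["uri"]))

print(same_as)
```

    Blocking keeps the number of pairs shown to humans manageable: only plausibly co-referent records (here, same surname and first initial) ever reach the confirmation step.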

    The CHAIN-REDS Semantic Search Engine

    e-Infrastructures, and in particular Data Repositories and Open Access Data Infrastructures, are essential platforms for e-Science and e-Research, and they have been built over several years both in Europe and in the rest of the world to support diverse multi/inter-disciplinary Virtual Research Communities. So far, however, it has been difficult for scientists to correlate papers with the datasets used to produce them and to discover data and documents in an easy way. In this paper, the CHAIN-REDS project's Knowledge Base and its Semantic Search Engine are presented, which attempt to address those drawbacks and contribute to the reproducibility of science.
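
    As a toy sketch of the paper-to-dataset correlation such a knowledge base enables, the snippet below builds a tiny in-memory RDF graph and queries it with SPARQL. The URIs and the usesDataset property are invented for illustration; the actual CHAIN-REDS schema is not shown in this abstract.

```python
# Illustrative sketch: a tiny knowledge base linking a paper to the
# dataset used to produce it, queried with SPARQL via rdflib.
# All URIs and the usesDataset property are hypothetical.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDFS

EX = Namespace("http://example.org/kb/")
g = Graph()

paper = EX["paper/123"]
dataset = EX["dataset/456"]
g.add((paper, RDFS.label, Literal("Example survey paper")))
g.add((paper, EX.usesDataset, dataset))
g.add((dataset, RDFS.label, Literal("Example observation archive")))

# Find the datasets used by one specific paper.
query = """
SELECT ?dsLabel WHERE {
    ?paper ex:usesDataset ?ds .
    ?ds rdfs:label ?dsLabel .
}
"""
for row in g.query(query, initNs={"ex": EX, "rdfs": RDFS},
                   initBindings={"paper": paper}):
    print(row.dsLabel)
```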

    Towards information profiling: data lake content metadata management

    There is currently a burst of Big Data (BD) processed and stored in huge raw data repositories, commonly called Data Lakes (DL). These data require new techniques of data integration and schema alignment in order to make them usable by their consumers and to discover the relationships linking their content. This can be provided by metadata services that discover and describe the content. However, a systematic approach to this kind of metadata discovery and management is currently lacking. Thus, we propose a framework for profiling the informational content stored in the DL, which we call information profiling. The profiles are stored as metadata to support data analysis. We formally define a metadata management process that identifies the key activities required to handle this effectively. We demonstrate alternative techniques and the performance of our process using a prototype implementation handling a real-life case study from the OpenML DL, which showcases the value and feasibility of our approach.
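
    A minimal sketch of column-level content profiling under assumptions of mine (the metrics and structure below are hypothetical; the paper's information-profiling process is more elaborate): compute simple statistics per attribute and keep them as metadata alongside the raw data.

```python
# Illustrative sketch: compute simple content profiles for a raw dataset
# and keep them as metadata. Field names and metrics are hypothetical.

rows = [
    {"age": 34, "city": "Barcelona"},
    {"age": 28, "city": "Barcelona"},
    {"age": None, "city": "Porto"},
]

def profile(rows: list) -> dict:
    """Per-column profile: inferred type, null ratio, distinct count."""
    meta = {}
    for col in rows[0]:
        values = [r[col] for r in rows]
        non_null = [v for v in values if v is not None]
        meta[col] = {
            "type": type(non_null[0]).__name__ if non_null else "unknown",
            "null_ratio": 1 - len(non_null) / len(values),
            "distinct": len(set(non_null)),
        }
    return meta

print(profile(rows))
```

    Profiles of this kind can then be compared across datasets to suggest schema alignments, for example by matching columns with similar types and value distributions.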

    Quality interoperability within digital libraries: the DL.org perspective

    Quality is the most dynamic aspect of DLs, and it becomes even more complex with respect to interoperability. This paper formalizes the research motivations and hypotheses on quality interoperability investigated by the Quality Working Group within the EU-funded project DL.org (http://www.dlorg.eu/). After presenting the multi-level interoperability framework adopted by DL.org, the authors illustrate key research points and approaches towards the interoperability of DL quality, grounding them in the DELOS Reference Model. By applying the DELOS Reference Model Quality Concept Map to their motivating interoperability scenario, the authors then present the two main research outcomes of their investigation: the Quality Core Model and the Quality Interoperability Survey.