12 research outputs found

    From many records to one graph: Heterogeneity conflicts in the Linked data restructuring cycle

    Introduction. During the last couple of years the library community has developed a number of comprehensive metadata standardization projects inspired by the idea of linked data, such as the BIBFRAME model. Linked data is a set of best-practice principles for publishing and exposing data on the Web using a graph-based data model powered with semantics and cross-domain relationships. In the light of traditional library metadata practices, the best practices of linked data imply a restructuring process from a collection of semi-structured bibliographic records to a semantic graph of unambiguously defined entities. A successful interlinking of entities in this graph to entities in external data sets requires a minimum level of semantic interoperability. Method. The examination is carried out through a review of the relevant research within the field and of the essential documents that describe the key concepts. Analysis. A high-level examination of the concepts of the Semantic Web and linked data is provided, with a particular focus on the challenges they entail for libraries and their metadata practices in the perspective of the extensive restructuring process that has already started. Conclusion. We demonstrate that a set of heterogeneity conflicts, threatening the level of semantic interoperability, can be associated with various phases of this restructuring process, from analysis and modelling to conversion and external interlinking. We also claim that these conflicts and their potential solutions are mutually dependent across the phases.

    Datasets Management as a Special Collection

    There are several dataset management challenges to be faced in the coming years. Incorporating datasets into special collections will be one of them. As well as format management, libraries with datasets will have to deal with issues such as rights management, interoperability, the adequacy of the selection for the end user, and findability. However, it must be recognized that incompatibilities could be solved through data auditing. The appropriate type of digital preservation strategy will have to be considered in order to maintain accessibility. This paper presents a review of the literature that discusses datasets in special collections.

    SPECIALIZING RDFS : SEE ALSO IN SEMANTIC WEB


    Local and Global Semantic Networks for the Representation of Music Information

    In the field of music informatics, multilayer representation formats are becoming increasingly important, since they enable an integrated and synchronized representation of the various entities that describe a piece of music, from the digital encoding of score symbols to its typographic aspects and audio recordings. Often these formats are based on the eXtensible Markup Language (XML), which allows information embedding, hierarchical structuring and interconnection within a single document. Simultaneously, the advent of the so-called Semantic Web is leading to the transformation of the World Wide Web into an environment where documents are associated with data and metadata. XML is also extensively used in the Semantic Web, since this format supports not only human-readable but also machine-readable tags. On the one hand, the Semantic Web aims to create a set of automatically detectable relationships among data, thus providing users with a number of non-trivial paths to navigate information in a geographically distributed framework; on the other hand, multilayer formats typically operate in a similar way, but at a “local” level. The goal of the present work is to discuss the possibilities emerging from a combined approach, namely adopting multilayer formats in the Semantic Web, addressing in particular augmented-reality applications. An XML-based international standard known as IEEE 1599 will be employed to show a number of innovative applications in music.

    Integrating Personal Web Data through Semantically Enhanced Web Portal

    Currently, the World Wide Web is mostly composed of isolated and loosely connected "data islands". Connecting them together and retrieving only the information that is of interest to the user is the common Web usage process. Creating an infrastructure that would support automation of that process, by aggregating and integrating Web data in accordance with the user's personal preferences, would greatly improve today's Web usage. A significant part of Web data is available only through login- and password-protected applications. As that data is very important for the usefulness of the described process, the proposed infrastructure needs to support authorized access to the user's personal data. In this paper we propose a semantically enhanced Web portal that presents a unique, personalized entry point for the user to domain-specific Web information. We also propose an identity management system that supports authorized access to the protected Web data. To verify the proposed solution, we have built Sweb, a semantically enhanced Web portal that uses the proposed identity management system.

    Linked Data based Health Information Representation, Visualization and Retrieval System on the Semantic Web

    Dissertation submitted in partial fulfillment of the requirements for the Degree of Master of Science in Geospatial Technologies. To better facilitate health information dissemination, using flexible ways to represent, query and visualize health data is becoming increasingly important. Semantic Web technologies, which provide a common framework by allowing data to be shared and reused between applications, can be applied to the management of health data. Linked open data, a new Semantic Web standard for publishing and linking heterogeneous data, allows not only humans but also machines to browse data in an unlimited way. Through a use case of World Health Organization HIV data for sub-Saharan Africa, a region severely affected by the HIV epidemic, this thesis built a linked-data-based health information representation, querying and visualization system. All the data was represented in RDF and interlinked with other related datasets that are already on the cloud. Overall, the system has more than 21,000 triples, with a SPARQL endpoint where users can download and use the data and a SPARQL query interface where users can submit different types of queries and retrieve the results. Additionally, it has a visualization interface where users can visualize the SPARQL results with a tool of their preference. Users who are not familiar with SPARQL queries can use the linked data search engine interface to search and browse the data. From this system we can conclude that current linked open data technologies have great potential to represent heterogeneous health data in a flexible and reusable manner, and that they can serve intelligent queries, which can support decision-making. However, in order to get the best from these technologies, improvements are needed both at the level of triple store performance and domain-specific ontological vocabularies.

    Cross-Lingual Entity Matching for Knowledge Graphs

    Multilingual knowledge graphs (KGs), such as YAGO and DBpedia, represent entities in different languages. The task of cross-lingual entity matching is to align entities in a source language with their counterparts in target languages. In this thesis, we investigate embedding-based approaches to encode entities from multilingual KGs into the same vector space, where equivalent entities are close to each other. Specifically, we apply graph convolutional networks (GCNs) to combine multi-aspect information about entities, including topological connections, relations, and attributes, to learn entity embeddings. To exploit the literal descriptions of entities expressed in different languages, we propose two uses of a pre-trained multilingual BERT model to bridge cross-lingual gaps. We further propose two strategies to integrate GCN-based and BERT-based modules to boost performance. Extensive experiments on two benchmark datasets demonstrate that our method significantly outperforms existing systems. We additionally introduce a new dataset comprising 15 low-resource languages and featuring unlinkable cases, to draw closer to real-world challenges.