1,057 research outputs found

    A Survey of the First 20 Years of Research on Semantic Web and Linked Data

    Get PDF
    International audienceThis paper is a survey of the research topics in the field of Semantic Web, Linked Data and Web of Data. This study looks at the contributions of this research community over its first twenty years of existence. Compiling several bibliographical sources and bibliometric indicators , we identify the main research trends and we reference some of their major publications to provide an overview of that initial period. We conclude with some perspectives for the future research challenges.Cet article est une étude des sujets de recherche dans le domaine du Web sémantique, des données liées et du Web des données. Cette étude se penche sur les contributions de cette communauté de recherche au cours de ses vingt premières années d'existence. En compilant plusieurs sources bibliographiques et indicateurs bibliométriques, nous identifions les principales tendances de la recherche et nous référençons certaines de leurs publications majeures pour donner un aperçu de cette période initiale. Nous concluons avec une discussion sur les tendances et perspectives de recherche

    Scalable Data Integration for Linked Data

    Get PDF
    Linked Data describes an extensive set of structured but heterogeneous datasources where entities are connected by formal semantic descriptions. In thevision of the Semantic Web, these semantic links are extended towards theWorld Wide Web to provide as much machine-readable data as possible forsearch queries. The resulting connections allow an automatic evaluation to findnew insights into the data. Identifying these semantic connections betweentwo data sources with automatic approaches is called link discovery. We derivecommon requirements and a generic link discovery workflow based on similaritiesbetween entity properties and associated properties of ontology concepts. Mostof the existing link discovery approaches disregard the fact that in times ofBig Data, an increasing volume of data sources poses new demands on linkdiscovery. In particular, the problem of complex and time-consuming linkdetermination escalates with an increasing number of intersecting data sources.To overcome the restriction of pairwise linking of entities, holistic clusteringapproaches are needed to link equivalent entities of multiple data sources toconstruct integrated knowledge bases. In this context, the focus on efficiencyand scalability is essential. For example, reusing existing links or backgroundinformation can help to avoid redundant calculations. However, when dealingwith multiple data sources, additional data quality problems must also be dealtwith. This dissertation addresses these comprehensive challenges by designingholistic linking and clustering approaches that enable reuse of existing links.Unlike previous systems, we execute the complete data integration workflowvia a distributed processing system. At first, the LinkLion portal will beintroduced to provide existing links for new applications. These links act asa basis for a physical data integration process to create a unified representationfor equivalent entities from many data sources. We then propose a holisticclustering approach to form consolidated clusters for same real-world entitiesfrom many different sources. At the same time, we exploit the semantic typeof entities to improve the quality of the result. The process identifies errorsin existing links and can find numerous additional links. Additionally, theentity clustering has to react to the high dynamics of the data. In particular,this requires scalable approaches for continuously growing data sources withmany entities as well as additional new sources. Previous entity clusteringapproaches are mostly static, focusing on the one-time linking and clustering ofentities from few sources. Therefore, we propose and evaluate new approaches for incremental entity clustering that supports the continuous addition of newentities and data sources. To cope with the ever-increasing number of LinkedData sources, efficient and scalable methods based on distributed processingsystems are required. Thus we propose distributed holistic approaches to linkmany data sources based on a clustering of entities that represent the samereal-world object. The implementation is realized on Apache Flink. In contrastto previous approaches, we utilize efficiency-enhancing optimizations for bothdistributed static and dynamic clustering. An extensive comparative evaluationof the proposed approaches with various distributed clustering strategies showshigh effectiveness for datasets from multiple domains as well as scalability on amulti-machine Apache Flink cluster

    Deliverable D9.3 Final Project Report

    Get PDF
    This document comprises the final report of LinkedTV. It includes a publishable summary, a plan for use and dissemination of foreground and a report covering the wider societal implications of the project in the form of a questionnaire

    Integration of Scientific Information through Linked Data : Preliminary Report

    Get PDF
    By implementing Linked Open Data principles throughout the scientific community, it is possible to make publications more visible and foster collaboration, both between universities, researching groups and partners. However, data should be curated and published, and a mature infrastructure needs to be provided to support it. In this work, we analyse the Linked Data’s weaknesses and propose an application prototype to evaluate the state-of-the-art methodologies and tools. As case of study, data from the authors’ researching groups will be publicly available in an attempt to integrate the scientific data of the Argentinian community which is an open issue. We plan to use this data to generate bottom-up methodological guidelines and thus enrich ontology-based conceptual models.Sociedad Argentina de Informática e Investigación Operativa (SADIO

    Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference

    Get PDF
    No abstract available

    Cross-Platform Text Mining and Natural Language Processing Interoperability - Proceedings of the LREC2016 conference

    Get PDF
    No abstract available
    • …
    corecore