13 research outputs found

    Challenges as enablers for high quality linked data: Insights from the semantic publishing challenge

    While most challenges organized so far in the Semantic Web domain focus on comparing tools with respect to criteria such as their features and competencies, or on exploiting semantically enriched data, the Semantic Web Evaluation Challenges series, co-located with the ESWC Semantic Web Conference, aims to compare them based on their output, namely the produced dataset. The Semantic Publishing Challenge is one of these challenges. Its goal is to involve participants in extracting data from heterogeneous sources on scholarly publications, and producing Linked Data that can be exploited by the community itself. This paper reviews lessons learned from both (i) the overall organization of the Semantic Publishing Challenge, regarding the definition of the tasks, building the input dataset and forming the evaluation, and (ii) the results produced by the participants, regarding the proposed approaches, the tools used, the preferred vocabularies and the results produced in the three editions of 2014, 2015 and 2016. We compared these lessons to other Semantic Web Evaluation Challenges. In this paper, we (i) distill best practices for organizing such challenges that could be applied to similar events, and (ii) report observations on Linked Data publishing derived from the submitted solutions. We conclude that higher quality may be achieved when Linked Data is produced as a result of a challenge, because the competition becomes an incentive, while solutions become better with respect to Linked Data publishing best practices when they are evaluated against the rules of the challenge.

    Extracting knowledge from text using SHELDON, a Semantic Holistic framEwork for LinkeD ONtology data

    SHELDON is the first true hybridization of NLP machine reading and the Semantic Web. It extracts RDF data from text using a machine reader: the extracted RDF graphs are compliant with Semantic Web and Linked Data. It goes further and applies Semantic Web practices and technologies to extend the current human-readable web. The input is represented by a sentence in any language. SHELDON includes different capabilities in order to extend machine reading to Semantic Web data: frame detection, topic extraction, named entity recognition, resolution and coreference, terminology extraction, sense tagging and disambiguation, taxonomy induction, semantic role labeling, type induction, sentiment analysis, citation inference, relation and event extraction, and visualization tools that make use of the JavaScript InfoVis Toolkit and RelFinder. A demo of SHELDON can be seen and used at http://wit.istc.cnr.it/stlab-tools/sheldon

    Executing, Comparing, and Reusing Linked Data-Based Recommendation Algorithms With the Allied Framework

    Data published on the Web following the Linked Data principles has resulted in a global data space called the Web of Data. These principles have led to semantically interlinking and connecting different resources at the data level, regardless of their structure, authoring, location, etc. The tremendous and continuous growth of the Web of Data also implies that it is now more likely to find resources that describe real-life concepts. However, discovering and recommending relevant related resources is still an open research area. This chapter studies recommender systems that use Linked Data as a source containing a significant amount of available resources and their relationships, which are useful for producing recommendations. Furthermore, it also presents a framework to deploy and execute state-of-the-art algorithms for Linked Data that have been re-implemented to measure and benchmark them in different application domains and without being bound to a unique dataset.

    Introducing linked open data in graph-based recommender systems

    Thanks to the recent spread of the Linked Open Data (LOD) initiative, a huge amount of machine-readable knowledge encoded as RDF statements is today available in the so-called LOD cloud. Accordingly, a big effort is now spent to investigate to what extent such information can be exploited to develop new knowledge-based services or to improve the effectiveness of knowledge-intensive platforms such as Recommender Systems (RS). To this end, in this article we study the impact of the exogenous knowledge coming from the LOD cloud on the overall performance of a graph-based recommendation framework. Specifically, we propose a methodology to automatically feed a graph-based RS with features gathered from the LOD cloud and we analyze the impact of several widespread feature selection techniques in such recommendation settings. The experimental evaluation, performed on three state-of-the-art datasets, provided several outcomes: first, information extracted from the LOD cloud can significantly improve the performance of a graph-based RS. Next, experiments showed a clear correlation between the choice of the feature selection technique and the ability of the algorithm to maximize specific evaluation metrics, such as accuracy or diversity of the recommendations. Moreover, our graph-based algorithm fed with LOD-based features was able to outperform several baselines, such as collaborative filtering and matrix factorization.

    Neural Approaches to Relational Aspect-Based Sentiment Analysis. Exploring generalizations across words and languages

    Jebbara S. Neural Approaches to Relational Aspect-Based Sentiment Analysis. Exploring generalizations across words and languages. Bielefeld: Universität Bielefeld; 2020. Every day, vast amounts of unstructured, textual data are shared online in digital form. Websites such as forums, social media sites, review sites, blogs, and comment sections offer platforms to express and discuss opinions and experiences. Understanding the opinions in these resources is valuable, for example, for businesses to support market research and customer service, but also for individuals, who can benefit from the experiences and expertise of others. In this thesis, we approach the topic of opinion extraction and classification with neural network models. We regard this area of sentiment analysis as a relation extraction problem in which the sentiment of some opinion holder towards a certain aspect of a product, theme, or event needs to be extracted. In accordance with this framework, our main contributions are the following: 1. We propose a full system addressing all subtasks of relational sentiment analysis. 2. We investigate how semantic web resources can be leveraged in a neural-network-based model for the extraction of opinion targets and the classification of sentiment labels. Specifically, we experiment with enhancing pretrained word embeddings using the lexical resource WordNet. Furthermore, we enrich a purely text-based model with SenticNet concepts and observe an improvement for sentiment classification. 3. We examine how opinion targets can be automatically identified in noisy texts. Customer reviews, for instance, are prone to contain misspelled words and are difficult to process due to their domain-specific language. We integrate information about the character structure of a word into a sequence labeling system using character-level word embeddings and show their positive impact on the system's performance. We reveal encoded character patterns of the learned embeddings and give a nuanced view of the obtained performance differences. 4. Opinion target extraction usually relies on supervised learning approaches. We address the lack of available annotated data for specific languages by proposing a zero-shot cross-lingual approach for the extraction of opinion target expressions. We leverage multilingual word embeddings that share a common vector space across various languages and incorporate these into a convolutional neural network architecture. Our experiments with 5 languages give promising results: we can successfully train a model on annotated data of a source language and perform accurate prediction on a target language without ever using any annotated samples in that target language.

    A Personal Research Agent for Semantic Knowledge Management of Scientific Literature

    The unprecedented rate of scientific publications is a major threat to the productivity of knowledge workers, who rely on scrutinizing the latest scientific discoveries for their daily tasks. Online digital libraries, academic publishing databases and open access repositories grant access to a plethora of information that can overwhelm a researcher who is looking to obtain fine-grained knowledge relevant for her task at hand. This overload of information has encouraged researchers from various disciplines to look for new approaches in extracting, organizing, and managing knowledge from the immense amount of available literature in ever-growing repositories. In this dissertation, we introduce a Personal Research Agent that can help scientists in discovering, reading and learning from scientific documents, primarily in the computer science domain. We demonstrate how a confluence of techniques from the Natural Language Processing and Semantic Web domains can construct a semantically-rich knowledge base, based on an inter-connected graph of scholarly artifacts, effectively transforming scientific literature from written content in isolation into a queryable web of knowledge, suitable for machine interpretation. The challenges of creating an intelligent research agent are manifold: The agent's knowledge base, analogous to its 'brain', must contain accurate information about the knowledge 'stored' in documents. It also needs to know about its end-users' tasks and background knowledge. In our work, we present a methodology to extract the rhetorical structure (e.g., claims and contributions) of scholarly documents. We enhance our approach with entity linking techniques that allow us to connect the documents with the Linked Open Data (LOD) cloud, in order to enrich them with additional information from the web of open data. Furthermore, we devise a novel approach for automatic profiling of scholarly users, thereby enabling the agent to personalize its services based on a user's background knowledge and interests. We demonstrate how we can automatically create a semantic vector-based representation of the documents and user profiles and utilize them to efficiently detect similar entities in the knowledge base. Finally, as part of our contributions, we present a complete architecture providing an end-to-end workflow for the agent to exploit the opportunities of linking a formal model of scholarly users and scientific publications.

    Exploring semantic relationships in the web of data
