9 research outputs found

    Uncovering the Semantics of Wikipedia Pagelinks


    Entity deduplication on ScholarlyData

    No full text
    ScholarlyData is the new and currently the largest reference linked dataset of the Semantic Web community about papers, people, organisations, and events related to its academic conferences. Originally started from the Semantic Web Dog Food (SWDF), it addressed multiple issues in data representation and maintenance by (i) adopting a novel data model and (ii) establishing an open-source workflow to support the addition of new data from the community. Nevertheless, the major issue with the current dataset is the presence of multiple URIs for the same entities, typically persons and organisations. In this work we: (i) perform entity deduplication on the whole dataset using supervised classification methods; (ii) devise a protocol to choose the most representative URI for an entity and deprecate duplicated ones, while ensuring backward compatibility for them; (iii) incorporate the automatic deduplication step into the general workflow to reduce the creation of duplicate URIs when adding new data. Our early experiments focused on person and organisation URIs, and the results show significant improvement over state-of-the-art solutions. On the entire dataset, we managed to consolidate over 100 pairs of duplicate person URIs and over 800 pairs of duplicate organisation URIs, together with their associated triples (over 1,800 and 5,000 respectively), significantly improving the overall quality and connectivity of the data graph. Integrated into the ScholarlyData data publishing workflow, we believe that this serves as a major step towards the creation of clean, high-quality scholarly linked data on the Semantic Web.
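    The pairwise-comparison step at the heart of such deduplication can be sketched as follows. This is an illustrative simplification, not the paper's pipeline: the URIs and labels are made up, and a fixed similarity threshold on a `difflib` string feature stands in for the trained supervised classifier.

```python
from difflib import SequenceMatcher
from itertools import combinations

def name_similarity(a: str, b: str) -> float:
    """Normalised edit-based similarity between two labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def candidate_duplicates(entities, threshold=0.75):
    """Compare every pair of (uri, label) records and flag likely duplicates.
    In a real workflow, a classifier trained on labelled URI pairs would
    replace this single-feature threshold."""
    pairs = []
    for (u1, l1), (u2, l2) in combinations(entities, 2):
        score = name_similarity(l1, l2)
        if score >= threshold:
            pairs.append((u1, u2, round(score, 2)))
    return pairs

# Hypothetical person records with two spellings of the same name.
people = [
    ("ex:person/anna-rossi", "Anna Rossi"),
    ("ex:person/a-rossi", "A. Rossi"),
    ("ex:person/john-smith", "John Smith"),
]
print(candidate_duplicates(people))
```

    Once a pair is flagged, the protocol described above would pick one URI as representative and deprecate the other while keeping it resolvable for backward compatibility.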

    SQuAP-Ont: An ontology of software quality relational factors from financial systems

    No full text
    Quality, architecture, and process are considered the keystones of software engineering, and ISO defines them in three separate standards. However, their interaction has scarcely been studied so far. The SQuAP model (Software Quality, Architecture, Process) describes twenty-eight main factors that impact software quality in banking systems; each factor is described as a relation among characteristics from the three ISO standards. Hence, SQuAP makes such relations emerge rigorously, although informally. In this paper, we present SQuAP-Ont, an OWL ontology designed by following a well-established methodology based on the reuse of Ontology Design Patterns (ODPs). SQuAP-Ont formalises the relations emerging from SQuAP to represent, and reason about via Linked Data, software engineering in a three-dimensional model consisting of quality, architecture, and process ISO characteristics.

    An ontology design pattern for representing recurrent events

    No full text
    In this paper we describe an Ontology Design Pattern for modeling events that recur regularly over time and share some invariant factors, which unify them conceptually. The proposed pattern is foundational, since it models the top-level, domain-independent concept of recurrence as applied to a series of events; we refer to this type of event as a recurrent event series. The pattern relies on existing patterns, i.e. Collection, Situation, Classification, and Sequence. Indeed, a recurrent event is represented both as a collection of events and as a situation in which these events are contextualized and unified according to one or more properties that are peculiar to each event and occur at regular intervals. We show how this pattern has been used in the context of ArCo, the Knowledge Graph of Italian cultural heritage, to model recurrent cultural events such as festivals and ceremonies.
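    The core idea of the pattern, a series that is both a collection of events and a unifying situation with an invariant recurrence interval, can be sketched in plain data structures. This is a loose analogy to the OWL pattern, not its actual axiomatisation; the class names and the example festival dates are illustrative.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

@dataclass
class Event:
    name: str
    starts: date

@dataclass
class RecurrentEventSeries:
    """Collection + Situation: member events share an invariant property
    (here, simply the name) and occur at a regular interval."""
    name: str
    interval: timedelta
    members: list = field(default_factory=list)

    def add_occurrence(self, starts: date) -> None:
        self.members.append(Event(self.name, starts))

    def is_regular(self) -> bool:
        """True when every gap between consecutive occurrences equals
        the declared recurrence interval."""
        gaps = {b.starts - a.starts
                for a, b in zip(self.members, self.members[1:])}
        return gaps <= {self.interval}

# Illustrative yearly occurrences of a recurrent cultural event.
festival = RecurrentEventSeries("Palio di Siena", timedelta(days=365))
for year in (2017, 2018, 2019):
    festival.add_occurrence(date(year, 7, 2))
print(festival.is_regular())
```

    In the ontology, the "collection" and "situation" roles would be played by the reused Collection and Situation patterns rather than by a single class as in this sketch.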

    Atlas of paths: A formal ontology of historical pathways in Italy

    No full text
    The Atlas of Paths project has two main goals: (i) the creation and implementation of an ontology network representing information contained in MiBACT's Atlante dei Cammini d'Italia and defining the concept of path; (ii) the design of a prototype for a modular software platform allowing the production of the Atlante Linked Open Data as foreseen in its ontological formalization.

    Predicting the results of evaluation procedures of academics

    No full text
    Background. The 2010 reform of the Italian university system introduced the National Scientific Habilitation (ASN) as a requirement for applying to permanent professor positions. Since the CVs of the 59,149 candidates and the results of their assessments have been made publicly available, the ASN constitutes an opportunity to perform analyses of a nation-wide evaluation process. Objective. The main goals of this paper are: (i) predicting the ASN results using the information contained in the candidates’ CVs; (ii) identifying a small set of quantitative indicators that can be used to perform accurate predictions. Approach. Semantic technologies are used to extract, systematize and enrich the information contained in the applicants’ CVs, and machine learning methods are used to predict the ASN results and to identify a subset of relevant predictors. Results. For predicting success in the role of associate professor, our best models using all predictors and the top 15 predictors make accurate predictions (F-measure values higher than 0.6) in 88% and 88.6% of the cases, respectively. Similar results have been achieved for the role of full professor. Evaluation. The proposed approach outperforms the other models developed to predict the results of researchers’ evaluation procedures. Conclusions. Such results allow the development of an automated system for supporting both candidates and committees in future ASN sessions and other scholars’ evaluation procedures.
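    The F-measure used to report prediction quality above combines precision and recall. As a minimal sketch (the outcome labels and toy data below are ours, not the actual ASN assessments), it can be computed as:

```python
def f_measure(y_true, y_pred, positive="habilitated"):
    """F-measure (F1) for a binary outcome: harmonic mean of
    precision and recall on the positive class."""
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy outcomes: 2 true positives, 1 false positive, 1 false negative.
truth = ["habilitated", "habilitated", "habilitated", "rejected", "rejected"]
predicted = ["habilitated", "habilitated", "rejected", "habilitated", "rejected"]
print(round(f_measure(truth, predicted), 3))  # precision = recall = 2/3
```

    An F-measure above 0.6, the threshold the paper uses to call a prediction accurate, thus requires reasonably balanced precision and recall rather than excelling at only one.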

    Mapping Keywords to Linked Data Resources for Automatic Query Expansion

    No full text
    Linked Data is a gigantic, constantly growing and extremely valuable resource, but its usage is still heavily dependent on (i) the familiarity of end users with RDF’s graph data model and its query language, SPARQL, and (ii) knowledge about available datasets and their contents. Intelligent keyword search over Linked Data is currently being investigated as a means to overcome these barriers to entry in a number of different approaches, including semantic search engines and the automatic conversion of natural language questions into structured queries. Our work addresses the specific challenge of mapping keywords to Linked Data resources, and proposes a novel method for this task. By exploiting the graph structure within Linked Data we determine which properties between resources are useful to discover, or directly express, semantic similarity. We also propose a novel scoring function to rank results. Experiments on a publicly available dataset show a 17% improvement in Mean Reciprocal Rank over the state of the art.
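    Mean Reciprocal Rank, the metric reported above, averages the reciprocal rank of the first relevant resource over all queries. A minimal sketch (the queries and DBpedia-style resource names are illustrative, not the paper's evaluation data):

```python
def mean_reciprocal_rank(rankings, relevant):
    """MRR: for each query take 1/rank of the first relevant result
    (0 if none is retrieved), then average over all queries."""
    total = 0.0
    for query, results in rankings.items():
        for rank, resource in enumerate(results, start=1):
            if resource in relevant[query]:
                total += 1.0 / rank
                break
    return total / len(rankings)

# Hypothetical keyword queries mapped to ranked candidate resources.
rankings = {
    "capital of italy": ["dbr:Rome", "dbr:Italy", "dbr:Milan"],
    "english physicist": ["dbr:Physics", "dbr:Isaac_Newton"],
}
relevant = {
    "capital of italy": {"dbr:Rome"},
    "english physicist": {"dbr:Isaac_Newton"},
}
print(mean_reciprocal_rank(rankings, relevant))  # (1/1 + 1/2) / 2 = 0.75
```

    Because only the first relevant hit per query counts, MRR rewards a ranking function that places the correct resource at or near the top, which is exactly what the proposed scoring function targets.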

    Automatic Typing of DBpedia Entities

    No full text