24 research outputs found

    Where the streets have known names

    Street names provide important insights into the local culture, history, and politics of places. Linked open data provide a wealth of knowledge that can be associated with street names, enabling novel ways to explore cultural geographies. This paper presents a three-fold contribution. We present (1) a technique to establish a correspondence between street names and the entities they refer to. The method is based on Wikidata, a knowledge base derived from Wikipedia. The accuracy of this mapping is evaluated on a sample of streets in Rome. Since this approach achieves only limited coverage, we propose to tap local knowledge with (2) a simple web platform: users can select the best correspondence from the calculated ones or add another entity not discovered by the automated process. As a result, we design (3) an enriched OpenStreetMap web map in which each street name can be explored in terms of the properties of its associated entity. Through several filters, this tool is a first step towards the interactive exploration of toponymy, showing how open data can reveal facets of the cultural texture that pervades places.
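    The entity-linking step described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's actual method: the Italian odonym prefix list and the use of Wikidata's public `wbsearchentities` endpoint are assumptions for the sake of the example.

```python
import re

# Common Italian street-type prefixes; the bare remainder is used as the
# candidate entity label to look up in Wikidata (assumption: this mirrors
# the paper's mapping step, whose exact rules are not given in the abstract).
PREFIXES = re.compile(
    r"^(via|viale|vicolo|piazza|piazzale|largo|corso)\s+",
    re.IGNORECASE,
)

def street_to_label(street_name: str) -> str:
    """Strip the generic street-type prefix, keeping the specific part."""
    return PREFIXES.sub("", street_name).strip()

def wikidata_candidates(label: str, lang: str = "it"):
    """Query the public wbsearchentities endpoint for matching entities.

    Requires network access; returns a list of (QID, description) pairs.
    """
    import requests  # third-party dependency, only needed for live lookups
    r = requests.get(
        "https://www.wikidata.org/w/api.php",
        params={"action": "wbsearchentities", "search": label,
                "language": lang, "format": "json"},
        timeout=10,
    )
    return [(e["id"], e.get("description", "")) for e in r.json()["search"]]

print(street_to_label("Via Giuseppe Garibaldi"))  # Giuseppe Garibaldi
```

    A crowdsourcing layer such as the paper's web platform would then present the returned candidates to users for confirmation, which is how the limited coverage of the automatic step is addressed.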

    Using Semantic Technologies in Digital Libraries - A Roadmap to Quality Evaluation

    In digital libraries, semantic techniques are often deployed to reduce the expensive manual overhead of indexing documents, maintaining metadata, or caching for future search. However, using such techniques may cause a decrease in a collection's quality due to their statistical nature. Since data quality is a major concern in digital libraries, it is important to be able to measure the (loss of) quality of metadata automatically generated by semantic techniques. In this paper we present a user study based on a typical semantic technique use case.

    RuBQ: A Russian Dataset for Question Answering over Wikidata

    The paper presents RuBQ, the first Russian knowledge base question answering (KBQA) dataset. The high-quality dataset consists of 1,500 Russian questions of varying complexity, their English machine translations, SPARQL queries to Wikidata, reference answers, as well as a Wikidata sample of triples containing entities with Russian labels. The dataset creation started with a large collection of question-answer pairs from online quizzes. The data underwent automatic filtering, crowd-assisted entity linking, automatic generation of SPARQL queries, and their subsequent in-house verification. The freely available dataset will be of interest to a wide community of researchers and practitioners in the areas of Semantic Web, NLP, and IR, especially those working on multilingual question answering. The proposed dataset generation pipeline proved to be efficient and can be employed in other data annotation projects. © 2020, Springer Nature Switzerland AG.

    We thank Mikhail Galkin, Svitlana Vakulenko, Daniil Sorokin, Vladimir Kovalenko, Yaroslav Golubev, and Rishiraj Saha Roy for their valuable comments and fruitful discussion of the paper draft. We also thank Pavel Bakhvalov, who helped collect the RuWikidata8M sample and contributed to the first version of the entity linking tool. We are grateful to Yandex.Toloka for their data annotation grant. PB acknowledges support by the Ural Mathematical Center under agreement No. 075-02-2020-1537/1 with the Ministry of Science and Higher Education of the Russian Federation.
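    The shape of such a KBQA dataset entry can be illustrated with a hypothetical record. The field names are illustrative, not RuBQ's actual schema, and the Wikidata QIDs shown are believed correct but serve only as an example.

```python
# One hypothetical RuBQ-style record: a Russian question, its English machine
# translation, a SPARQL query over Wikidata, and the reference answer entity.
record = {
    "question_ru": "Кто написал роман «Война и мир»?",
    "question_en": "Who wrote the novel War and Peace?",
    # P50 is Wikidata's 'author' property; the subject QID is illustrative.
    "sparql": (
        "SELECT ?author WHERE { "
        "wd:Q161531 wdt:P50 ?author . "   # War and Peace -> author
        "}"
    ),
    "answers": ["Q7243"],  # Leo Tolstoy
}

print(record["question_en"], "->", record["answers"][0])
```

    Shipping the query alongside the answer, as the dataset does, lets systems be evaluated both on answer accuracy and on the intermediate semantic parse.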

    Software Testing Techniques Revisited for OWL Ontologies

    Ontologies are an essential component of semantic knowledge bases and applications, and nowadays they are used in a plethora of domains. Despite the maturity of ontology languages, support tools, and engineering techniques, the testing and validation of ontologies is a field that still lacks consolidated approaches and tools. This paper attempts to partly bridge that gap, taking a first step towards the extension of some traditional software testing techniques to ontologies expressed in a widely used format. Mutation testing and coverage testing, revisited in the light of the peculiar features of the ontology language and structure, can assist in designing better test suites to validate ontologies, and overall help in the engineering and refinement of ontologies and software based on them.
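    The idea of mutation testing carried over to ontologies can be sketched on a toy subclass hierarchy. This is a deliberately simplified stand-in for OWL (a real implementation would use an OWL library and reasoner), and the single mutation operator shown, axiom deletion, is just one of many the paper's approach would consider.

```python
# Toy ontology as (subject, predicate, object) axioms.
ontology = {
    ("Dog", "subClassOf", "Mammal"),
    ("Mammal", "subClassOf", "Animal"),
}

def entails_subclass(axioms, sub, sup):
    """Naive transitive-closure check for subClassOf entailment."""
    if sub == sup:
        return True
    return any(
        p == "subClassOf" and s == sub and entails_subclass(axioms, o, sup)
        for s, p, o in axioms
    )

def mutants(axioms):
    """Mutation operator (illustrative): drop one axiom at a time."""
    for ax in axioms:
        yield axioms - {ax}

# A test suite 'kills' a mutant if some expected entailment fails on it;
# the mutation score measures how thoroughly the suite probes the ontology.
tests = [("Dog", "Animal"), ("Mammal", "Animal")]
killed = sum(
    any(not entails_subclass(m, s, o) for s, o in tests)
    for m in mutants(ontology)
)
print(f"mutation score: {killed}/{len(ontology)}")
```

    A surviving mutant would signal an entailment no test exercises, pointing at where the test suite should be strengthened, which is exactly the feedback mutation testing provides for conventional software.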

    ECCM15 - Preparation and Characterization of Poly(ethylene oxide)/Lithium Montmorillonite Composites

    Poly(ethylene oxide)/lithium montmorillonite (PEO/LiMMT) composites were prepared by mixing and ultrasonicating PEO and LiMMT in water. Characterization of the PEO/LiMMT composites was performed by differential scanning calorimetry (DSC), Fourier transform infrared spectroscopy (FT-IR), and non-isothermal thermogravimetry (TGA). DSC and TGA reveal that higher loadings of LiMMT significantly influence the crystallinity, melting temperature, and thermal stability of PEO. The glass transition temperature of PEO does not change with the addition of LiMMT. FT-IR analysis shows that the helical structure of the PEO chains is distorted in the PEO/LiMMT composites. Changes in activation energy in the composites compared to pure PEO indicate possible changes in the mechanism of the non-isothermal degradation of PEO due to the addition of LiMMT.

    Introduction: Poly(ethylene oxide) (PEO) as a solid polymer electrolyte in lithium polymer batteries has many advantages over its liquid counterparts or organic solutions due to its ease of processing, stable electrochemical characteristics, and excellent mechanical properties.

    Structural Properties as Proxy for Semantic Relevance in RDF Graph Sampling

    The Linked Data cloud has grown to become the largest knowledge base ever constructed. Its size is now turning into a major bottleneck for many applications. In order to facilitate access to this structured information, this paper proposes an automatic sampling method targeted at maximizing answer coverage for applications using SPARQL querying. The approach presented in this paper is novel: no similar RDF sampling approach exists. Additionally, the concept of creating a sample aimed at maximizing SPARQL answer coverage is unique. We empirically show that the relevance of triples for sampling (a semantic notion) is influenced by the topology of the graph (a purely structural one), and can be determined without prior knowledge of the queries. Experiments show a significantly higher recall of topology-based sampling methods over random and naive baseline approaches (e.g. up to 90% for Open-BioMed at a sample size of 6%).
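    A minimal sketch of topology-driven triple sampling follows, using node degree as the structural proxy. This is only one plausible topology measure chosen for illustration; the abstract does not specify which structural properties the paper evaluates.

```python
from collections import Counter

# Toy RDF graph as (subject, predicate, object) triples.
triples = [
    ("a", "knows", "b"), ("a", "knows", "c"), ("b", "knows", "c"),
    ("c", "type", "Person"), ("d", "type", "Person"), ("d", "knows", "e"),
]

# Degree of each node (subjects and objects only; predicates are ignored).
degree = Counter()
for s, _, o in triples:
    degree[s] += 1
    degree[o] += 1

def sample(triples, fraction):
    """Keep the top-scoring fraction of triples, scoring a triple by the
    summed degree of its endpoints, so triples touching hub nodes come first."""
    ranked = sorted(
        triples,
        key=lambda t: degree[t[0]] + degree[t[2]],
        reverse=True,
    )
    k = max(1, int(len(ranked) * fraction))
    return ranked[:k]

top = sample(triples, 0.5)
print(top)
```

    The intuition matching the paper's finding is that triples incident to high-degree nodes are touched by disproportionately many queries, so keeping them first tends to preserve answer coverage at small sample sizes.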

    Uncovering the semantics of Wikipedia categories

    The Wikipedia category graph serves as the taxonomic backbone for large-scale knowledge graphs like YAGO or Probase, and has been used extensively for tasks like entity disambiguation or semantic similarity estimation. Wikipedia's categories are a rich source of taxonomic as well as non-taxonomic information. The category 'German science fiction writers', for example, encodes the type of its resources (Writer), as well as their nationality (German) and genre (Science Fiction). Several approaches in the literature make use of fractions of this encoded information without exploiting its full potential. In this paper, we introduce an approach for the discovery of category axioms that uses information from the category network, category instances, and their lexicalisations. With DBpedia as background knowledge, we discover 703k axioms covering 502k of Wikipedia's categories and populate the DBpedia knowledge graph with an additional 4.4M relation assertions and 3.3M type assertions at more than 87% and 90% precision, respectively.
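    The decomposition of a category label into a type and attribute assertions can be sketched with toy lookup tables. The hard-coded lists below are assumptions made for the example; the actual approach derives such mappings from DBpedia background knowledge and learned lexicalisations rather than from fixed dictionaries.

```python
# Toy lexicalisation tables (illustrative only).
NATIONALITIES = {"German": "Germany", "French": "France"}
GENRES = {"science fiction": "Science Fiction"}

def parse_category(label: str):
    """Split a category label into a head type and modifier-derived facts,
    e.g. 'German science fiction writers' -> Writer + nationality + genre."""
    tokens = label.split()
    head = tokens[-1].rstrip("s").capitalize()      # 'writers' -> 'Writer'
    rest = " ".join(tokens[:-1])
    facts = {}
    for adj, country in NATIONALITIES.items():
        if adj in tokens:
            facts["nationality"] = country
            rest = rest.replace(adj, "").strip()
    for phrase, genre in GENRES.items():
        if phrase in rest.lower():
            facts["genre"] = genre
    return head, facts

print(parse_category("German science fiction writers"))
```

    Applied to a category's member pages, each recovered fact becomes a candidate relation assertion (e.g. nationality: Germany) and the head becomes a candidate type assertion, which is the kind of output the paper adds to DBpedia at scale.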