547 research outputs found

    A systematic literature review on Wikidata

    This review surveys the current status of research on Wikidata, in particular articles that either describe applications of Wikidata or provide empirical evidence, in order to uncover the topics of interest, the fields that are benefiting from its applications, and the researchers and institutions that are leading the work

    Clover Quiz: a trivia game powered by DBpedia

    DBpedia is a large-scale and multilingual knowledge base generated by extracting structured data from Wikipedia. There have been several attempts to use DBpedia to generate questions for trivia games, but these initiatives have not succeeded in producing large, varied, and entertaining question sets. Moreover, latency is too high for an interactive game if questions are created by submitting live queries to the public DBpedia endpoint. These limitations are addressed in Clover Quiz, a turn-based multiplayer trivia game for Android devices with more than 200K multiple choice questions (in English and Spanish) about different domains generated out of DBpedia. Questions are created off-line through a data extraction pipeline and a versatile template-based mechanism. A back-end server manages the question set and the associated images, while a mobile app has been developed and released in Google Play. The game is available free of charge and has been downloaded by more than 5K users since its release in March 2017. Players have answered more than 614K questions and the overall rating of the game is 4.3 out of 5.0. Therefore, Clover Quiz demonstrates the advantages of semantic technologies for collecting data and automating the generation of multiple choice questions in a scalable way. Ministerio de Economía, Industria y Competitividad (Projects TIN2017-85179-C3-2-R and RESET TIN2014-53199-C3-2
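The off-line, template-based question mechanism described above can be sketched as follows. This is a minimal illustration, not the actual Clover Quiz pipeline: the facts are hypothetical triples of the kind one might extract from DBpedia in advance, and distractors for a multiple-choice question are drawn from other facts sharing the same property.

```python
import random

# Hypothetical (subject, property, object) facts, as might be extracted
# off-line from DBpedia; names are illustrative only.
FACTS = [
    ("Madrid", "capitalOf", "Spain"),
    ("Paris", "capitalOf", "France"),
    ("Rome", "capitalOf", "Italy"),
    ("Berlin", "capitalOf", "Germany"),
]

def make_question(fact, facts, n_choices=4, rng=None):
    """Instantiate a question template; distractors come from other
    facts with the same property, so wrong answers stay plausible."""
    rng = rng or random.Random(0)
    subject, prop, answer = fact
    distractors = [o for s, p, o in facts if p == prop and o != answer]
    choices = rng.sample(distractors, n_choices - 1) + [answer]
    rng.shuffle(choices)
    return {
        "text": f"{subject} is the capital of which country?",
        "choices": choices,
        "answer": answer,
    }

q = make_question(FACTS[0], FACTS)
```

Generating questions ahead of time like this avoids the endpoint-latency problem the abstract mentions: the game only ever reads from a pre-built question set.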

    Creation, Enrichment and Application of Knowledge Graphs

    The world is in constant change, and so is the knowledge about it. Knowledge-based systems - for example, online encyclopedias, search engines and virtual assistants - are thus faced with the constant challenge of collecting this knowledge and, beyond that, of understanding it and making it accessible to their users. Only if a knowledge-based system is capable of this understanding - that is, capable of more than just reading a collection of words and numbers without grasping their semantics - can it recognise relevant information and make it understandable to its users. The dynamics of the world play a unique role in this context: Events of various kinds which are relevant to different communities are shaping the world, with examples ranging from the coronavirus pandemic to the matches of a local football team. Vital questions arise when dealing with such events: How to decide which events are relevant, and for whom? How to model these events, to make them understood by knowledge-based systems? How is the acquired knowledge returned to the users of these systems? A well-established concept for making knowledge understandable by knowledge-based systems are knowledge graphs, which contain facts about entities (persons, objects, locations, ...) in the form of graphs, represent relationships between these entities and make the facts understandable by means of ontologies. This thesis considers knowledge graphs from three different perspectives: (i) Creation of knowledge graphs: Even though the Web offers a multitude of sources that provide knowledge about the events in the world, the creation of an event-centric knowledge graph requires recognition of such knowledge, its integration across sources and its representation. (ii) Knowledge graph enrichment: Knowledge of the world seems to be infinite, and it seems impossible to grasp it entirely at any time. 
Therefore, methods that autonomously infer new knowledge and enrich the knowledge graphs are of particular interest. (iii) Knowledge graph interaction: Even having all knowledge of the world available has no value in itself; it must be made accessible to humans. Based on knowledge graphs, systems can share their knowledge with their users, without demanding any conceptual understanding of knowledge graphs from them. For this to succeed, means for interaction with the knowledge are required that hide the knowledge graph below the surface. In concrete terms, I present EventKG - a knowledge graph that represents the happenings in the world in 15 languages - as well as Tab2KG - a method for understanding tabular data and transforming it into a knowledge graph. For the enrichment of knowledge graphs without any background knowledge, I propose HapPenIng, which infers missing events from the descriptions of related events. I demonstrate means for interaction with knowledge graphs using the example of two web-based systems (EventKG+TL and EventKG+BT) that enable users to explore the happenings in the world as well as the most relevant events in the lives of well-known personalities.
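The core representation the abstract builds on - facts about entities stored as triples, with an ontology assigning classes to entities - can be sketched minimally as below. The identifiers are illustrative and not taken from the actual EventKG model.

```python
# Minimal sketch of an event-centric knowledge graph: facts as
# (subject, predicate, object) triples plus a class assignment per
# entity, loosely in the spirit of the thesis described above.
class KnowledgeGraph:
    def __init__(self):
        self.triples = set()
        self.classes = {}  # entity -> ontology class

    def add(self, s, p, o):
        self.triples.add((s, p, o))

    def set_class(self, entity, cls):
        self.classes[entity] = cls

    def objects(self, s, p):
        """All objects o such that the fact (s, p, o) is in the graph."""
        return {o for s2, p2, o in self.triples if s2 == s and p2 == p}

kg = KnowledgeGraph()
kg.set_class("FIFA_World_Cup_2018", "Event")
kg.add("FIFA_World_Cup_2018", "location", "Russia")
kg.add("FIFA_World_Cup_2018", "startDate", "2018-06-14")
```

The class assignment is what lets a system treat "FIFA_World_Cup_2018" as an event rather than an opaque string; real systems would use an RDF store and a full ontology instead of a dictionary.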

    Enabling Complex Semantic Queries to Bioinformatics Databases through Intuitive Search Over Data

    Data integration promises to be one of the main catalysts in enabling new insights to be drawn from the wealth of biological data already available publicly. However, the heterogeneity of the existing data sources still poses significant challenges for achieving interoperability among biological databases. Furthermore, merely solving the technical challenges of data integration, for example through the use of common data representation formats, leaves open a larger problem: namely, the steep learning curve required for understanding the data models of each public source, as well as the technical language through which the sources can be queried and joined. As a consequence, most of the available biological data remain practically unexplored today. In this thesis, we address these problems jointly, by first introducing an ontology-based data integration solution in order to mitigate the data source heterogeneity problem. We illustrate through the concrete example of Bgee, a gene expression data source, how relational databases can be exposed as virtual Resource Description Framework (RDF) graphs, through relational-to-RDF mappings. This has the important advantage that the original data source can remain unmodified, while still becoming interoperable with external RDF sources. We complement our methods with applied case studies designed to guide domain experts in formulating expressive federated queries targeting the integrated data across the domains of evolutionary relationships and gene expression. More precisely, we introduce two comparative analyses, first within the same domain (using orthology data from multiple, interoperable, data sources) and second across domains, in order to study the relation between expression change and evolution rate following a duplication event. 
Finally, in order to bridge the semantic gap between users and data, we design and implement Bio-SODA, a question answering system over domain knowledge graphs that does not require training data for translating user questions to SPARQL. Bio-SODA uses a novel ranking approach that combines syntactic and semantic similarity, while also incorporating node centrality metrics to rank candidate matches for a given user question. Our results in testing Bio-SODA across several real-world databases that span multiple domains (both within and outside bioinformatics) show that it can answer complex, multi-fact queries, beyond the current state-of-the-art in the more well-studied open-domain question answering.
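The candidate-ranking idea described above - combining string-level similarity with a node-centrality prior - can be sketched as follows. The weighting scheme, the centrality scores, and the use of `difflib` for syntactic similarity are illustrative assumptions, not the published Bio-SODA method.

```python
import difflib

def rank_candidates(term, candidates, centrality, alpha=0.7):
    """Rank candidate graph nodes for a question term.

    Score = alpha * syntactic similarity + (1 - alpha) * centrality,
    so well-connected nodes are preferred among near-equal string matches.
    """
    def score(node):
        sim = difflib.SequenceMatcher(None, term.lower(), node.lower()).ratio()
        return alpha * sim + (1 - alpha) * centrality.get(node, 0.0)
    return sorted(candidates, key=score, reverse=True)

# Hypothetical centrality values for nodes in a domain knowledge graph.
centrality = {"Gene": 0.9, "GeneExpression": 0.6, "Genome": 0.4}
ranked = rank_candidates("gene", ["Genome", "GeneExpression", "Gene"], centrality)
```

A real system would substitute embedding-based semantic similarity for the string ratio and compute centrality (e.g. PageRank) over the actual graph, but the combination step looks the same.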

    Good Applications for Crummy Entity Linkers? The Case of Corpus Selection in Digital Humanities

    Over the last decade we have made great progress in entity linking (EL) systems, but performance may vary depending on the context and, arguably, there are even principled limitations preventing a "perfect" EL system. This also suggests that there may be applications for which current "imperfect" EL is already very useful, and makes finding the "right" application as important as building the "right" EL system. We investigate the Digital Humanities use case, where scholars spend a considerable amount of time selecting relevant source texts. We developed WideNet, a semantically enhanced search tool which leverages the strengths of (imperfect) EL without getting in the way of its expert users. We evaluate this tool in two historical case studies aiming to collect a set of references to historical periods in parliamentary debates from the last two decades; the first targeted the Dutch Golden Age, and the second World War II. The case studies conclude with a critical reflection on the utility of WideNet for this kind of research, after which we outline how such a real-world application can help to improve EL technology in general. Comment: Accepted for presentation at SEMANTiCS '1
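The corpus-selection use case - keeping a document when any of its (possibly noisy) entity links falls in an expert-curated target set - can be sketched like this. The entity identifiers and documents are illustrative, not WideNet's actual data.

```python
# Hypothetical target set of entities tied to a historical period,
# e.g. the Dutch Golden Age case study mentioned above.
TARGET_ENTITIES = {"Dutch_Golden_Age", "Rembrandt", "VOC"}

def select_documents(docs):
    """docs: list of (doc_id, linked_entities) pairs produced by an
    (imperfect) entity linking system; keep any doc that links at
    least one target entity."""
    return [doc_id for doc_id, entities in docs
            if TARGET_ENTITIES & set(entities)]

debates = [
    ("debate-001", ["Rembrandt", "Amsterdam"]),
    ("debate-002", ["World_War_II"]),
]
selected = select_documents(debates)
```

Because the filter only needs one correct link per relevant document, this kind of application tolerates the linker's imperfection, which is the paper's central observation.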

    Semantic Approach for Discovery and Visualization of Academic Information Structured with OAI-PMH

    There are different channels for communicating the results of scientific research; however, several research communities state that Open Access (OA) is the future of academic publishing. These Open Access platforms have adopted OAI-PMH (Open Archives Initiative - Protocol for Metadata Harvesting) as a standard for communication and interoperability. Nevertheless, it is significant to highlight that open-source knowledge discovery services based on an index of OA have not been developed. Therefore, it is necessary to address Knowledge Discovery (KD) within these platforms, aiming at students, teachers and/or researchers, to recover both the resources requested and the resources that are not explicitly requested but are also appropriate. This objective represents an important issue for resources structured under OAI-PMH, because interoperability with other developments carried out outside their implementation environment is generally not a priority (Level 1, "Shared term definitions"). It is here where the Semantic Web (SW) becomes a cornerstone of this work. Consequently, we propose OntoOAIV, a semantic approach for selective knowledge discovery and visualization of information structured with OAI-PMH, focused on supporting the activities of scientific or academic research for a specific user. Because of the academic nature of the resources structured with OAI-PMH, the field of application chosen is the context information of a student. Finally, in order to validate the proposed approach, we use the RUDAR (Roskilde University Digital Archive) and REDALYC (Red de Revistas Científicas de América Latina y el Caribe, España y Portugal) repositories, which implement the OAI-PMH protocol, as well as one student profile for carrying out KD
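Metadata exposed over OAI-PMH is harvested as XML, typically in the Dublin Core (`oai_dc`) format; extracting fields from such a response can be sketched as below. The sample XML is a hand-made minimal example, not an actual RUDAR or REDALYC response.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core elements nested inside an OAI-PMH record.
DC = "{http://purl.org/dc/elements/1.1/}"

# Minimal hand-made ListRecords response (illustrative content only).
SAMPLE = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <dc xmlns="http://purl.org/dc/elements/1.1/">
          <title>Semantic Discovery of Academic Information</title>
        </dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def harvest_titles(xml_text):
    """Collect every dc:title found anywhere in the response."""
    root = ET.fromstring(xml_text)
    return [t.text for t in root.iter(DC + "title")]

titles = harvest_titles(SAMPLE)
```

In a live harvester the XML would come from an HTTP request such as `?verb=ListRecords&metadataPrefix=oai_dc` against the repository's base URL, with the protocol's resumption tokens handling pagination.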

    Collaborative recommendations with content-based filters for cultural activities via a scalable event distribution platform

    Nowadays, most people have limited leisure time and the offer of (cultural) activities to spend this time is enormous. Consequently, picking the most appropriate events becomes increasingly difficult for end-users. This complexity of choice reinforces the necessity of filtering systems that assist users in finding and selecting relevant events. Whereas traditional filtering tools enable, e.g., keyword-based or filtered searches, innovative recommender systems draw on user ratings, preferences, and metadata describing the events. Existing collaborative recommendation techniques, developed for suggesting web-shop products or audio-visual content, have difficulties with sparse rating data and cannot cope at all with event-specific restrictions like availability, time, and location. Moreover, aggregating, enriching, and distributing these events are additional prerequisites for an optimal communication channel. In this paper, we propose a highly scalable event recommendation platform which considers event-specific characteristics. Personal suggestions are generated by an advanced collaborative filtering algorithm, which is more robust on sparse data by extending user profiles with presumable future consumptions. The events, which are described using an RDF/OWL representation of the EventsML-G2 standard, are categorized and enriched via smart indexing and linked open data sets. This metadata model enables additional content-based filters, which consider event-specific characteristics, on the recommendation list. The integration of these different functionalities is realized by a scalable and extendable bus architecture. Finally, focus group conversations were organized with external experts, cultural mediators, and potential end-users to evaluate the event distribution platform and investigate the possible added value of recommendations for cultural participation.
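The profile-extension idea above - making collaborative filtering more robust on sparse data by padding a user's profile with presumable future consumptions - can be sketched as follows. The ratings and the "copy the nearest neighbour" heuristic are illustrative assumptions, not the paper's actual algorithm.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two sparse rating dicts."""
    common = set(u) & set(v)
    num = sum(u[i] * v[i] for i in common)
    den = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return num / den if den else 0.0

def extend_profile(user, ratings):
    """Extend a sparse profile with the most similar user's ratings
    for events the user has not rated; own ratings take precedence."""
    others = {u: r for u, r in ratings.items() if u != user}
    nearest = max(others, key=lambda u: cosine(ratings[user], others[u]))
    extended = dict(ratings[nearest])
    extended.update(ratings[user])
    return extended

# Hypothetical sparse event ratings (1-5 scale).
ratings = {
    "alice": {"jazz_night": 5, "film_fest": 4},
    "bob":   {"jazz_night": 4, "film_fest": 5, "opera": 5},
    "carol": {"museum_tour": 2},
}
profile = extend_profile("alice", ratings)
```

The densified profile then gives the collaborative filter more overlap to work with; event-specific restrictions (time, location, availability) would be applied afterwards as content-based filters on the resulting list, as the abstract describes.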