
    FootbOWL: Using a generic ontology of football competition for planning match summaries

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-21034-1_16. Proceedings of the 8th Extended Semantic Web Conference, ESWC 2011, Heraklion, Crete, Greece, May 29-June 2, 2011. We present a two-layer OWL ontology-based Knowledge Base (KB) that allows for flexible content selection and discourse structuring in Natural Language Generation (NLG) and discuss its use for these two tasks. The first layer of the ontology is an application-independent base ontology; it models the domain and was not designed with NLG in mind. The second layer, added on top of the base ontology, models entities and events that can be inferred from the base ontology, including inferable logico-semantic relations between individuals. The nodes in the KB are weighted according to learnt models of content selection, such that a subset of them can be extracted. The extraction is done using templates that also consider semantic relations between the nodes and a simple user profile. The discourse structuring submodule maps the semantic relations to discourse relations, forms discourse units, and arranges them into a coherent discourse graph. The approach is illustrated and evaluated on a KB that models the First Spanish Football League.
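    The weight-based content selection described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the node names, weights, and the single Elaboration relation are invented for the example.

```python
# Sketch of content selection over a weighted KB: keep nodes whose learnt
# weight passes a threshold, then pull in nodes linked to a selected node
# by a logico-semantic relation. All identifiers below are hypothetical.

kb_nodes = {
    "goal_minute_12": 0.91,          # base-layer event, learnt weight
    "yellow_card_minute_30": 0.40,
    "substitution_minute_60": 0.25,
    "winning_goal": 0.95,            # inferred second-layer node
}

# Inferred logico-semantic relations between individuals
relations = [
    ("winning_goal", "Elaboration", "goal_minute_12"),
]

def select_content(nodes, relations, threshold=0.5):
    """Select nodes above the weight threshold, then add nodes reachable
    from a selected node via a semantic relation."""
    selected = {n for n, w in nodes.items() if w >= threshold}
    for src, _rel, tgt in relations:
        if src in selected:
            selected.add(tgt)
    return selected

print(sorted(select_content(kb_nodes, relations)))
# → ['goal_minute_12', 'winning_goal']
```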

    Scalable algorithms for Semantic Web data management on cloud platforms

    In order to build smart systems, where machines are able to reason like humans, data with semantics is a major requirement. This need led to the advent of the Semantic Web, which proposes standard ways of representing and querying data with semantics. RDF is the prevalent data model used to describe web resources, and SPARQL is the query language for expressing queries over RDF data. The ability to store and query data with semantics triggered the development of many RDF data management systems. The rapid evolution of the Semantic Web caused a shift from centralized data management systems to distributed ones. The first systems to appear relied on P2P and client-server architectures, while recently the focus has moved to cloud computing.
    Cloud computing environments have strongly impacted research and development in distributed software platforms. Cloud providers offer distributed, shared-nothing infrastructures that may be used for data storage and processing. The main features of cloud computing are scalability, fault tolerance, and elastic allocation of computing and storage resources following the needs of the users.
    This thesis investigates the design and implementation of scalable algorithms and systems for cloud-based Semantic Web data management. In particular, we study the performance and cost of exploiting commercial cloud infrastructures to build Semantic Web data repositories, and the optimization of SPARQL queries for massively parallel frameworks. First, we introduce the basic concepts of the Semantic Web and the main components and frameworks interacting in massively parallel cloud-based systems. In addition, we provide an extended overview of existing RDF data management systems in centralized and distributed settings, with emphasis on the critical concepts of storage, indexing, query optimization, and infrastructure. Second, we present AMADA, an architecture for RDF data management using public cloud infrastructures. We follow the Software as a Service (SaaS) model, where the complete platform runs in the cloud and appropriate APIs are provided to end users for storing and retrieving RDF data. We explore various storage and querying strategies, revealing pros and cons with respect to performance and also to monetary cost, an important new dimension to consider in public cloud services. Finally, we present CliqueSquare, a distributed RDF data management system built on top of Hadoop, incorporating a novel optimization algorithm that is able to produce massively parallel plans for SPARQL queries. We present a family of optimization algorithms, relying on n-ary (star) equality joins to build flat plans, and compare their ability to find the flattest plans possible. Inspired by existing partitioning and indexing techniques, we present a generic storage strategy suitable for storing RDF data in HDFS (Hadoop's Distributed File System). Our experimental results validate the efficiency and effectiveness of the optimization algorithm and demonstrate the overall performance of the system.
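    The grouping of triple patterns into n-ary star joins, which underlies flat query plans, can be sketched as follows. This is an illustrative reconstruction of the idea, not CliqueSquare's actual optimizer; the example query is invented.

```python
from collections import defaultdict

# Sketch of star detection: triple patterns that share a variable can be
# combined into one n-ary equality join on that variable, which keeps the
# resulting plan flat (low height) rather than a deep chain of binary joins.

def star_groups(triple_patterns):
    """Group triple patterns by each variable they mention; variables are
    terms starting with '?'. Variables shared by >= 2 patterns are
    candidates for an n-ary star join."""
    groups = defaultdict(list)
    for tp in triple_patterns:
        for term in tp:
            if term.startswith("?"):
                groups[term].append(tp)
    return {v: tps for v, tps in groups.items() if len(tps) >= 2}

# Hypothetical SPARQL basic graph pattern as (subject, predicate, object) tuples
query = [
    ("?x", "rdf:type", ":Article"),
    ("?x", ":author", "?y"),
    ("?y", ":affiliation", "?z"),
]

stars = star_groups(query)
print({v: len(tps) for v, tps in stars.items()})  # → {'?x': 2, '?y': 2}
```

Each star can then be evaluated as a single multi-way join, so the whole query needs only two join levels instead of a left-deep chain.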

    ELLIS: Interactive exploration of Linked Data on the level of induced schema patterns

    We present ELLIS, a demo to browse the Linked Data cloud on the level of induced schema patterns. To this end, we define schema-level patterns of RDF types and properties to identify how entities described by type sets are connected by property sets. We show that schema-level patterns can be aggregated and extracted from large Linked Data sets using efficient algorithms for mining frequent item sets. A subsequent visualisation of such patterns enables users to quickly understand which type of information is modelled on the Linked Data cloud and how this information is interconnected.
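    The core of schema-level pattern extraction can be sketched as follows: for each subject, collect its rdf:type set and its outgoing property set, then count how often each (type set, property set) combination occurs. ELLIS's frequent-item-set mining is more elaborate; the triples below are invented for illustration.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

def schema_patterns(triples):
    """Aggregate (type set, property set) patterns over all subjects.
    Returns a Counter keyed by (frozenset of types, frozenset of properties)."""
    types, props = {}, {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)
        else:
            props.setdefault(s, set()).add(p)
    patterns = Counter()
    for s in set(types) | set(props):
        key = (frozenset(types.get(s, ())), frozenset(props.get(s, ())))
        patterns[key] += 1
    return patterns

triples = [
    ("e1", "rdf:type", ":Person"), ("e1", ":knows", "e2"),
    ("e2", "rdf:type", ":Person"), ("e2", ":knows", "e1"),
]
pats = schema_patterns(triples)
print(pats[(frozenset({":Person"}), frozenset({":knows"}))])  # → 2
```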

    LiteMat: a scalable, cost-efficient inference encoding scheme for large RDF graphs

    The number of linked data sources and the size of the linked open data graph keep growing every day. As a consequence, semantic RDF services are more and more confronted with various "big data" problems. Query processing in the presence of inferences is one of them. For instance, to complete the answer set of SPARQL queries, RDF database systems evaluate semantic RDFS relationships (subPropertyOf, subClassOf) through time-consuming query rewriting algorithms or space-consuming data materialization solutions. To reduce the memory footprint and ease the exchange of large datasets, these systems generally apply a dictionary approach, compressing triple data sizes by replacing resource identifiers (IRIs), blank nodes, and literals with integer values. In this article, we present a structured resource identification scheme using a clever encoding of concept and property hierarchies for efficiently evaluating the main common RDFS entailment rules while minimizing triple materialization and query rewriting. We show how this encoding can be computed by a scalable parallel algorithm and directly implemented over the Apache Spark framework. The efficiency of our encoding scheme is emphasized by an evaluation conducted over both synthetic and real-world datasets.
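    The kind of hierarchy-aware identifier scheme the abstract refers to can be illustrated with a prefix-based encoding: each class ID embeds its parent's ID as a bit prefix, so a subClassOf test becomes an integer comparison instead of query rewriting or triple materialization. The bit layout below is an assumption for illustration, not necessarily LiteMat's exact scheme.

```python
# Hypothetical prefix encoding of a class hierarchy. A fixed number of bits
# per level stores a sibling index; a class C is a subclass of D iff D's
# code is a bit prefix of C's code.

def encode(hierarchy, root, bits_per_level=2):
    """Assign integer codes top-down. The root gets code 1 (the leading 1
    preserves prefix zeros); each child appends its 1-based sibling index."""
    codes, lengths = {root: 1}, {root: 1}
    def visit(node):
        for i, child in enumerate(hierarchy.get(node, []), start=1):
            codes[child] = (codes[node] << bits_per_level) | i
            lengths[child] = lengths[node] + bits_per_level
            visit(child)
    visit(root)
    return codes, lengths

def is_subclass(codes, lengths, sub, sup):
    """True iff sup's code is a bit prefix of sub's code (reflexive)."""
    diff = lengths[sub] - lengths[sup]
    return diff >= 0 and (codes[sub] >> diff) == codes[sup]

hierarchy = {"Thing": ["Agent", "Place"], "Agent": ["Person"]}
codes, lengths = encode(hierarchy, "Thing")
print(is_subclass(codes, lengths, "Person", "Agent"))  # → True
print(is_subclass(codes, lengths, "Place", "Agent"))   # → False
```

With such codes, a query over a superclass can be answered by a range or prefix check on the dictionary-encoded integers, which is why no rewriting into a union over all subclasses is needed.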

    TVPulse: Improvements on detecting TV highlights in Social Networks using metadata and semantic similarity

    Sharing live experiences in social networks is a growing trend, which includes posting comments and sentiments about TV programs. Automatic detection of messages with TV-related content opens new opportunities for the entertainment information industry. This paper describes a system that detects TV highlights in one of the most important social networks, Twitter. Combining Twitter messages with information from an Electronic Programming Guide (EPG) enriched with external metadata, we built a model that matches tweets with TV programs with an accuracy over 80%. Our model required the construction of semantic profiles for the Portuguese language. These semantic profiles are used to identify the most representative tweets as highlights of a TV program. By measuring semantic similarity with those tweets, it is possible to gather other messages within the same context. This strategy improves the recall of the detection. In addition, we developed a method to automatically gather other related web resources, namely YouTube videos.
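    The tweet-to-programme matching idea can be sketched with a simple similarity measure. This is not the TVPulse model itself: the EPG entries are invented, and the actual system uses Portuguese semantic profiles and external metadata, which this sketch replaces with plain cosine similarity over token counts.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Bag-of-words vector as a token Counter (lowercased, punctuation stripped)."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical EPG: programme id -> metadata-enriched description
epg = {
    "football_match": "benfica porto football league match goal",
    "news_show": "evening news politics weather report",
}

def best_program(tweet, epg):
    """Return the EPG entry whose description is most similar to the tweet."""
    tv = vectorize(tweet)
    return max(epg, key=lambda pid: cosine(tv, vectorize(epg[pid])))

print(best_program("what a goal by benfica!", epg))  # → football_match
```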

    Developing a Benchmark Suite for Semantic Web Data from Existing Workflows

    This paper presents work in progress towards developing a new benchmark for federated query processing systems. Unlike other popular benchmarks, our query set is not driven by technical evaluation, but is derived from workflows established by the pharmacology community. The value of this query set is that it is realistic while at the same time comprising complex queries that test all features of modern query processing systems.

    Solutions for system analysis and information support of the various activities in the Arctic

    Comprehensive use of data and knowledge obtained in different disciplines is necessary for the scientific substantiation of activities in the Arctic zone and for a system analysis of the possible consequences of such activities. Information resources created so far provide access to a variety of data on the Arctic. The authors propose a solution to the task of ensuring data consistency in the combined presentation and use of data and knowledge from interdisciplinary research. The proposed solution is based on the joint use of a relational database and an ontology. The developed structure and maintenance mechanisms of the database provide a uniform representation of information about the results of research carried out within various disciplines. The ontology is a high-level global schema of the information system, and it provides a vocabulary used to formulate database queries in terms of the subject domain. In this work, the ontology is implemented as a system of small fragments, namely ontology design patterns. The use of patterns makes it possible to perform efficient preliminary database indexing, which ensures faster execution of user queries.
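    The ontology-as-global-schema idea can be sketched as follows: a domain-term vocabulary (stand-in for the ontology design patterns) is mapped onto a relational schema, so a query phrased in subject-domain terms is translated into SQL. The table, columns, terms, and data below are invented for illustration, not those of the authors' system.

```python
import sqlite3

# Hypothetical ontology vocabulary: domain term -> (table, column)
term_map = {
    "sea ice thickness": ("observations", "ice_thickness_m"),
    "air temperature": ("observations", "air_temp_c"),
}

# In-memory relational database with invented Arctic observation data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (station TEXT, ice_thickness_m REAL, air_temp_c REAL)")
conn.executemany("INSERT INTO observations VALUES (?, ?, ?)",
                 [("Barentsburg", 1.2, -12.5), ("Tiksi", 1.8, -20.1)])

def query_by_term(term, station):
    """Resolve a domain term through the vocabulary, then query the database."""
    table, column = term_map[term]
    cur = conn.execute(f"SELECT {column} FROM {table} WHERE station = ?", (station,))
    return cur.fetchone()[0]

print(query_by_term("sea ice thickness", "Tiksi"))  # → 1.8
```

The point of the indirection is that users never see table or column names: the vocabulary layer can be re-mapped when the relational schema evolves.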