238 research outputs found
A survey of large-scale reasoning on the Web of data
As more and more data is being generated by sensor networks, social media and organizations, the Web interlinking this wealth of information becomes more complex. This is particularly true for the so-called Web of Data, in which data is semantically enriched and interlinked using ontologies. In this large and uncoordinated environment, reasoning can be used to check the consistency of the data and of associated ontologies, or to infer logical consequences which, in turn, can be used to obtain new insights from the data. However, reasoning approaches need to be scalable in order to enable reasoning over the entire Web of Data. To address this problem, several high-performance reasoning systems, which mainly implement distributed or parallel algorithms, have been proposed in the last few years. These systems differ significantly; for instance in terms of reasoning expressivity, computational properties such as completeness, or reasoning objectives. In order to provide a first complete overview of the field, this paper reports a systematic review of such scalable reasoning approaches over various ontological languages, reporting details about the methods and over the conducted experiments. We highlight the shortcomings of these approaches and discuss some of the open problems related to performing scalable reasoning.
Exploiting Parallelism for Hard Problems in Abstract Argumentation
The abstract argumentation framework (AF) is a unifying framework able to encompass a variety of nonmonotonic reasoning approaches, logic programming and computational argumentation. Yet efficient approaches for most of the decision and enumeration problems associated with AFs are missing, thus potentially limiting the efficacy of argumentation-based approaches in real domains. In this paper, we present an algorithm for enumerating the preferred extensions of abstract argumentation frameworks which exploits parallel computation. To this end, the SCC-recursive semantics definition schema is adopted, where extensions are defined at the level of specific sub-frameworks. The algorithm shows significant performance improvements in large frameworks, in terms of number of solutions found and speedup.
MapReduce based RDF assisted distributed SVM for high throughput spam filtering
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. Electronic mail has become embedded in our everyday lives. Billions of legitimate emails are sent on a daily basis. The widely established underlying infrastructure, its widespread availability and its ease of use have all acted as catalysts for such pervasive proliferation. Unfortunately, the same can be said of unsolicited bulk email, or spam. Various methods, as well as enabling architectures, are available to try to mitigate spam permeation. In this respect, this dissertation complements existing survey work in this area by contributing an extensive literature review of traditional and emerging spam filtering approaches. Techniques, approaches and architectures employed for spam filtering are appraised, critically assessing their respective strengths and weaknesses.
Velocity, volume and variety are key characteristics of the spam challenge. MapReduce (M/R) has become increasingly popular as an Internet-scale, data-intensive processing platform. In the context of machine learning based spam filter training, support vector machine (SVM) based techniques have proven effective. SVM training is, however, a computationally intensive process. In this dissertation, an M/R-based distributed SVM algorithm for scalable spam filter training, designated MRSMO, is presented. By distributing and processing subsets of the training data across multiple participating computing nodes, the distributed SVM reduces spam filter training time significantly. To mitigate the accuracy degradation introduced by the adopted approach, a Resource Description Framework (RDF) based feedback loop is evaluated. Experimental results demonstrate that this improves the accuracy levels of the distributed SVM beyond the original sequential counterpart.
Effectively exploiting large-scale, 'Cloud'-based, heterogeneous processing capabilities for M/R in what can be considered a non-deterministic environment requires the consideration of a number of perspectives. In this work, gSched, a Hadoop M/R-based, heterogeneity-aware task-to-node matching and allocation scheme, is designed. Using MRSMO as a baseline, experimental evaluation indicates that gSched improves on the performance of its out-of-the-box Hadoop counterpart in a typical Cloud-based infrastructure.
The focal contribution to knowledge is a scalable, heterogeneous infrastructure and machine learning based spam filtering scheme, able to capitalize on collaborative accuracy improvements through RDF-based end-user feedback.
Scalable RDF compression with MapReduce and HDT
The use of RDF to publish semantic data has increased notably in recent years. Today's datasets are so large and so interconnected that processing them raises scalability problems. HDT is a compact representation of RDF that aims to minimize space consumption while still providing query capabilities. However, generating HDT from textual RDF formats is a task that is costly in time and resources. This work studies the use of MapReduce, a framework for the distributed processing of large amounts of data, for the task of building HDT structures from RDF, and analyzes the improvements obtained in both resources and time compared with building those structures in a single-node process. Departamento de Informática (Arquitectura y Tecnología de Computadores, Ciencias de la Computación e Inteligencia Artificial, Lenguajes y Sistemas Informáticos). Máster en Investigación en Tecnologías de la Información y las Comunicacione
OWL Reasoners still useable in 2023
In a systematic literature and software review over 100 OWL reasoners/systems
were analyzed to see if they would still be usable in 2023. This has never been
done in this capacity. OWL reasoners still play an important role in knowledge
organisation and management, but the last comprehensive surveys/studies are
more than 8 years old. The result of this work is a comprehensive list of 95
standalone OWL reasoners and systems using an OWL reasoner. For each item,
information on project pages, source code repositories and related
documentation was gathered. The raw research data is provided in a GitHub
repository for anyone to use.
Order Matters! Harnessing a World of Orderings for Reasoning over Massive Data
More and more applications require real-time processing of massive, dynamically generated, ordered data; order is an essential factor as it reflects recency or relevance. Semantic technologies risk being unable to meet the needs of such applications, as they are not equipped with the appropriate instruments for answering queries over massive, highly dynamic, ordered data sets. In this vision paper, we argue that some data management techniques should be exported to the context of semantic technologies, by integrating ordering with reasoning, and by using methods which are inspired by stream and rank-aware data management. We systematically explore the problem space, and point both to problems which have been successfully approached and to problems which still need fundamental research, in an attempt to stimulate and guide a paradigm shift in semantic technologies.
Optimization and inference under fuzzy numerical constraints
Extensive research has been done in the areas of Constraint Satisfaction with
discrete/integer
and real domain ranges. Multiple platforms and systems to deal with these kinds
of domains have been developed and appropriately optimized. Nevertheless, due
to the incomplete and possibly vague nature of real-life problems, modeling a
crisp and adequately strict satisfaction problem may not always be easy or even
appropriate. The problem of modeling incomplete
knowledge or solving an incomplete/relaxed representation of a problem is a
much harder issue to tackle. Additionally, practical modeling requirements and
search optimizations require specific domain knowledge in order to be
implemented, making the creation of a more generic optimization framework an
even harder problem. In this thesis, we will study the problem of modeling and
utilizing incomplete and fuzzy constraints, as well as possible optimization
strategies. As constraint satisfaction problems usually contain hard-coded
constraints based on specific problem and domain knowledge, we will investigate
whether strategies and generic heuristics exist for inferring new constraint
rules. Additional rules could optimize the search process by implementing
stricter constraints and thus pruning the search space or even provide useful
insight to the researcher concerning the nature of the investigated problem.
From fuzzy to annotated semantic web languages
The aim of this chapter is to present a detailed, self-contained and comprehensive account of the state of the art in representing and reasoning with fuzzy knowledge in Semantic Web languages such as the triple languages RDF/RDFS, the conceptual languages of the OWL 2 family, and rule languages. We further show how one may generalise them to so-called annotation domains, which also cover, e.g., temporal and provenance extensions.