124 research outputs found

    Distributed Search in Semantic Web Service Discovery

    This thesis presents a framework for semantic Web Service discovery using descriptive (non-functional) service characteristics in a large-scale, multi-domain setting. The framework uses the Web Ontology Language for Services (OWL-S) to design a template for describing non-functional service parameters in a way that facilitates service discovery, and presents a layered scheme for organizing the ontologies used in service description. This service description scheme serves as a core for designing the four main functions of a service directory: a template-based user interface, semantic query expansion algorithms, a two-level indexing scheme that combines Bloom filters with a Distributed Hash Table (DHT), and a distributed approach for storing service descriptions. The service directory is, in turn, implemented as an extension of the Open Service Discovery Architecture. The search algorithms presented in this thesis are designed to maximize the precision and completeness of service discovery, while the distributed design of the directory allows individual administrative domains to retain a high degree of independence and maintain access control to information about their services.
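
    The two-level indexing scheme (Bloom filters combined with a Distributed Hash Table) is the most concrete mechanism described above, and a small sketch may make it easier to picture. The following Python fragment is a minimal illustration under assumed names (publish, lookup, the example domains), not the thesis's actual design: a plain dict stands in for the DHT, and per-domain Bloom filters cheaply prune candidate domains before any full service descriptions would be fetched.

        import hashlib

        class BloomFilter:
            """Compact, probabilistic set membership; false positives possible."""
            def __init__(self, size_bits=1024, num_hashes=3):
                self.size, self.num_hashes = size_bits, num_hashes
                self.bits = 0

            def _positions(self, term):
                for i in range(self.num_hashes):
                    digest = hashlib.sha1(f"{i}:{term}".encode()).hexdigest()
                    yield int(digest, 16) % self.size

            def add(self, term):
                for pos in self._positions(term):
                    self.bits |= 1 << pos

            def might_contain(self, term):
                return all(self.bits >> pos & 1 for pos in self._positions(term))

        dht = {}      # level 1: parameter term -> domains advertising it (DHT stand-in)
        filters = {}  # level 2: domain -> Bloom filter over its parameter terms

        def publish(domain, terms):
            bf = filters.setdefault(domain, BloomFilter())
            for term in terms:
                bf.add(term)
                dht.setdefault(term, set()).add(domain)

        def lookup(terms):
            # The DHT narrows the search to candidate domains; the Bloom filters
            # then discard candidates that cannot match every query term.
            candidates = set.intersection(*(dht.get(t, set()) for t in terms))
            return [d for d in candidates
                    if all(filters[d].might_contain(t) for t in terms)]

        publish("weather.example", ["forecast", "rest", "free"])
        publish("geo.example", ["maps", "rest", "paid"])
        print(lookup(["rest", "free"]))  # -> ['weather.example']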

    RDF Data Indexing and Retrieval: A survey of Peer-to-Peer based solutions

    The Semantic Web makes it possible to model, create, and query resources found on the Web. Enabling the full potential of its technologies at the Internet level requires infrastructures that can cope with scalability challenges and support various types of queries. The attractive features of the Peer-to-Peer (P2P) communication model, such as decentralization, scalability, and fault tolerance, make it a natural candidate for dealing with these challenges. Consequently, the combination of the Semantic Web and the P2P model is a highly innovative attempt to harness the strengths of both technologies and arrive at a scalable infrastructure for RDF data storage and retrieval. In this respect, this survey details the research works that adopt this combination and gives insight into how RDF data is handled at the indexing and querying levels.
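
    One indexing scheme that recurs in the systems such surveys cover (RDFPeers being the classic example) is to store each triple three times in the DHT, keyed by the hashes of its subject, predicate, and object, so that a query with any single bound component can be routed to one responsible peer. The Python sketch below illustrates that idea only; the four-node modulo ring and all names are simplifying assumptions, not any surveyed system's code.

        import hashlib

        RING = 2 ** 16

        def h(value):
            return int(hashlib.sha1(value.encode()).hexdigest(), 16) % RING

        class Node:
            def __init__(self, node_id):
                self.node_id = node_id
                self.store = []  # triples this node is responsible for

        nodes = sorted((Node(i * (RING // 4)) for i in range(4)),
                       key=lambda n: n.node_id)

        def responsible(key):
            # Successor-style lookup: first node whose id >= key, wrapping around.
            return next((n for n in nodes if n.node_id >= key), nodes[0])

        def insert(s, p, o):
            for key in (h(s), h(p), h(o)):  # index the triple three times
                responsible(key).store.append((s, p, o))

        def lookup_by(term):
            # Route to the single node responsible for the bound term, then filter.
            return [t for t in responsible(h(term)).store if term in t]

        insert("ex:alice", "foaf:knows", "ex:bob")
        insert("ex:alice", "foaf:name", "Alice")
        print(lookup_by("foaf:knows"))  # -> [('ex:alice', 'foaf:knows', 'ex:bob')]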

    Knowledge extraction from unstructured data and classification through distributed ontologies

    The World Wide Web has changed the way humans use and share any kind of information. The Web removed several access barriers to published information and became an enormous space where users can easily navigate through heterogeneous resources (such as linked documents) and can easily edit, modify, or produce them. Documents implicitly enclose information and relationships that are accessible only to human beings. Indeed, the Web of documents evolved into a space of data silos, linked to each other only through untyped references (such as hypertext references) that only humans could understand. A growing desire to programmatically access pieces of data implicitly enclosed in documents has characterized the recent efforts of the Web research community. Direct access means structured data, enabling computing machinery to easily exploit the linking of different data sources. It has become crucial for the Web community to provide a technology stack for easing data integration at large scale, first structuring the data using standard ontologies and afterwards linking them to external data. Ontologies became the best practice for defining axioms and relationships among classes, and the Resource Description Framework (RDF) became the basic data model chosen to represent ontology instances (i.e., an instance is a value of an axiom, class, or attribute). Data has become the new oil; in particular, extracting information from semi-structured textual documents on the Web is key to realizing the Linked Data vision. In the literature these problems have been addressed with several proposals and standards, which mainly focus on technologies to access the data and on formats to represent the semantics of the data and their relationships. With the increasing volume of interconnected and serialized RDF data, RDF repositories may suffer from data overloading and may become a single point of failure for the overall Linked Data vision. One of the goals of this dissertation is to propose a thorough approach to managing large-scale RDF repositories and to distribute them in a redundant and reliable peer-to-peer RDF architecture. The architecture consists of a logic to distribute and mine the knowledge and of a set of physical peer nodes organized in a ring topology based on a Distributed Hash Table (DHT). Each node shares the same logic and provides an entry point that enables clients to query the knowledge base using atomic, disjunctive, and conjunctive SPARQL queries. The consistency of the results is increased using a data redundancy algorithm that replicates each RDF triple on multiple nodes so that, in the case of a peer failure, other peers can retrieve the data needed to resolve the queries. Additionally, a distributed load-balancing algorithm maintains a uniform distribution of the data among the participating peers by dynamically changing the key space assigned to each node in the DHT. Recently, the process of data structuring has gained more and more attention when applied to the large volume of text spread across the Web, such as legacy data, newspapers, scientific papers, or (micro-)blog posts. This process mainly consists of three steps: i) the extraction from the text of atomic pieces of information, called named entities; ii) the classification of these pieces of information through ontologies; iii) their disambiguation through Uniform Resource Identifiers (URIs) identifying real-world objects.
As a step towards interconnecting the Web to real-world objects via named entities, different techniques have been proposed. The second objective of this work is to compare these approaches in order to highlight their strengths and weaknesses in different scenarios, such as scientific papers, news articles, or user-generated content. We created the Named Entity Recognition and Disambiguation (NERD) web framework, publicly accessible on the Web (through a REST API and a web User Interface), which unifies several named entity extraction technologies. Moreover, we proposed the NERD ontology, a reference ontology for comparing the results of these technologies. Recently, the NERD ontology has been included in the NIF (Natural language processing Interchange Format) specification, part of the Creating Knowledge out of Interlinked Data (LOD2) project. Summarizing, this dissertation defines a framework for the extraction of knowledge from unstructured data and its classification via distributed ontologies. A detailed study of the Semantic Web and knowledge extraction fields is provided to define the issues investigated in this work. The dissertation then proposes an architecture to tackle the single-point-of-failure issue introduced by the RDF repositories spread across the Web. Although the use of ontologies enables a Web where data is structured and comprehensible by computing machinery, human users may also take advantage of it, especially for the annotation task. Hence, this work describes an annotation tool for web editing and for audio and video annotation, with a web front-end User Interface built on top of a distributed ontology. Furthermore, this dissertation details a thorough comparison of the state of the art in named entity technologies. The NERD framework is presented as a technology that encompasses existing solutions in the named entity extraction field, and the NERD ontology is presented as a reference ontology in the field. Finally, this work highlights three use cases aimed at reducing the number of data silos spread across the Web: a Linked Data approach to augment the automatic classification task in a Systematic Literature Review, an application to lift educational data stored in Sharable Content Object Reference Model (SCORM) data silos to the Web of Data, and a scientific conference venue enhancer built on top of several live data collectors. Significant research efforts have been devoted to combining the efficiency of a reliable data structure with the importance of data extraction techniques. This dissertation opens different research doors that mainly join two research communities: the Semantic Web and the Natural Language Processing communities. The Web provides a considerable amount of data into which NLP techniques may shed light. The use of the URI as a unique identifier provides one milestone for the materialization of entities lifted from raw text to real-world objects.
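
    The redundancy mechanism described above (each triple replicated on several peers of a DHT ring so queries survive peer failures) can be pictured with a short sketch. The Python below is an illustrative assumption, not the dissertation's system: eight in-memory peers form the ring, each triple is stored on its primary successor plus two replicas, and a query still succeeds after the primary is marked dead.

        import hashlib

        RING, REPLICAS = 2 ** 16, 2

        def h(value):
            return int(hashlib.sha1(value.encode()).hexdigest(), 16) % RING

        class Peer:
            def __init__(self, peer_id):
                self.peer_id, self.alive, self.triples = peer_id, True, set()

        peers = sorted((Peer(h(f"peer-{i}")) for i in range(8)),
                       key=lambda p: p.peer_id)

        def successors(key, count):
            # The `count` peers that follow `key` on the ring, wrapping around.
            start = next((i for i, p in enumerate(peers) if p.peer_id >= key), 0)
            return [peers[(start + i) % len(peers)] for i in range(count)]

        def put(triple):
            key = h(" ".join(triple))
            for peer in successors(key, 1 + REPLICAS):  # primary + replicas
                peer.triples.add(triple)

        def get(triple_text):
            # Any live holder can answer, so single-peer failures are tolerated.
            for peer in successors(h(triple_text), 1 + REPLICAS):
                if peer.alive:
                    return set(peer.triples)  # a real system filters by pattern
            return set()

        put(("ex:alice", "foaf:knows", "ex:bob"))
        successors(h("ex:alice foaf:knows ex:bob"), 1)[0].alive = False
        print(len(get("ex:alice foaf:knows ex:bob")) > 0)  # -> True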

    Reference Model and Architecture for the Post-Platform Economy

    The primary goal of this thesis is to propose a reference model and an accompanying software system architecture, which together can serve as a guiding framework for the analysis, design, and implementation of distributed market spaces. The benefit of such a framework is two-fold: On the one hand, it provides insights essential for understanding the various aspects and elements of self-organized and strictly decentralized online structures, so as to facilitate the emergence of the post-platform economy. On the other hand, it serves as a blueprint for designing and implementing a distributed marketplace instance for a specific application context. It thus allows consumers and providers to set up and expand market spaces themselves, in which they can engage directly and reliably with complex product scenarios.

    Expertise-based peer selection in Peer-to-Peer networks

    Peer-to-Peer systems have proven to be an effective way of sharing data. Modern protocols are able to efficiently route a message to a given peer. However, determining the destination peer in the first place is not always trivial. We propose a model in which peers advertise their expertise in the Peer-to-Peer network. The knowledge about the expertise of other peers forms a semantic topology. Based on the semantic similarity between the subject of a query and the expertise of other peers, a peer can select appropriate peers to forward queries to, instead of broadcasting the query or sending it to a random set of peers. To calculate our semantic similarity measure, we make the simplifying assumption that the peers share the same ontology. We evaluate the model in a bibliographic scenario, where peers share bibliographic descriptions of publications with each other. In simulation experiments complemented with a real-world field experiment, we show how expertise-based peer selection improves the performance of a Peer-to-Peer system with respect to precision, recall, and the number of messages.
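
    A minimal sketch of the peer-selection step may help. In the paper, the similarity between a query subject and a peer's advertised expertise is computed against the shared ontology; in the illustrative Python below, plain topic-set overlap (Jaccard) stands in for that measure, and the peers, topics, and function names are assumptions.

        def jaccard(a, b):
            """Set overlap as a stand-in for the ontology-based similarity."""
            return len(a & b) / len(a | b) if a | b else 0.0

        # Expertise each peer has advertised to the network.
        expertise = {
            "peer-1": {"databases", "semantic-web", "rdf"},
            "peer-2": {"networking", "p2p", "routing"},
            "peer-3": {"semantic-web", "ontologies", "owl"},
        }

        def select_peers(query_topics, k=2, threshold=0.0):
            # Rank known peers by similarity to the query subject and forward to
            # the best k, instead of broadcasting or choosing peers at random.
            ranked = sorted(expertise.items(),
                            key=lambda item: jaccard(query_topics, item[1]),
                            reverse=True)
            return [peer for peer, topics in ranked[:k]
                    if jaccard(query_topics, topics) > threshold]

        print(select_peers({"semantic-web", "owl"}))  # -> ['peer-3', 'peer-1']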

    A web-based approach to engineering adaptive collaborative applications

    Current methods employed to develop collaborative applications have to make decisions and speculate about the environment in which the application will operate, the network infrastructure that will be used, and the device type the application will run on. These decisions and assumptions about the environment in which collaborative applications are designed to work are not ideal. These methods produce collaborative applications that are characterised as inflexible, working on homogeneous networks and single platforms, requiring pre-existing knowledge of the data and information types they need to use, and having a rigid choice of architecture. Future collaborative applications, on the other hand, are required to be flexible: to work in highly heterogeneous environments and to be adaptable to different networks and a range of device types. This research investigates the role that the Web and its various pervasive technologies, along with a component-based Grid middleware, can play in addressing these concerns. The aim is to develop an approach to building adaptive collaborative applications that can operate in heterogeneous and changing environments. This work proposes a four-layer model that developers can use to build adaptive collaborative applications. The four-layer model is populated with Web technologies such as Scalable Vector Graphics (SVG), the Resource Description Framework (RDF), the SPARQL Protocol and RDF Query Language (SPARQL), and Gridkit, a middleware infrastructure based on the Open Overlays concept. The Middleware layer (the first layer of the four-layer model) addresses network and operating system heterogeneity; the Group Communication layer enables collaboration and data sharing; the Knowledge Representation layer proposes an interoperable RDF data modelling language and a flexible storage facility with an adaptive architecture for heterogeneous data storage; and finally, the Presentation and Interaction layer proposes a framework (Oea) for scalable and adaptive user interfaces. The four-layer model has been successfully used to build a collaborative application, called Wildfurt, that overcomes challenges facing collaborative applications. This research has demonstrated new applications for cutting-edge Web technologies in the area of building collaborative applications. SVG has been used to develop adaptive and scalable user interfaces that can operate on different device types. RDF and RDFS have also been used to design and model collaborative applications, providing a mechanism to define classes and properties and the relationships between them. A flexible and adaptable storage facility that is able to change its architecture based on the surrounding environment and requirements has also been achieved by combining the RDF technology with the Open Overlays middleware, Gridkit.
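
    The Knowledge Representation layer's combination of RDF/RDFS modelling with SPARQL querying can be illustrated briefly. The Python sketch below uses the rdflib library as a stand-in for the thesis's storage facility, and the EX namespace and vocabulary are invented for the example (pip install rdflib).

        from rdflib import Graph, Literal, Namespace, RDF, RDFS

        EX = Namespace("http://example.org/collab#")
        g = Graph()

        # Schema: an RDFS class and a property with a declared domain.
        g.add((EX.Session, RDF.type, RDFS.Class))
        g.add((EX.hasParticipant, RDF.type, RDF.Property))
        g.add((EX.hasParticipant, RDFS.domain, EX.Session))

        # Instance data of the kind a collaborative session might share.
        g.add((EX.design_review, RDF.type, EX.Session))
        g.add((EX.design_review, EX.hasParticipant, Literal("alice")))

        # SPARQL query over the model.
        results = g.query("""
            PREFIX ex: <http://example.org/collab#>
            SELECT ?who WHERE { ?s a ex:Session ; ex:hasParticipant ?who . }
        """)
        for row in results:
            print(row.who)  # -> alice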

    A semantic approach for scalable and self-organized context-aware systems

    Ph.D. (Doctor of Philosophy)

    Ontology engineering and routing in distributed knowledge management applications
