738 research outputs found

    A Database Approach to Content-based XML retrieval

    This paper describes a first prototype system for content-based retrieval from XML data. The system's design supports both XPath queries and complex information retrieval queries based on a language modelling approach to information retrieval. Evaluation using the INEX benchmark shows that it is beneficial if the system is biased to retrieve large XML fragments over small fragments.
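
The bias towards large fragments mentioned in the abstract above can be illustrated with a small sketch. It assumes a query-likelihood language model with linear-interpolation smoothing and a prior proportional to fragment length; the abstract does not give the scoring formula, so the function `score_fragment`, the smoothing weight `lam`, and the form of the prior are hypothetical choices made only for illustration.

```python
import math
from collections import Counter


def score_fragment(query_terms, fragment_tokens, collection_counts,
                   collection_size, lam=0.5, length_prior=True):
    """Log-score of an XML fragment for a query (illustrative only).

    Assumed model: query likelihood, smoothed by linear interpolation
    with the collection model, plus an optional log length prior.
    """
    n = len(fragment_tokens)
    if n == 0:
        return float("-inf")
    tf = Counter(fragment_tokens)
    score = 0.0
    for term in query_terms:
        p_fragment = tf[term] / n
        p_collection = collection_counts.get(term, 0) / collection_size
        p = lam * p_fragment + (1 - lam) * p_collection
        if p == 0.0:
            return float("-inf")  # term unseen in the whole collection
        score += math.log(p)
    if length_prior:
        # Assumed prior: probability of a fragment grows with its length.
        score += math.log(n)
    return score


# Two fragments with identical term proportions but different sizes.
small = ["xml", "retrieval"]
large = small * 5
counts = Counter(small + large)
total = sum(counts.values())
query = ["xml", "retrieval"]

without_prior = (score_fragment(query, small, counts, total, length_prior=False),
                 score_fragment(query, large, counts, total, length_prior=False))
with_prior = (score_fragment(query, small, counts, total),
              score_fragment(query, large, counts, total))
```

In this toy setting the two fragments tie without the prior, and the larger one ranks higher once the prior is added, which is the kind of bias the evaluation reports as beneficial.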

    Information Integration - the process of integration, evolution and versioning

    At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those information sources. Gathering this information is a tedious and time-consuming job. Automating this process would assist users in their task. Integration of the information sources provides a global information source in which all the needed information is present. All of these information sources also change over time. With each change of an information source, the schema of this source can change as well. The data contained in the information source, however, cannot be changed every time, due to the huge amount of data that would have to be converted in order to conform to the most recent schema. In this report we describe the current approaches to information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues when integrating XML data sources.

    Interval Neutrosophic Sets and Logic: Theory and Applications in Computing

    A neutrosophic set is a part of neutrosophy that studies the origin, nature, and scope of neutralities, as well as their interactions with different ideational spectra. The neutrosophic set is a powerful general formal framework that has been recently proposed. However, the neutrosophic set needs to be specified from a technical point of view. Here, we define the set-theoretic operators on an instance of a neutrosophic set, and call it an Interval Neutrosophic Set (INS). We prove various properties of INS, which are connected to operations and relations over INS. We also introduce a new logic system based on interval neutrosophic sets. We study the interval neutrosophic propositional calculus and interval neutrosophic predicate calculus. We also create a neutrosophic logic inference system based on interval neutrosophic logic. Under the framework of the interval neutrosophic set, we propose a data model, called the Neutrosophic Data Model, based on a special case of interval neutrosophic sets. This data model is an extension of the fuzzy data model and the paraconsistent data model. We generalize the set-theoretic operators and relation-theoretic operators of fuzzy relations and paraconsistent relations to neutrosophic relations. We propose generalized SQL query constructs and a tuple-relational calculus for the Neutrosophic Data Model. We also design an architecture for a Semantic Web Services agent based on interval neutrosophic logic and conduct a simulation study.
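
As a rough illustration of what set-theoretic operators on an interval neutrosophic set can look like, the sketch below represents the membership of a single element by three intervals (truth, indeterminacy, falsity) and combines them bound by bound. The min/max-based definitions used here are one common choice and are an assumption; the abstract does not spell out the operators, and the class `INSValue` and its helpers are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound), both in [0, 1]


def _imin(a: Interval, b: Interval) -> Interval:
    """Bound-wise minimum of two intervals."""
    return (min(a[0], b[0]), min(a[1], b[1]))


def _imax(a: Interval, b: Interval) -> Interval:
    """Bound-wise maximum of two intervals."""
    return (max(a[0], b[0]), max(a[1], b[1]))


@dataclass(frozen=True)
class INSValue:
    """Membership of one element: truth, indeterminacy, falsity intervals."""
    t: Interval
    i: Interval
    f: Interval

    def complement(self) -> "INSValue":
        # Assumed definition: swap truth and falsity, mirror indeterminacy.
        return INSValue(self.f, (1 - self.i[1], 1 - self.i[0]), self.t)

    def intersection(self, other: "INSValue") -> "INSValue":
        # Assumed definition: min on truth, max on indeterminacy and falsity.
        return INSValue(_imin(self.t, other.t),
                        _imax(self.i, other.i),
                        _imax(self.f, other.f))

    def union(self, other: "INSValue") -> "INSValue":
        # Assumed definition: max on truth, min on indeterminacy and falsity.
        return INSValue(_imax(self.t, other.t),
                        _imin(self.i, other.i),
                        _imin(self.f, other.f))


a = INSValue(t=(0.2, 0.4), i=(0.1, 0.3), f=(0.5, 0.6))
b = INSValue(t=(0.3, 0.5), i=(0.0, 0.2), f=(0.4, 0.7))
both = a.intersection(b)
either = a.union(b)
not_a = a.complement()
```

A full interval neutrosophic set would map every element of a universe to such a triple; the operators then apply element by element.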

    Intuitionistic fuzzy XML query matching and rewriting

    With the emergence of XML as a standard for data representation, particularly on the web, the need for intelligent query languages that can operate on XML documents with structural heterogeneity has recently gained a lot of attention. Traditional Information Retrieval and Database approaches have limitations when dealing with such scenarios. Therefore, fuzzy (flexible) approaches have become predominant. In this thesis, we propose a new approach for approximate XML query matching and rewriting which aims at achieving soft matching of XML queries with XML data sources following different schemas. Unlike traditional querying approaches, which require exact matching, the proposed approach makes use of Intuitionistic Fuzzy Trees to achieve approximate (soft) query matching. Through this new approach, not only the exact answer to a query, but also approximate answers are retrieved. Furthermore, partial results can be obtained from multiple data sources and merged to produce a single answer to a query. The proposed approach introduces a new tree similarity measure that considers the minimum and maximum degrees of similarity/inclusion of trees, based on arc matching. New techniques for soft node and arc matching are presented for matching queries against data sources with highly varied structures. A prototype was developed to test the proposed ideas; it demonstrated the ability to achieve approximate matching for pattern queries against a number of XML schemas and to rewrite the original query so that it obtains results from the underlying data sources. This has been achieved through several novel algorithms, which were tested and shown to be efficient, with low CPU and memory cost even for a large number of data sources.

    Ontology-based knowledge representation and semantic search information retrieval: case study of the underutilized crops domain

    The aim of using semantic technologies in domain knowledge modeling is to introduce the semantic meaning of concepts in knowledge bases, such that they are both human-readable and machine-understandable. Due to their powerful knowledge representation formalism and associated inference mechanisms, ontology-based approaches have been increasingly adopted to formally represent domain knowledge. The primary objective of this thesis work has been to use semantic technologies to advance knowledge-sharing in the Underutilized crops domain and to investigate the integration of underlying ontologies developed in OWL (Web Ontology Language) with augmented SWRL (Semantic Web Rule Language) rules for added expressiveness. The work further investigated generating ontologies from existing data sources and proposed a reverse-engineering approach of generating a domain-specific conceptualization through competency questions posed by possible ontology users and domain experts. For utilization, a semantic search engine (the Onto-CropBase) has been developed to serve as a Web-based access point for the Underutilized crops ontology model. Relevant linked data in Resource Description Framework Schema (RDFS) were added for comprehensiveness in generating federated queries. While the OWL/SWRL combination offers a highly expressive ontology language for modeling knowledge domains, the combination is found to lack the supplementary descriptive constructs needed to model complex real-life scenarios, a necessary requirement for a successful Semantic Web application. To this end, the common logic programming formalisms for extending Description Logic (DL)-based ontologies were explored and the state of the art in SWRL expressiveness extensions was determined with a view to extending the SWRL formalism. Subsequently, a novel fuzzy temporal extension to the Semantic Web Rule Language (FT-SWRL), which combines SWRL with fuzzy logic theories based on the valid-time temporal model, has been proposed to allow modeling imprecise temporal expressions in domain ontologies.

    Warehousing and Analyzing Streaming Data Quality Information

    The development of integrative IS architectures typically focuses on solving problems related to the functionality of the system. The goal is to design optimally flexible interfaces that yield the most agile architecture. The quality of the data that will be exchanged across these interfaces is often disregarded (implicitly or explicitly). This results in distributed applications which are functionally correct but cannot be deployed due to the low quality of the data involved. In order to avoid wrong business decisions due to ‘dirty data’, quality characteristics have to be captured, processed, and provided to the respective business task. However, the issue of how to efficiently provide applications with information about data quality is still an open research problem. Our approach tackles the problems posed by data quality deficiencies by presenting a novel concept to stream and warehouse data together with its describing data quality information.