
    Coreference detection in XML metadata

    Preserving data quality is an important issue in data collection management. A crucial issue here is the detection of duplicate objects (called coreferent objects), which describe the same entity but in different ways. In this paper we present a method for detecting coreferent objects in metadata, in particular in XML schemas. Our approach consists of comparing the paths from the root element to a given element in the schema. Each path precisely defines the context and location of a specific element in the schema. Path matching is based on comparing the individual steps of which the paths are composed. The uncertainty about the matching of steps is expressed with possibilistic truth values and aggregated using the Sugeno integral. The discovered coreference of paths can help determine the coreference of different XML schemas.
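    As a rough illustration of the aggregation step, here is a minimal sketch that compares two root-to-element paths step by step and combines the per-step match degrees with a Sugeno integral. The step-similarity function and the cardinality-based fuzzy measure mu(A) = |A|/n are illustrative assumptions, not the paper's exact definitions.

```python
# Minimal sketch of path-based coreference detection between two XML
# schema paths. The similarity function and the fuzzy measure are
# neutral placeholder choices.
from difflib import SequenceMatcher

def step_similarity(a: str, b: str) -> float:
    """Degree (0..1) to which two path steps match."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def sugeno_integral(scores):
    """Sugeno integral w.r.t. the relative-cardinality measure
    mu(A) = |A| / n."""
    n = len(scores)
    ordered = sorted(scores, reverse=True)
    # max over i of min(i-th best score, measure of the i best steps)
    return max(min(ordered[i], (i + 1) / n) for i in range(n))

def path_match(path_a, path_b):
    """Aggregate step-wise match degrees along two root-to-element
    paths. Steps are compared positionally for simplicity; the paper's
    method may align steps more flexibly."""
    if len(path_a) != len(path_b):
        # crude handling of unequal lengths: pad the shorter path
        short, long_ = sorted((path_a, path_b), key=len)
        short = list(short) + [""] * (len(long_) - len(short))
        path_a, path_b = short, long_
    scores = [step_similarity(a, b) for a, b in zip(path_a, path_b)]
    return sugeno_integral(scores)

# Example: two paths to an author's surname in slightly different schemas.
p1 = ["book", "author", "lastName"]
p2 = ["Book", "Author", "surname"]
print(f"match degree: {path_match(p1, p2):.2f}")  # match degree: 0.67
```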

    Information Integration - the process of integration, evolution and versioning

    At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those sources. Gathering this information is a tedious and time-consuming job, and automating the process would assist the user in this task. Integration of the information sources provides a global information source in which all the needed information is present. All of these information sources also change over time, and with each change the schema of a source may change as well. The data contained in an information source, however, cannot be converted on every change, due to the huge amount of data that would have to be transformed to conform to the most recent schema.
    In this report we describe current methods for information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues that arise when integrating XML data sources.
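    As a small illustration of why data need not be bulk-converted on every schema change, the sketch below upgrades a record lazily, on read, from an assumed schema version 1 to version 2. The element names and the v1-to-v2 mapping are invented for this example.

```python
# Lazy (on-demand) conversion between schema versions: records stay
# stored under the old schema and are upgraded only when read.
import xml.etree.ElementTree as ET

V1_RECORD = "<person><name>Ada Lovelace</name></person>"

def upgrade_v1_to_v2(xml_text: str) -> ET.Element:
    """Map a v1 <person> (single <name>) to a v2 <person>
    (<firstName> + <lastName>) without touching the stored data."""
    v1 = ET.fromstring(xml_text)
    first, _, last = v1.findtext("name", "").partition(" ")
    v2 = ET.Element("person", {"schemaVersion": "2"})
    ET.SubElement(v2, "firstName").text = first
    ET.SubElement(v2, "lastName").text = last
    return v2

print(ET.tostring(upgrade_v1_to_v2(V1_RECORD), encoding="unicode"))
# <person schemaVersion="2"><firstName>Ada</firstName>
#   <lastName>Lovelace</lastName></person>
```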

    Development of fuzzy syllogistic algorithms and applications distributed reasoning approaches

    Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2010. Includes bibliographical references (leaves: 44-45). Text in English; abstract in Turkish and English. x, 65 leaves.
    A syllogism, also known as a rule of inference or logical appeal, is a formal logical scheme used to draw a conclusion from a set of premises. It is a form of deductive reasoning in which the conclusion is inferred from the stated premises. The syllogistic system consists of premises and conclusions systematically combined into so-called figures and moods. It is a theory of reasoning developed by Aristotle, who is known as one of the most important contributors to Western thought and logic. Since Aristotle, philosophers and sociologists have successfully modelled human thought and reasoning with syllogistic structures. A major shortcoming, however, is that the mathematical properties of the whole syllogistic system have not yet been fully revealed. Being able to calculate any syllogistic property exactly, using a single algorithm, could facilitate modelling possibly any sort of consistent, inconsistent or approximate human reasoning. In this work, generic fuzzifications of sample invalid syllogisms and formal proofs of their validity with set-theoretic representations are presented. Furthermore, the study discusses the mapping of sample real-world statements onto those syllogisms and some relevant statistics about the results obtained from the algorithm applied to the syllogisms. This syllogistic framework can be used in various fields that employ syllogisms as inference mechanisms, such as the semantic web, object-oriented programming and reasoning processes in data mining.
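    To give the set-theoretic flavour of such an algorithm, here is a minimal classical (non-fuzzy) sketch: under modern Boolean semantics, a categorical syllogism over terms S, M and P can be checked for validity by brute force over the regions the three terms carve out of the universe. The fuzzification and existential-import subtleties studied in the thesis are deliberately left out.

```python
# Brute-force validity check over set-theoretic models: the terms
# S, M, P split the universe into 8 regions, each empty or inhabited.
from itertools import product

S, M, P = 1, 2, 4  # bitmasks: region r lies "inside X" iff r & X

def holds(stmt, x, y, model):
    """Evaluate 'All/No/Some/SomeNot x are y' in a model, where a model
    is the set of inhabited regions (ints 0..7)."""
    inside_x = {r for r in model if r & x}
    if stmt == "All":      return all(r & y for r in inside_x)
    if stmt == "No":       return not any(r & y for r in inside_x)
    if stmt == "Some":     return any(r & y for r in inside_x)
    if stmt == "SomeNot":  return any(not (r & y) for r in inside_x)

def valid(premise1, premise2, conclusion):
    """Valid iff every model satisfying both premises satisfies the
    conclusion (checks all 2^8 inhabitation patterns)."""
    for bits in product([False, True], repeat=8):
        model = {r for r in range(8) if bits[r]}
        if holds(*premise1, model) and holds(*premise2, model) \
           and not holds(*conclusion, model):
            return False
    return True

# Barbara (AAA-1): All M are P, All S are M |= All S are P
print(valid(("All", M, P), ("All", S, M), ("All", S, P)))  # True
# An invalid mood: All M are P, All M are S |= All S are P
print(valid(("All", M, P), ("All", M, S), ("All", S, P)))  # False
```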

    A preference meta-model for logic programs with possibilistic ordered disjunction

    This paper presents an approach for specifying user preferences related to services by means of a preference meta-model, which is mapped to logic programs with possibilistic ordered disjunction following a Model-Driven Methodology (MDM). MDM allows problem domains to be specified by means of meta-models, which can be converted to instance models or to other meta-models through transformation functions. In particular, we propose a preference meta-model that defines an abstract preference specification language, allowing users to specify preferences in a friendlier way using models. We also present a meta-model for logic programs with possibilistic ordered disjunction. We then show how we conceptually map the preference meta-model to logic programs with possibilistic ordered disjunction by means of a mapping function.
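    A toy version of such a mapping function might look as follows; the textual LPOD syntax, the PreferenceRule class and the certainty annotation are illustrative assumptions, not the paper's actual meta-model.

```python
# A ranked user preference is mapped to one rule of a logic program
# with possibilistic ordered disjunction ("a x b x c" head, annotated
# with a necessity degree).
from dataclasses import dataclass

@dataclass
class PreferenceRule:
    certainty: float   # possibilistic weight (necessity degree)
    head: list         # options, most preferred first
    body: list         # conditions under which the preference applies

    def __str__(self):
        head = " x ".join(self.head)      # "x" = ordered disjunction
        body = ", ".join(self.body) or "true"
        return f"{self.certainty}: {head} :- {body}."

def map_preference(options_ranked, context, certainty=1.0):
    """Transformation function: preference-model instance -> LPOD rule."""
    return PreferenceRule(certainty, list(options_ranked), list(context))

# "When booking travel, prefer train over plane over bus", held with
# certainty 0.8:
rule = map_preference(["train", "plane", "bus"], ["booking_travel"], 0.8)
print(rule)   # 0.8: train x plane x bus :- booking_travel.
```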

    Coreference detection of low quality objects

    The problem of record linkage is a widely studied problem that aims to identify coreferent (i.e. duplicate) data in a structured data source. As indicated by Winkler, a solution to the record linkage problem is only possible if the error rate is sufficiently low; in other words, to successfully deduplicate a database, the objects in the database must be of sufficient quality. However, this assumption does not always hold. In this paper, it is investigated how merging several low quality objects into one high quality object can improve the process of record linkage. This general idea is illustrated in the context of string comparison, where strings of low quality (i.e. with a high typographical error rate) are merged into a string of high quality by using an n-dimensional Levenshtein distance matrix and computing the optimal alignment between the dirty strings. Results are presented and possible refinements are proposed.
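    For three strings, the merging idea can be sketched with a 3-dimensional Levenshtein-style dynamic program followed by a per-column majority vote. The sum-of-pairs column cost and the tie-breaking rule are assumptions; the paper's general n-dimensional formulation is not reproduced here.

```python
# Align three dirty strings in one 3-dimensional edit-distance DP,
# then merge each aligned column by majority vote.
from itertools import product
from collections import Counter

GAP = None  # marks "no character" in an alignment column

def column_cost(col):
    """Sum-of-pairs cost of one aligned column: 1 per disagreeing pair."""
    return sum(col[i] != col[j]
               for i in range(len(col)) for j in range(i + 1, len(col)))

def merge3(a, b, c):
    """3-dimensional Levenshtein-style DP over a, b, c, then majority
    vote per column (gaps vote too, so a character seen in only one
    string is dropped)."""
    la, lb, lc = len(a), len(b), len(c)
    dist, back = {(0, 0, 0): 0}, {}
    for i, j, k in product(range(la + 1), range(lb + 1), range(lc + 1)):
        if (i, j, k) == (0, 0, 0):
            continue
        best = None
        # each DP move consumes a character from a nonempty subset
        for da, db, dc in product((0, 1), repeat=3):
            if (da, db, dc) == (0, 0, 0) or i < da or j < db or k < dc:
                continue
            prev = (i - da, j - db, k - dc)
            col = (a[i - 1] if da else GAP,
                   b[j - 1] if db else GAP,
                   c[k - 1] if dc else GAP)
            cand = dist[prev] + column_cost(col)
            if best is None or cand < best[0]:
                best = (cand, prev, col)
        dist[(i, j, k)], back[(i, j, k)] = best[0], (best[1], best[2])
    out, pos = [], (la, lb, lc)
    while pos != (0, 0, 0):
        pos, col = back[pos]
        top, _ = Counter(col).most_common(1)[0]
        if top is not GAP:
            out.append(top)
    return "".join(reversed(out))

# three dirty variants, each with one typo, merged back:
print(merge3("coreferance", "corefrence", "xoreference"))  # coreference
```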

    Handling metadata in the scope of coreference detection in data collections
