
    Coreference detection in XML metadata

    Preserving data quality is an important issue in data collection management. A crucial issue here is the detection of duplicate objects (called coreferent objects), which describe the same entity but in different ways. In this paper we present a method for detecting coreferent objects in metadata, in particular in XML schemas. Our approach consists of comparing the paths from the root element to a given element in the schema. Each path precisely defines the context and location of a specific element in the schema. Path matching is based on comparing the individual steps of which the paths are composed. The uncertainty about the matching of steps is expressed with possibilistic truth values and aggregated using the Sugeno integral. The discovered coreference of paths can help determine the coreference of different XML schemas.
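    As a rough illustration of the aggregation step, here is a minimal sketch that compares two root-to-element paths step by step and combines the per-step match degrees with a Sugeno integral. The step-similarity function and the cardinality-based fuzzy measure mu(A) = |A|/n are illustrative assumptions, not the paper's exact definitions.

```python
# Minimal sketch of path-based coreference detection between two XML
# schema paths. The similarity function and the fuzzy measure are
# neutral placeholder choices.
from difflib import SequenceMatcher

def step_similarity(a: str, b: str) -> float:
    """Degree (0..1) to which two path steps match."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def sugeno_integral(scores):
    """Sugeno integral w.r.t. the relative-cardinality measure
    mu(A) = |A| / n."""
    n = len(scores)
    ordered = sorted(scores, reverse=True)
    # max over i of min(i-th best score, measure of the i best steps)
    return max(min(ordered[i], (i + 1) / n) for i in range(n))

def path_match(path_a, path_b):
    """Aggregate step-wise match degrees along two root-to-element
    paths. Steps are compared positionally for simplicity; the paper's
    method may align steps more flexibly."""
    if len(path_a) != len(path_b):
        # crude handling of unequal lengths: pad the shorter path
        short, long_ = sorted((path_a, path_b), key=len)
        short = list(short) + [""] * (len(long_) - len(short))
        path_a, path_b = short, long_
    scores = [step_similarity(a, b) for a, b in zip(path_a, path_b)]
    return sugeno_integral(scores)

# Example: two paths to an author's surname in slightly different schemas.
p1 = ["book", "author", "lastName"]
p2 = ["Book", "Author", "surname"]
print(f"match degree: {path_match(p1, p2):.2f}")  # match degree: 0.67
```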

    Information Integration - the process of integration, evolution and versioning

    At present, many information sources are available wherever you are. Most of the time, the information needed is spread across several of those sources. Gathering this information is a tedious and time-consuming job, and automating the process would assist the user in this task. Integration of the information sources provides a global information source in which all the needed information is present. All of these information sources also change over time, and with each change the schema of a source may change as well. The data contained in an information source, however, cannot be converted on every change, due to the huge amount of data that would have to be transformed to conform to the most recent schema.
    In this report we describe current methods for information integration, evolution and versioning. We distinguish between integration of schemas and integration of the actual data. We also show some key issues that arise when integrating XML data sources.
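    As a small illustration of why data need not be bulk-converted on every schema change, the sketch below upgrades a record lazily, on read, from an assumed schema version 1 to version 2. The element names and the v1-to-v2 mapping are invented for this example.

```python
# Lazy (on-demand) conversion between schema versions: records stay
# stored under the old schema and are upgraded only when read.
import xml.etree.ElementTree as ET

V1_RECORD = "<person><name>Ada Lovelace</name></person>"

def upgrade_v1_to_v2(xml_text: str) -> ET.Element:
    """Map a v1 <person> (single <name>) to a v2 <person>
    (<firstName> + <lastName>) without touching the stored data."""
    v1 = ET.fromstring(xml_text)
    first, _, last = v1.findtext("name", "").partition(" ")
    v2 = ET.Element("person", {"schemaVersion": "2"})
    ET.SubElement(v2, "firstName").text = first
    ET.SubElement(v2, "lastName").text = last
    return v2

print(ET.tostring(upgrade_v1_to_v2(V1_RECORD), encoding="unicode"))
# <person schemaVersion="2"><firstName>Ada</firstName>
#   <lastName>Lovelace</lastName></person>
```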

    Development of fuzzy syllogistic algorithms and applications distributed reasoning approaches

    Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2010. Includes bibliographical references (leaves: 44-45). Text in English; abstract in Turkish and English. x, 65 leaves.
    A syllogism, also known as a rule of inference or logical appeal, is a formal logical scheme used to draw a conclusion from a set of premises. It is a form of deductive reasoning in which the conclusion is inferred from the stated premises. The syllogistic system consists of premises and conclusions systematically combined into so-called figures and moods. It is a theory of reasoning developed by Aristotle, who is known as one of the most important contributors to Western thought and logic. Since Aristotle, philosophers and sociologists have successfully modelled human thought and reasoning with syllogistic structures. A major shortcoming, however, is that the mathematical properties of the whole syllogistic system have not yet been fully revealed. Being able to calculate any syllogistic property exactly, using a single algorithm, could facilitate modelling possibly any sort of consistent, inconsistent or approximate human reasoning. In this work, generic fuzzifications of sample invalid syllogisms and formal proofs of their validity with set-theoretic representations are presented. Furthermore, the study discusses the mapping of sample real-world statements onto those syllogisms and some relevant statistics about the results obtained from the algorithm applied to the syllogisms. This syllogistic framework can be used in various fields that employ syllogisms as inference mechanisms, such as the semantic web, object-oriented programming and reasoning processes in data mining.
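    To give the set-theoretic flavour of such an algorithm, here is a minimal classical (non-fuzzy) sketch: under modern Boolean semantics, a categorical syllogism over terms S, M and P can be checked for validity by brute force over the regions the three terms carve out of the universe. The fuzzification and existential-import subtleties studied in the thesis are deliberately left out.

```python
# Brute-force validity check over set-theoretic models: the terms
# S, M, P split the universe into 8 regions, each empty or inhabited.
from itertools import product

S, M, P = 1, 2, 4  # bitmasks: region r lies "inside X" iff r & X

def holds(stmt, x, y, model):
    """Evaluate 'All/No/Some/SomeNot x are y' in a model, where a model
    is the set of inhabited regions (ints 0..7)."""
    inside_x = {r for r in model if r & x}
    if stmt == "All":      return all(r & y for r in inside_x)
    if stmt == "No":       return not any(r & y for r in inside_x)
    if stmt == "Some":     return any(r & y for r in inside_x)
    if stmt == "SomeNot":  return any(not (r & y) for r in inside_x)

def valid(premise1, premise2, conclusion):
    """Valid iff every model satisfying both premises satisfies the
    conclusion (checks all 2^8 inhabitation patterns)."""
    for bits in product([False, True], repeat=8):
        model = {r for r in range(8) if bits[r]}
        if holds(*premise1, model) and holds(*premise2, model) \
           and not holds(*conclusion, model):
            return False
    return True

# Barbara (AAA-1): All M are P, All S are M |= All S are P
print(valid(("All", M, P), ("All", S, M), ("All", S, P)))  # True
# An invalid mood: All M are P, All M are S |= All S are P
print(valid(("All", M, P), ("All", M, S), ("All", S, P)))  # False
```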

    A preference meta-model for logic programs with possibilistic ordered disjunction

    This paper presents an approach for specifying user preferences related to services by means of a preference meta-model, which is mapped to logic programs with possibilistic ordered disjunction following a Model-Driven Methodology (MDM). MDM allows problem domains to be specified by means of meta-models, which can be converted to instance models or to other meta-models through transformation functions. In particular, we propose a preference meta-model that defines an abstract preference specification language, allowing users to specify preferences in a friendlier way using models. We also present a meta-model for logic programs with possibilistic ordered disjunction. We then show how we conceptually map the preference meta-model to logic programs with possibilistic ordered disjunction by means of a mapping function.
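    A toy version of such a mapping function might look as follows; the textual LPOD syntax, the PreferenceRule class and the certainty annotation are illustrative assumptions, not the paper's actual meta-model.

```python
# A ranked user preference is mapped to one rule of a logic program
# with possibilistic ordered disjunction ("a x b x c" head, annotated
# with a necessity degree).
from dataclasses import dataclass

@dataclass
class PreferenceRule:
    certainty: float   # possibilistic weight (necessity degree)
    head: list         # options, most preferred first
    body: list         # conditions under which the preference applies

    def __str__(self):
        head = " x ".join(self.head)      # "x" = ordered disjunction
        body = ", ".join(self.body) or "true"
        return f"{self.certainty}: {head} :- {body}."

def map_preference(options_ranked, context, certainty=1.0):
    """Transformation function: preference-model instance -> LPOD rule."""
    return PreferenceRule(certainty, list(options_ranked), list(context))

# "When booking travel, prefer train over plane over bus", held with
# certainty 0.8:
rule = map_preference(["train", "plane", "bus"], ["booking_travel"], 0.8)
print(rule)   # 0.8: train x plane x bus :- booking_travel.
```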

    Coreference detection of low quality objects

    The problem of record linkage is a widely studied problem that aims to identify coreferent (i.e. duplicate) data in a structured data source. As indicated by Winkler, a solution to the record linkage problem is only possible if the error rate is sufficiently low; in other words, to successfully deduplicate a database, the objects in the database must be of sufficient quality. However, this assumption does not always hold. In this paper, it is investigated how merging several low quality objects into one high quality object can improve the process of record linkage. This general idea is illustrated in the context of string comparison, where strings of low quality (i.e. with a high typographical error rate) are merged into a string of high quality by using an n-dimensional Levenshtein distance matrix and computing the optimal alignment between the dirty strings. Results are presented and possible refinements are proposed.
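    For three strings, the merging idea can be sketched with a 3-dimensional Levenshtein-style dynamic program followed by a per-column majority vote. The sum-of-pairs column cost and the tie-breaking rule are assumptions; the paper's general n-dimensional formulation is not reproduced here.

```python
# Align three dirty strings in one 3-dimensional edit-distance DP,
# then merge each aligned column by majority vote.
from itertools import product
from collections import Counter

GAP = None  # marks "no character" in an alignment column

def column_cost(col):
    """Sum-of-pairs cost of one aligned column: 1 per disagreeing pair."""
    return sum(col[i] != col[j]
               for i in range(len(col)) for j in range(i + 1, len(col)))

def merge3(a, b, c):
    """3-dimensional Levenshtein-style DP over a, b, c, then majority
    vote per column (gaps vote too, so a character seen in only one
    string is dropped)."""
    la, lb, lc = len(a), len(b), len(c)
    dist, back = {(0, 0, 0): 0}, {}
    for i, j, k in product(range(la + 1), range(lb + 1), range(lc + 1)):
        if (i, j, k) == (0, 0, 0):
            continue
        best = None
        # each DP move consumes a character from a nonempty subset
        for da, db, dc in product((0, 1), repeat=3):
            if (da, db, dc) == (0, 0, 0) or i < da or j < db or k < dc:
                continue
            prev = (i - da, j - db, k - dc)
            col = (a[i - 1] if da else GAP,
                   b[j - 1] if db else GAP,
                   c[k - 1] if dc else GAP)
            cand = dist[prev] + column_cost(col)
            if best is None or cand < best[0]:
                best = (cand, prev, col)
        dist[(i, j, k)], back[(i, j, k)] = best[0], (best[1], best[2])
    out, pos = [], (la, lb, lc)
    while pos != (0, 0, 0):
        pos, col = back[pos]
        top, _ = Counter(col).most_common(1)[0]
        if top is not GAP:
            out.append(top)
    return "".join(reversed(out))

# three dirty variants, each with one typo, merged back:
print(merge3("coreferance", "corefrence", "xoreference"))  # coreference
```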

    Handling metadata in the scope of coreference detection in data collections
