Coreference detection in XML metadata

Author: De Tré Guy
Szymczak Marcin
Zadrozny Slawomir
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

Preserving data quality is an important issue in data collection management. A crucial part of this is detecting duplicate objects (called coreferent objects), which describe the same entity in different ways. In this paper we present a method for detecting coreferent objects in metadata, in particular in XML schemas. Our approach compares the paths from the root element to a given element in the schema. Each path precisely defines the context and location of a specific element in the schema. Path matching is based on comparing the individual steps of which the paths are composed. The uncertainty about the matching of steps is expressed with possibilistic truth values and aggregated using the Sugeno integral. The discovered coreference of paths can help determine the coreference of different XML schemas.
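
As a rough illustration of the aggregation step mentioned in this abstract, the sketch below computes a discrete Sugeno integral over per-step match degrees. It is not the authors' implementation: the abstract's possibilistic truth values are simplified here to single scores in [0, 1], and the function names, the cardinality-based fuzzy measure, and the example scores are all assumptions made for illustration.

```python
from typing import Callable, FrozenSet, Sequence


def sugeno_integral(
    scores: Sequence[float],
    measure: Callable[[FrozenSet[int]], float],
) -> float:
    """Discrete Sugeno integral: max over i of min(x_(i), mu(A_(i))).

    x_(1) <= ... <= x_(n) are the scores in ascending order and A_(i) is
    the set of indices whose scores are at least x_(i).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    best = 0.0
    for rank, idx in enumerate(order):
        coalition = frozenset(order[rank:])  # indices with score >= scores[idx]
        best = max(best, min(scores[idx], measure(coalition)))
    return best


def cardinality_measure(n: int) -> Callable[[FrozenSet[int]], float]:
    """Hypothetical fuzzy measure: a subset's weight is its relative size."""
    return lambda subset: len(subset) / n


# Assumed match degrees for the three steps of two paths being compared.
step_scores = [1.0, 0.6, 0.9]
path_score = sugeno_integral(step_scores, cardinality_measure(len(step_scores)))
```

With this particular measure the result can be read as "the largest degree d such that a fraction of at least d of the steps match to degree d or better"; a measure that weights some steps more than others would change the outcome.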

Ghent University Academic Bibliography

Spatio-temporal wardrobe generation of actor's clothing in video content

Author: E Simo-Serra
F Wang
H Wang
J Liaukonyte
K Nogueira
K Taşdemir
L Baraldi
L dos Santos Belo
M Ajmal
P Šaloun
R Achanta
SA Chatzichristofis
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Crossref

Ghent University Academic Bibliography

Evaluation campaigns and TRECVid

Author: Kraaij Wessel
Over Paul
Smeaton Alan F.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2006

The TREC Video Retrieval Evaluation (TRECVid) is an international benchmarking activity to encourage research in video information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. TRECVid completed its fifth annual cycle at the end of 2005, and in 2006 TRECVid will involve almost 70 research organizations, universities and other consortia. Throughout its existence, TRECVid has benchmarked both interactive and automatic/manual searching for shots from within a video corpus, automatic detection of a variety of semantic and low-level video features, shot boundary detection, and the detection of story boundaries in broadcast TV news. This paper gives an introduction to information retrieval (IR) evaluation from both a user and a system perspective, highlighting that system evaluation is by far the most prevalent type of evaluation carried out. We also include a summary of TRECVid as an example of a system evaluation benchmarking campaign, which allows us to discuss whether such campaigns are a good thing or a bad thing. There are arguments for and against these campaigns; we present some of them in the paper and conclude that, on balance, they have had a very positive impact on research progress.

CiteSeerX

Crossref

Irish Universities

DCU Online Research Access Service

Measuring the similarity of PML documents with RFID-based sensors

Author: Reza Malekian
Wang Ru-chuan
Wang Zhong-qin
Ye Ning
Zhao Ting-ting
Publication date: 12/09/2013

The Electronic Product Code (EPC) Network is an important part of the Internet of Things. The Physical Mark-Up Language (PML) is used to represent and describe data related to objects in the EPC Network. The PML documents that components exchange in an EPC Network system are XML documents based on the PML Core schema. To manage the huge number of PML documents for tags captured by radio-frequency identification (RFID) readers, high-performance techniques for filtering and integrating this tag data are needed. In this paper, we therefore propose an approach for measuring the similarity of PML documents based on a Bayesian Network of several sensors. Taking the features of PML into account, when measuring similarity we first remove redundant data other than the EPC information. On this basis, the Bayesian Network model derived from the structure of the PML documents being compared is constructed.

Comment: International Journal of Ad Hoc and Ubiquitous Computin

arXiv.org e-Print Archive

UPSpace at the University of Pretoria

Pairwise similarity of TopSig document signatures

Author: De Vries Christopher
Geva Shlomo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2012

This paper analyses the pairwise distances of signatures produced by the TopSig retrieval model on two document collections. The distribution of the distances is compared to that of purely random signatures. The analysis explains why TopSig is only competitive with state-of-the-art retrieval models at early precision: only the local neighbourhood of the signatures is interpretable. We suggest this is a common property of vector space models.
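
The kind of comparison this abstract describes can be sketched as follows: compute all pairwise distances for a set of binary signatures and set them against a purely random baseline. This is only an illustration under stated assumptions. It assumes signatures are fixed-width bit strings stored as Python integers and compared by Hamming distance; the 64-bit width, the sample size, and all function names are hypothetical and not taken from the paper.

```python
import random
from itertools import combinations
from typing import List, Sequence


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two signatures differ."""
    return bin(a ^ b).count("1")


def pairwise_distances(signatures: Sequence[int]) -> List[int]:
    """Hamming distance for every unordered pair of signatures."""
    return [hamming(a, b) for a, b in combinations(signatures, 2)]


def random_signatures(count: int, width: int, seed: int = 0) -> List[int]:
    """Baseline: signatures whose bits are independent fair coin flips."""
    rng = random.Random(seed)
    return [rng.getrandbits(width) for _ in range(count)]


WIDTH = 64  # assumed signature width, chosen only for the example
baseline = pairwise_distances(random_signatures(200, WIDTH))
mean_baseline = sum(baseline) / len(baseline)
```

For random signatures the distances cluster around half the signature width, so real document pairs whose distance falls well below that value are the ones that carry a similarity signal.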

Crossref

Queensland University of Technology ePrints Archive

Bayesian Network and Network Pruning Strategy for XML Duplicate Detection

Author: Ms. Trupti Patil, Siddheshwar Patil, Mis
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/11/2014

Data duplication wastes storage, costs time, and introduces inconsistency. Duplicate detection helps ensure accurate data by identifying and preventing identical or similar records. Duplicate detection has been studied extensively for relational data, but only a few solutions address more complex hierarchical structures such as XML data. Hierarchical data are sets of data items related to each other by hierarchical relationships, as in XML. In the world of XML, there are not necessarily uniform and clearly defined structures like tables, so methods devised for duplicate detection in a single relation do not directly apply to XML data. There is therefore a need for a method that detects duplicate objects in nested XML data. In the proposed system, duplicates are detected using a duplicate detection algorithm called XMLDup. The proposed XMLDup method uses a Bayesian network. It determines the probability of two XML elements being duplicates by considering both the information within the elements and the structure of that information. To improve the Bayesian network evaluation time, a pruning strategy is used. Finally, the work will be evaluated by measuring precision and recall.
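
The evaluation mentioned at the end of this abstract can be illustrated with a minimal sketch of precision and recall over detected duplicate pairs. It does not reproduce XMLDup or its Bayesian network; the function name, the treatment of pairs as unordered, and the element identifiers in the example are all assumptions made for illustration.

```python
from typing import Iterable, Tuple

Pair = Tuple[str, str]


def precision_recall(
    predicted: Iterable[Pair], actual: Iterable[Pair]
) -> Tuple[float, float]:
    """Precision and recall of predicted duplicate pairs against known ones.

    Pairs are treated as unordered, so ("a", "b") equals ("b", "a").
    """
    pred = {frozenset(p) for p in predicted}
    gold = {frozenset(p) for p in actual}
    true_positives = len(pred & gold)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical element identifiers, invented for the example.
predicted_pairs = [("a1", "a2"), ("b1", "b2"), ("c1", "d1")]
actual_pairs = [("a2", "a1"), ("b1", "b2"), ("e1", "e2"), ("f1", "f2")]
precision, recall = precision_recall(predicted_pairs, actual_pairs)
```

In this example two of the three predicted pairs are real duplicates, and two of the four real duplicate pairs are found. A pruning strategy that trades accuracy for speed would show up here as a change in recall.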

International Journal on Recent and Innovation Trends in Computing and Communication