Duplicate Detection in Probabilistic Data
Collected data often contain uncertainties. Probabilistic databases have been proposed to manage such uncertain data. To combine data from multiple autonomous probabilistic databases, an integration of probabilistic data has to be performed. Until now, however, data integration approaches have focused on the integration of certain source data (relational or XML); there has been no work on the integration of uncertain (especially probabilistic) source data. In this paper, we present a first step towards a concise consolidation of probabilistic data. We focus on duplicate detection as a representative and essential step in an integration process, and present techniques for identifying multiple probabilistic representations of the same real-world entities. Furthermore, to increase the efficiency of the duplicate detection process, we introduce search space reduction methods adapted to probabilistic data.
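The abstract does not spell out how two probabilistic representations are compared. As a purely illustrative sketch (not the authors' technique), one natural starting point is to score a candidate pair by the expected similarity of their attribute-value distributions; every name and the 0.8 threshold below are assumptions.

```python
# Hypothetical sketch: pairwise duplicate detection over probabilistic
# tuples, where each attribute is a distribution over possible values,
# e.g. {"Smith": 0.7, "Smyth": 0.3}. A pair is scored by the expected
# string similarity over both value distributions.
from difflib import SequenceMatcher

def string_sim(a: str, b: str) -> float:
    """Plain string similarity in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def expected_similarity(dist_a: dict, dist_b: dict) -> float:
    """Expected similarity over all pairs of possible attribute values."""
    return sum(pa * pb * string_sim(va, vb)
               for va, pa in dist_a.items()
               for vb, pb in dist_b.items())

def is_duplicate(t_a: dict, t_b: dict, threshold: float = 0.8) -> bool:
    """Average per-attribute expected similarity over shared attributes."""
    attrs = t_a.keys() & t_b.keys()
    score = sum(expected_similarity(t_a[k], t_b[k]) for k in attrs) / len(attrs)
    return score >= threshold

t1 = {"name": {"Smith": 0.7, "Smyth": 0.3}, "city": {"Berlin": 1.0}}
t2 = {"name": {"Smith": 0.9, "Schmidt": 0.1}, "city": {"Berlin": 0.8, "Bonn": 0.2}}
print(is_duplicate(t1, t2))
```

A search space reduction method in this setting would then prune candidate pairs before this comparison is ever run, for example by blocking on high-probability attribute values.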
Standardization and application of microsatellite markers for variety identification in tomato and wheat
The present study is part of an EU project that aims to demonstrate the technical viability of STMS markers for variety identification. As examples, two important European crop species, tomato and wheat, were chosen. Initially, about 30-40 STMS markers were used to identify a set of 20 good markers per crop and to standardise the methodology and the interpretation of the results in different laboratories. Several systems were used for the detection of STMS polymorphisms. The selected STMS markers are being tested on 500 varieties of each species, and databases are being constructed. The first comparisons of data generated by the different laboratories revealed a high degree of agreement. The causes of discrepancies between duplicate samples analysed in different laboratories, and precautions to prevent them, are discussed.
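To make the cross-laboratory comparison concrete, the sketch below flags markers whose allele-size calls differ between two laboratories for the same duplicate sample; the data layout, marker names and function name are illustrative assumptions, not the project's actual pipeline.

```python
# Hypothetical sketch: compare duplicate-sample genotypes from two
# laboratories. A profile maps each STMS marker to its called allele
# sizes in base pairs; marker names here are invented placeholders.

def find_discrepancies(profile_lab1: dict, profile_lab2: dict) -> list:
    """Return the markers whose allele calls differ between the labs."""
    shared = profile_lab1.keys() & profile_lab2.keys()
    return sorted(m for m in shared
                  if set(profile_lab1[m]) != set(profile_lab2[m]))

lab1 = {"marker_A": [152, 158], "marker_B": [201, 201]}
lab2 = {"marker_A": [152, 158], "marker_B": [199, 201]}  # sizing shift
print(find_discrepancies(lab1, lab2))  # ['marker_B']
```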
Ranking News-Quality Multimedia
News editors need to find the photos that best illustrate a news piece and fulfill news-media quality standards, while also being pressed to find the most recent photos of live events. Recently, it has become common to use social-media content in the context of news media for its unique value in terms of immediacy and quality. Consequently, the number of images to be considered and filtered through is now too large to be handled by a person. To aid the news editor in this process, we propose a framework designed to deliver high-quality, news-press-type photos to the user. The framework is composed of two parts: a ranking algorithm tuned to rank professional media highly, and a visual SPAM detection module designed to filter out low-quality media. The core ranking algorithm leverages aesthetic, social and deep-learning semantic features. Evaluation showed that the proposed framework is effective at finding high-quality photos (true-positive rate), achieving a retrieval MAP of 64.5% and a classification precision of 70%.
Comment: To appear in ICMR'1
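The abstract describes the two-stage design (filter, then rank) without its combination rule. The following is a minimal sketch under stated assumptions: the feature names, the spam threshold and the linear weights are all invented for illustration and are not the paper's model.

```python
# Hypothetical sketch of the two-stage pipeline: a visual-SPAM filter
# followed by a ranker combining aesthetic, social and deep-learning
# semantic scores. Weights and thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Photo:
    url: str
    aesthetic: float   # aesthetic-quality score in [0, 1]
    social: float      # social-signal score in [0, 1]
    semantic: float    # semantic relevance to the piece in [0, 1]
    spam_score: float  # output of the visual-SPAM classifier in [0, 1]

def is_spam(photo: Photo, threshold: float = 0.5) -> bool:
    """Stage 1: drop low-quality media flagged by the SPAM module."""
    return photo.spam_score >= threshold

def rank_score(photo: Photo) -> float:
    """Stage 2: linear combination of feature scores (assumed weights)."""
    return 0.4 * photo.aesthetic + 0.2 * photo.social + 0.4 * photo.semantic

def rank_photos(photos: list) -> list:
    """Filter, then rank professional-looking media highly."""
    return sorted((p for p in photos if not is_spam(p)),
                  key=rank_score, reverse=True)
```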
Localization Recall Precision (LRP): A New Performance Metric for Object Detection
Average precision (AP), the area under the recall-precision (RP) curve, is the standard performance measure for object detection. Despite its wide acceptance, it has a number of shortcomings, the most important of which are (i) the inability to distinguish very different RP curves, and (ii) the lack of a direct measure of bounding-box localization accuracy. In this paper, we propose 'Localization Recall Precision (LRP) Error', a new metric specifically designed for object detection. LRP Error is composed of three components related to localization, false negative (FN) rate and false positive (FP) rate. Based on LRP, we introduce the 'Optimal LRP', the minimum achievable LRP error, representing the best achievable configuration of the detector in terms of recall-precision and the tightness of the boxes. In contrast to AP, which considers precisions over the entire recall domain, Optimal LRP determines the 'best' confidence score threshold for a class, which balances the trade-off between localization and recall-precision. In our experiments, we show that, for state-of-the-art (SOTA) object detectors, Optimal LRP provides richer and more discriminative information than AP. We also demonstrate that the best confidence score thresholds vary significantly among classes and detectors. Moreover, we present LRP results of a simple online video object detector that uses a SOTA still-image object detector, and show that the class-specific optimized thresholds increase accuracy relative to the common approach of using a general threshold for all classes. At https://github.com/cancam/LRP we provide source code that can compute LRP for the PASCAL VOC and MSCOCO datasets; it can easily be adapted to other datasets as well.
Comment: to appear in ECCV 201
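As a hedged illustration of the metric's structure, the sketch below computes LRP for one class at a fixed confidence threshold using the commonly cited formulation, LRP = [Σ_TP (1 − IoU_i)/(1 − τ) + N_FP + N_FN] / (N_TP + N_FP + N_FN), where τ is the IoU threshold for counting a detection as a true positive; treat the linked repository as the authoritative definition.

```python
# Hedged sketch of the LRP error for a single class at one confidence
# threshold. tp_ious holds the IoU of each matched (true-positive)
# detection; tau is the IoU validity threshold. Not the official code.

def lrp_error(tp_ious: list, n_fp: int, n_fn: int, tau: float = 0.5) -> float:
    n_tp = len(tp_ious)
    total = n_tp + n_fp + n_fn
    if total == 0:
        return 0.0
    localization = sum((1.0 - iou) / (1.0 - tau) for iou in tp_ious)
    return (localization + n_fp + n_fn) / total

# A perfect detector scores 0; pure FPs/FNs score 1.
print(lrp_error(tp_ious=[0.9, 0.75, 0.6], n_fp=1, n_fn=2))
```

Optimal LRP would then be the minimum of this quantity over the detector's confidence-score thresholds, evaluated per class.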
Maximum Production Of Transmission Messages Rate For Service Discovery Protocols
Minimizing the number of dropped User Datagram Protocol (UDP) messages in a network is regarded as a challenge by researchers. This issue poses serious problems for many protocols, particularly those that depend on sending messages as part of their strategy, such as service discovery protocols. This paper proposes and evaluates an algorithm to predict the minimum period of time required between two or more consecutive messages, and suggests minimum queue sizes for the routers, in order to manage the traffic and minimise the number of dropped messages caused by congestion, queue overflow, or both. The algorithm was applied to the Universal Plug and Play (UPnP) protocol using the ns-2 simulator, and was tested with the routers connected in two configurations, centralized and decentralized. The message length and the bandwidth of the links among the routers were taken into consideration. The results show an improvement in the number of dropped messages among the routers.
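The abstract does not reproduce the prediction algorithm itself; the sketch below shows only the first-order arithmetic such an algorithm builds on: a message's serialization delay on the bottleneck link lower-bounds the spacing between consecutive messages if the router queue is not to grow. The names and example figures are assumptions.

```python
# Hypothetical sketch: lower bound on the period between consecutive
# UDP messages so that a bottleneck router's queue does not build up.
# This is the standard serialization-delay bound, not the paper's
# actual prediction algorithm.

def min_inter_message_period(msg_bytes: int, bottleneck_bps: float) -> float:
    """Seconds to serialize one message on the slowest link; sending
    faster than this guarantees queue growth at that router."""
    return (msg_bytes * 8) / bottleneck_bps

# Example: 512-byte discovery messages over a 1 Mbit/s link.
period = min_inter_message_period(512, 1_000_000)
print(f"space messages at least {period * 1000:.2f} ms apart")  # ~4.10 ms
```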
An automated wrapper-based approach to the design of dependable software
The design of dependable software systems invariably comprises two main activities: (i) the design of dependability mechanisms, and (ii) the location of dependability mechanisms. It has been shown that these activities are intrinsically difficult. In this paper, we propose an automated wrapper-based methodology to circumvent the problems associated with the design and location of dependability mechanisms. To achieve this, we replicate important variables so that they can be used as part of standard, efficient dependability mechanisms; these well-understood mechanisms are then deployed in all relevant locations. To validate the proposed methodology, we apply it to three complex software systems, evaluating the dependability enhancement and execution overhead in each case. The results demonstrate that the system failure rate of a wrapped software system can be several orders of magnitude lower than that of an unwrapped equivalent.
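The abstract names variable replication as the core wrapper mechanism without detailing it. Below is a minimal sketch of one standard realization, triple redundancy with majority voting on read; the class and its interface are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: a wrapper that replicates an important variable
# and masks a single corrupted copy by majority vote, the kind of
# well-understood mechanism the methodology deploys automatically.

class ReplicatedVar:
    def __init__(self, value):
        self._copies = [value, value, value]  # triple redundancy

    def write(self, value):
        self._copies = [value, value, value]

    def read(self):
        """Majority vote over the copies, then scrub the odd one out."""
        a, b, c = self._copies
        majority = a if a in (b, c) else b
        self._copies = [majority] * 3
        return majority

v = ReplicatedVar(42)
v._copies[1] = 999   # simulate corruption of one replica
print(v.read())      # 42: the single fault is masked
```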