Search CORE

67 research outputs found

Analysis Study of Classification Techniques for Web Services

Author: Duaa Waleed Ata Faroun
دعاء وليد عطا فرعون
Publication venue: جامعة القدس
Publication date
Field of study

Efficient Algorithms for Fast Integration on Large Data Sets from Multiple Sources

Author: Aseltine Robert H.
Mi Tian
Rajasekaran Sanguthevar
Publication venue: OpenCommons@UConn
Publication date: 28/06/2012
Field of study

Background Recent large scale deployments of health information technology have created opportunities for the integration of patient medical records with disparate public health, human service, and educational databases to provide comprehensive information related to health and development. Data integration techniques, which identify records belonging to the same individual that reside in multiple data sets, are essential to these efforts. Several algorithms have been proposed in the literatures that are adept in integrating records from two different datasets. Our algorithms are aimed at integrating multiple (in particular more than two) datasets efficiently. Methods Hierarchical clustering based solutions are used to integrate multiple (in particular more than two) datasets. Edit distance is used as the basic distance calculation, while distance calculation of common input errors is also studied. Several techniques have been applied to improve the algorithms in terms of both time and space: 1) Partial Construction of the Dendrogram (PCD) that ignores the level above the threshold; 2) Ignoring the Dendrogram Structure (IDS); 3) Faster Computation of the Edit Distance (FCED) that predicts the distance with the threshold by upper bounds on edit distance; and 4) A pre-processing blocking phase that limits dynamic computation within each block. Results We have experimentally validated our algorithms on large simulated as well as real data. Accuracy and completeness are defined stringently to show the performance of our algorithms. In addition, we employ a four-category analysis. Comparison with FEBRL shows the robustness of our approach. Conclusions In the experiments we conducted, the accuracy we observed exceeded 90% for the simulated data in most cases. 97.7% and 98.1% accuracy were achieved for the constant and proportional threshold, respectively, in a real dataset of 1,083,878 records

Springer - Publisher Connector

PubMed Central

DigitalCommons@UConn

OpenCommons at University of Connecticut

An Updated Survey on Search-Based Software Design

Author: Räihä Outi
Publication venue
Publication date: 01/01/2009
Field of study

Trepo - Institutional Repository of Tampere University

Cross-language Ontology Learning: Incorporating and Exploiting Cross-language Data in the Ontology Learning Process

Author: Hjelm Hans
Publication venue
Publication date: 01/01/2009
Field of study

Hans Hjelm. Cross-language Ontology Learning: Incorporating and Exploiting Cross-language Data in the Ontology Learning Process. NEALT Monograph Series, Vol. 1 (2009), 159 pages. © 2009 Hans Hjelm. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/10126

Publikationer från Stockholms universitet

Digitala Vetenskapliga Arkivet - Academic Archive On-line

DSpace at Tartu University Library

Learning Ontology Relations by Combining Corpus-Based Techniques and Reasoning on Data from Semantic Web Sources

Author: Wohlgenannt Gerhard
Publication venue: 'Peter Lang, International Academic Publishers'
Publication date: 01/04/2020
Field of study

The manual construction of formal domain conceptualizations (ontologies) is labor-intensive. Ontology learning, by contrast, provides (semi-)automatic ontology generation from input data such as domain text. This thesis proposes a novel approach for learning labels of non-taxonomic ontology relations. It combines corpus-based techniques with reasoning on Semantic Web data. Corpus-based methods apply vector space similarity of verbs co-occurring with labeled and unlabeled relations to calculate relation label suggestions from a set of candidates. A meta ontology in combination with Semantic Web sources such as DBpedia and OpenCyc allows reasoning to improve the suggested labels. An extensive formal evaluation demonstrates the superior accuracy of the presented hybrid approach

Directory of Open Access Books (DOAB)

Place Recognition for Mobile Robot in Changing Environments

Author: Cao Juan
Publication venue
Publication date: 01/01/2016
Field of study

Aberystwyth Research Portal