
THALIA: Test Harness for the Assessment of Legacy Information Integration Approaches (submitted to ICDE 2005)

By Joachim Hammer, Mike Stonebraker and Oguzhan Topsakal


We introduce a new, publicly available testbed and benchmark called THALIA (Test Harness for the Assessment of Legacy Information Integration Approaches) to simplify the evaluation of existing integration technologies and to enable more thorough and more focused testing. THALIA provides researchers with a collection of downloadable data sources representing university course catalogs, a set of twelve benchmark queries, and a scoring function for ranking the performance of an integration system. Our benchmark focuses on syntactic and semantic heterogeneities, since we believe they still pose the greatest technical challenges. A second important contribution of this paper is a systematic classification of the different types of syntactic and semantic heterogeneities, which leads directly to the twelve queries that make up the benchmark. A sample evaluation of two integration systems at the end of the paper is intended to show the usefulness of THALIA in identifying the problem areas that need the greatest attention from our research community if we want to improve the usefulness of today’s integration systems.

Year: 2011
OAI identifier: oai:CiteSeerX.psu:
Provided by: CiteSeerX
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://www.cise.ufl.edu/resear... (external link)