Extraction of Informations From Highly Heterogeneous Source of Textual Data
- Publication date
- 1997
Abstract
Extracting information from multiple, highly heterogeneous sources of textual data and integrating it in order to provide true information is a challenging research topic in the database area. To illustrate the problems and solutions, TSIMMIS, one of the most interesting projects addressing this problem, is presented. Furthermore, a Description Logics approach, able to provide interesting solutions both for data integration and data querying, is introduced.

1 Introduction

The availability of large numbers of networked information sources (and the recent explosion of the Internet) makes it possible to access a very large number of information sources all over the world. As a consequence of the increased amount of available information, the set of potentially interesting sites for a given query is very large, but only very few of those sites are really relevant. Furthermore, the information is highly heterogeneous both in its structure and in its origin. In particular, n..