Planning to Gather Information
Abstract

The exponential growth of the Internet has produced a labyrinth of documents, databases and services. While almost any type of information is available somewhere, even expert users waste time and effort searching for appropriate information sources and phrasing queries in the custom formats required by each site. To make matters worse, many queries can only be answered by combining information from several different sites. This paper describes Occam, a query planning algorithm that determines the best way to integrate data from different sources. As input, Occam takes a library of site descriptions and a user query. As output, Occam automatically generates one or more plans that encode alternative ways to gather the requested information. Occam has several important features: (1) it integrates both legacy systems and full relational databases with an efficient, domain-independent, query-planning algorithm, (2) it reasons about the capabilities of different information sources, (3) ..