27 research outputs found

    Progressive Optimization in Action

    Information integration:

    The theme for this special issue, information integration, reflects the growing importance of integration in general, and data integration in particular, as a driving force in information technology spending. This essay discusses information integration along three axes: data types, federation, and intelligence. Several important problem areas are emerging: storage and retrieval of XML (Extensible Markup Language) documents, federation and distribution across data sources, and holistic intelligence across different data modalities. This special issue is devoted to papers on many of these topics, and we expect this to be an active area of research for many years to come. Integration is the driving force of this decade of IT (information technology) spending. As enterprises buy more and more packaged applications, it is estimated that the task of combining these application “silos” accounts for over 40 percent of IT spending, even though the amount of code written for integration is significantly smaller than 40 percent. This is because integration projects tend to be one-of-a-kind and complex to write. The question for software and services vendors is this: can the cost of integration be reduced to be more in line with that of packaged applications? The essay is organized as follows. This section describes four integration models. The next section gives an overview of information integration. Following sections then explore some of the technical challenges along the three axes that are the basis fo

    Extensible Query Processing in Starburst

    Today’s DBMSs are unable to support the increasing demands of the various applications that would like to use a DBMS. Each kind of application poses new requirements for the DBMS. The Starburst project at IBM’s Almaden Research Center aims to extend relational DBMS technology to bridge this gap between applications and the DBMS. While providing a full-function relational system to enable sharing across applications, Starburst will also allow (sophisticated) programmers to add many kinds of extensions to the base system’s capabilities, including language extensions (e.g., new datatypes and operations), data management extensions (e.g., new access and storage methods) and internal processing extensions (e.g., new join methods and new query transformations). To support these features, the database query language processor must be very powerful and highly extensible. Starburst’s language processor features a powerful query language, rule-based optimization and query rewrite, and an execution system based on an extended relational algebra. In this paper, we describe the design of Starburst’s query language processor and discuss the ways in which the language processor can be extended to achieve Starburst’s goals.

    Performance Analysis of a Parallel Sort Merge Join on Cluster Architectures

    Executing SPARQL queries over the web of linked data

    The Web of Linked Data forms a single, globally distributed dataspace. Due to the openness of this dataspace, it is not possible to know in advance all data sources that might be relevant for query answering. This openness poses a new challenge that is not addressed by traditional research on federated query processing. In this paper we present an approach to execute SPARQL queries over the Web of Linked Data. The main idea of our approach is to discover data that might be relevant for answering a query during the query execution itself. This discovery is driven by following RDF links between data sources based on URIs in the query and in partial results. The URIs are resolved over the HTTP protocol into RDF data, which is continuously added to the queried dataset. This paper describes concepts and algorithms to implement our approach using an iterator-based pipeline. We introduce a formalization of the pipelining approach and show that classical iterators may cause blocking due to the latency of HTTP requests. To avoid blocking, we propose an extension of the iterator paradigm. The evaluation of our approach shows its strengths as well as the challenges that remain.
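
The discovery-during-execution idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `MOCK_WEB`, `Dataset`, `pattern_iterator`, and `execute` are hypothetical names, HTTP lookups are replaced by an in-memory dictionary, and the lookups are synchronous, so this corresponds to the classical (potentially blocking) iterator pipeline rather than the non-blocking extension the paper proposes.

```python
from typing import Dict, Iterable, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for the Web: URI -> triples obtained by dereferencing it.
MOCK_WEB: Dict[str, List[Triple]] = {
    "ex:alice": [("ex:alice", "foaf:knows", "ex:bob")],
    "ex:bob": [("ex:bob", "foaf:name", "Bob")],
}


class Dataset:
    """Queried dataset that grows while the query is being executed."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()
        self.seen: Set[str] = set()

    def dereference(self, term: str) -> None:
        # Look up each term at most once; unknown terms simply add nothing.
        if term in self.seen:
            return
        self.seen.add(term)
        self.triples.update(MOCK_WEB.get(term, []))


def is_var(term: str) -> bool:
    return term.startswith("?")


def match(pattern: Triple, triple: Triple, binding: Binding) -> Optional[Binding]:
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    result = dict(binding)
    for p, v in zip(pattern, triple):
        if is_var(p):
            if result.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return result


def pattern_iterator(pattern: Triple, upstream: Iterable[Binding],
                     dataset: Dataset) -> Iterator[Binding]:
    """One pipeline stage: evaluates a triple pattern per upstream binding."""
    for binding in upstream:
        bound = tuple(binding.get(t, t) for t in pattern)
        # Discovery step: resolve every constant term before matching.
        for term in bound:
            if not is_var(term):
                dataset.dereference(term)
        # Snapshot the set, since later stages add triples while we are suspended.
        for triple in list(dataset.triples):
            extended = match(bound, triple, binding)
            if extended is not None:
                yield extended


def execute(patterns: List[Triple]) -> List[Binding]:
    dataset = Dataset()
    pipeline: Iterable[Binding] = iter([{}])
    for pattern in patterns:
        pipeline = pattern_iterator(pattern, pipeline, dataset)
    return list(pipeline)


query = [("ex:alice", "foaf:knows", "?p"), ("?p", "foaf:name", "?n")]
print(execute(query))
```

In this sketch the second pattern can only be answered because the binding for `?p` produced by the first stage is dereferenced and its triples are added to the dataset mid-execution. With real HTTP requests, each `dereference` call would stall the whole pipeline, which is the blocking behaviour the abstract refers to.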

    The SPARQL Query Graph Model for Query Optimization
