Efficient Self-Join Algorithm in Interval-based Temporal Data Models

Abstract

The interval-based temporal data model is a popular data model in temporal databases. It represents the validity period of a tuple as a time interval, which makes self-joins unavoidable when tuples for an object are combined: k conjunctive conditions require a (k+1)-way self-join. Joins are among the most expensive database operations, and the cost is even more severe in temporal databases because the data keeps growing. Many join algorithms exist for temporal databases, but they focus on joining different inputs rather than an identical input, so the same input is scanned multiple times. Advanced 2-way join algorithms avoid quadratic disk I/O complexity, but their performance is affected by the number of self-joins and by partition sizes. In this paper, we address the problem of self-joins in the interval-based temporal data model and introduce a stream-based self-join algorithm. The proposed algorithm achieves a single relation scan for a k-way self-join, and its performance is not affected by partition sizes.
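
To make the self-join problem concrete, the following is a minimal illustrative sketch and not the stream-based algorithm proposed in the paper. It assumes a hypothetical relation that stores one tuple per object attribute, each with a half-open validity interval [start, end); the names Fact, overlap, and self_join are invented for this illustration. Combining two attributes of the same object then means joining the relation with itself and intersecting the intervals, and the naive nested loop below rescans the same input for every left tuple.

```python
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    """Hypothetical interval-stamped tuple: one attribute value of one object."""
    obj: str
    attr: str
    value: str
    start: int  # validity start, inclusive (assumed convention)
    end: int    # validity end, exclusive (assumed convention)


def overlap(a: Fact, b: Fact) -> Optional[tuple]:
    """Return the intersection of two validity intervals, or None if disjoint."""
    lo = max(a.start, b.start)
    hi = min(a.end, b.end)
    return (lo, hi) if lo < hi else None


def self_join(relation: list, attr_a: str, attr_b: str) -> list:
    """Naive 2-way self-join: pair attr_a and attr_b tuples of the same object
    whose validity intervals overlap. The relation is scanned once per left tuple."""
    out = []
    for left in relation:
        if left.attr != attr_a:
            continue
        for right in relation:
            if right.attr != attr_b or right.obj != left.obj:
                continue
            span = overlap(left, right)
            if span is not None:
                out.append((left.obj, left.value, right.value, span[0], span[1]))
    return out


facts = [
    Fact("e1", "dept", "sales", 0, 10),
    Fact("e1", "salary", "50k", 5, 15),
    Fact("e1", "salary", "60k", 15, 20),
    Fact("e2", "dept", "ops", 0, 8),
]
result = self_join(facts, "dept", "salary")
print(result)
```

Each additional condition in such a query would add another pass over the same relation in this naive form, which is the repeated scanning that the abstract identifies as the cost to avoid.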
