Ratio Rule Mining from Multiple Data Sources

Abstract

Both multiple-source data mining and streaming data mining have attracted much attention in the past decade. In contrast to traditional association-rule mining, a new paradigm called the Ratio Rule (RR) was recently proposed to capture quantitative association knowledge. We extend this framework to mining ratio rules from multiple source data streams, which is a novel and challenging problem. The traditional technique used for ratio rule mining is an eigen-system analysis, which can often fall victim to noise. Multiple data sources impose the additional constraint that the mining procedure be robust in the presence of noise, because it is difficult to clean all the data sources in real time in real-world tasks. In addition, the traditional batch methods for ratio rules cannot cope with data streams. In this paper, we propose an integrated method for mining ratio rules from data streams from multiple data sources: we first mine the ratio rules from each data source separately through a novel robust and adaptive one-pass algorithm, called Robust and Adaptive Ratio Rule (RARR), and then integrate the rules of each data source in a simple probabilistic model with a rule-clustering procedure. In this way, we can acquire the global rules from all the local information sources incrementally. We show that the RARR can converge to a fixed point and i
