    Regression on Evolving Multi-Relational Data Streams

    In the last decade, researchers have recognized the need for increased attention to a class of knowledge discovery applications in which the data analyzed is not finite, but streams into the system continuously and endlessly. Data streams are ubiquitous, entering almost every area of modern life. As a result, processing, managing and learning from multiple data streams have become important and challenging tasks for the data mining, database and machine learning communities. Although a substantial body of algorithms for processing and learning from data streams has been developed, most of the work focuses on one-dimensional numerical data streams (time series) or on a single multi-dimensional data stream. Only a few of the existing solutions consider the more realistic scenario in which data can be incomplete, can be correlated with other streams of information, and can arrive from multiple heterogeneous sources. This paper discusses the requirements and difficulties of learning from multiple multi-dimensional data streams interlinked according to a pre-defined semantic schema (multi-relational data streams). The main research problem is to develop a time-efficient, resource-aware methodology for linking and exploring information that arrives independently and asynchronously from its respective sources. The resulting framework has to provide, at any time, error-bounded approximate answers to the aggregate queries commonly issued in the process of multi-relational data mining. In particular, we focus on the task of learning regression trees and their variants (model trees, option trees, multi-target trees) from multiple correlated streaming sources. To the best of our knowledge, no other work has previously addressed the problem of learning regression trees from multi-relational data streams.
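
To make the phrase "at any time, error-bounded approximate answers to aggregate queries" concrete, the following is a minimal sketch of one common way to attach an error bound to a streaming average. It is not the paper's method: the class name `BoundedRunningMean`, its parameters, and the choice of the Hoeffding inequality are all assumptions made for illustration. It assumes stream values are independent and lie in an interval of known width `value_range`.

```python
import math


class BoundedRunningMean:
    """Hypothetical helper: running mean with a Hoeffding error bound.

    Assumes values are i.i.d. and confined to an interval of width
    `value_range`. With probability at least 1 - delta, the true mean
    lies within `epsilon` of the running mean.
    """

    def __init__(self, value_range: float, delta: float = 0.05) -> None:
        if value_range <= 0 or not (0 < delta < 1):
            raise ValueError("value_range must be > 0 and delta in (0, 1)")
        self.value_range = value_range
        self.delta = delta
        self.count = 0
        self.mean = 0.0

    def update(self, x: float) -> None:
        # Incremental mean update: constant memory, one pass over the stream.
        self.count += 1
        self.mean += (x - self.mean) / self.count

    def answer(self) -> tuple[float, float]:
        # Two-sided Hoeffding bound: eps = R * sqrt(ln(2/delta) / (2n)).
        if self.count == 0:
            return 0.0, math.inf
        eps = self.value_range * math.sqrt(
            math.log(2.0 / self.delta) / (2.0 * self.count)
        )
        return self.mean, eps


# Usage: the query can be answered after any number of arrivals.
agg = BoundedRunningMean(value_range=1.0, delta=0.05)
for i in range(100):
    agg.update((i % 10) / 10.0)  # synthetic values in [0, 0.9]
estimate, epsilon = agg.answer()
```

The bound shrinks at a rate of one over the square root of the number of observations, so an answer is available at every point in the stream and tightens as more data arrives. Aggregates that join several asynchronous streams, as in the multi-relational setting described above, would need additional machinery that this sketch does not cover.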