
    Using Probabilistic Relational Models to Generate Synthetic Spatial or Non-spatial Databases

    When real datasets are difficult to obtain for tasks such as system analysis or algorithm evaluation, synthetic datasets are commonly used. Existing generation techniques often produce random data for single-table datasets, which are often inapplicable when evaluating data mining or machine learning algorithms that deal with relational data. To address this, our earlier work generated relational datasets from Probabilistic Relational Models (PRMs), a framework for dealing with probabilistic uncertainty in relational domains. In this article, we extend this work by proposing more efficient data sampling algorithms and by using a spatial extension of PRMs to generate synthetic spatial datasets. We also present an experimental analysis of three data sampling algorithms applicable in our method, and of the quality of the datasets they generate.
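To make the idea of drawing a relational dataset from a probabilistic model concrete, the following is a minimal sketch and not the authors' method. The abstract does not name its three sampling algorithms, so plain forward (ancestral) sampling is assumed here. The two-table schema (`User`, `Rating`), the attribute names, the probability tables, and the uniform choice of foreign keys are all hypothetical.

```python
import random

# Hypothetical toy model with two classes linked by a foreign key:
#   User(id, segment)   and   Rating(id, user_id, score)
# Rating.score depends on the segment of the user it references.
SEGMENT_PRIOR = {"casual": 0.7, "expert": 0.3}
SCORE_GIVEN_SEGMENT = {
    "casual": {"low": 0.2, "high": 0.8},
    "expert": {"low": 0.6, "high": 0.4},
}


def draw(rng, dist):
    """Sample one value from a {value: probability} table."""
    values = list(dist)
    weights = [dist[v] for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def generate(n_users, n_ratings, seed=0):
    """Forward-sample both tables, parents before children."""
    rng = random.Random(seed)

    # Step 1: parent table, attributes drawn from their prior.
    users = [{"id": i, "segment": draw(rng, SEGMENT_PRIOR)}
             for i in range(n_users)]

    # Step 2: child table. The foreign key is picked uniformly (an
    # assumption), then the attribute is drawn conditionally on the
    # referenced parent row.
    ratings = []
    for j in range(n_ratings):
        user = rng.choice(users)
        score = draw(rng, SCORE_GIVEN_SEGMENT[user["segment"]])
        ratings.append({"id": j, "user_id": user["id"], "score": score})
    return users, ratings


if __name__ == "__main__":
    users, ratings = generate(n_users=5, n_ratings=20, seed=1)
    print(len(users), "users,", len(ratings), "ratings")
```

Because each `Rating` row is sampled conditionally on the `User` row it points to, the cross-table dependency is preserved, which independent single-table generation would not capture.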