Massive Data Exploration using Estimated Cardinalities

Abstract

International audienceLinguistic summaries are used in this work to provide personalized exploration functionalities on massive relational data. To ensure a fluid exploration of the data, cardinalities of the data properties described in the summaries are estimated from statistics about the data distribution. The proposed workflow also involves a vocabulary inference mechanism from these statistics and a sampling-based approach to consolidate the estimated cardinalities. The paper shows that soft computing techniques are particularly relevant to build concrete and functional business intelligence solutions

    Similar works