Probabilistic Query Models for Transaction Data

Authors: Dmitry Pavlov, Padhraic Smyth
Publication venue: ACM Press
Publication date: 01/01/2001

We investigate the application of Bayesian networks, Markov random fields, and mixture models to the problem of query answering for transaction data sets. We formulate two versions of the querying problem: the query selectivity problem (i.e., finding exact counts for tuples in a database) and the query generalization problem (i.e., computing the probability that a tuple will occur in new data). We show that frequent itemsets are useful for reducing the original data to a compressed representation and introduce a way to store them using an ADTree data structure. In an extension of our earlier work on this topic, we propose several new schemes for query answering based on the compressed representation that avoid direct scans of the data at query time. Experimental results on real-world transaction data sets provide insights into various tradeoffs involving the offline time for model-building, the online time for query-answering, the memory footprint of the compressed data, and the accuracy of..

CiteSeerX

Crossref

Probabilistic Query Models for Transaction Data

Authors: Dmitry Pavlov, Padhraic Smyth
Publication date: 01/01/2001

We investigate the application of Bayesian networks, Markov random fields, and mixture models to the problem of query answering for transaction data sets. We formulate two versions of the querying problem: the query selectivity estimation problem (i.e., finding exact counts for tuples in a data set) and the query generalization problem (i.e., computing the probability that a tuple will occur in new data). We show that frequent itemsets are useful for reducing the original data to a compressed representation and introduce a method to store them using an ADTree data structure. In an extension of our earlier work on this topic, we propose several new schemes for query answering based on the compressed representation that avoid direct scans of the data at query time. Experimental results on real-world transaction data sets provide insights into various tradeoffs involving the offline time for model-building, the online time for query-answering, the memory footprint of the compressed data, and the accuracy of the estimate provided to the query
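
To make the distinction in the abstract concrete, the following is a minimal sketch of the two ways a count query can be answered: by scanning the raw transactions, or from a compressed summary of frequent itemsets. It is not the authors' method. The toy data, the function names, and the brute-force itemset enumeration are all hypothetical, and the estimator shown is a plain independence model (simpler than the Bayesian networks, Markov random fields, and mixture models the abstract names); no ADTree is built.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def exact_count(query, data):
    """Answer a count query with a direct scan of the data."""
    return sum(1 for t in data if query <= t)

def frequent_itemsets(data, min_support, max_size=2):
    """Brute-force offline step: keep itemsets seen at least min_support times."""
    counts = Counter()
    for t in data:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(t), k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}

def independence_estimate(query, itemsets, n):
    """Estimate a count from single-item supports only, assuming independence."""
    p = 1.0
    for item in query:
        p *= itemsets.get(frozenset([item]), 0) / n
    return n * p

n = len(transactions)
summary = frequent_itemsets(transactions, min_support=2)
query = {"bread", "milk"}
scan_answer = exact_count(query, transactions)            # touches the raw data
model_answer = independence_estimate(query, summary, n)   # uses the summary only
```

On this toy data the scan and the model disagree slightly, which illustrates the accuracy side of the tradeoff the abstract mentions: the summary is smaller and avoids the scan, but the answer becomes an estimate.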

CiteSeerX

Probabilistic Query Models for Transaction Data

Author: Dmitry Pavlov

pavlovd @ ics.uci.ed

CiteSeerX