High-utility itemset mining for subadditive monotone utility functions
High-utility Itemset Mining (HUIM) finds itemsets from a transaction database
with utility no less than a user-defined threshold, where the utility of an
itemset is defined as the sum of the item-wise utilities. In this paper, we
generalize this notion to utility functions that need not be simple sums of
individual utilities. In particular, we study generalized utility functions
that are subadditive and monotone (SM). We also describe a novel function that
allows us to include external information in the form of a relationship graph
for computing utility. Next, we focus on algorithms for HUIM problems with SM
utility functions. Existing HUIM algorithms use upper bounds such as
"Transaction Weighted Utility" and "Exact-Utility, Remaining-Utility" for
efficient search-space exploration. We derive analogous and tighter
upper bounds for SM utility functions. We design a novel inverted-list data
structure called SMI-list and a new algorithm called SM-Miner to mine HUIs for
SM functions. We explain how existing tree-based and projection-based HUIM
algorithms can be adapted using these bounds. We experimentally compare
adaptations of some of the latest HUIM algorithms and point out some caveats
that should be kept in mind while handling utility functions that allow
integration of domain knowledge with a transaction database.
Comment: Pre-print of our pape
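
The idea of pruning with an upper bound under a subadditive monotone utility can be illustrated with a small sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy database, the choice of the square root of the summed item utilities as the SM function, and the names `sm_utility`, `twu_style_bound`, and `mine`. The bound used here is only a simple Transaction-Weighted-Utility-style bound that follows from monotonicity; it is not the tighter bound, the SMI-list structure, or the SM-Miner algorithm that the abstract describes.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy database: each transaction maps item -> nonnegative item utility.
DB = [
    {"a": 4.0, "b": 9.0},
    {"a": 1.0, "b": 16.0, "c": 9.0},
    {"b": 4.0, "c": 1.0},
]

def sm_utility(values):
    """Assumed SM function: sqrt of the sum of nonnegative item utilities.
    It is monotone (more items never lower it) and subadditive."""
    return sqrt(sum(values))

def utility(itemset, db):
    """Utility of an itemset: SM utility summed over transactions containing it."""
    return sum(sm_utility(t[i] for i in itemset)
               for t in db if all(i in t for i in itemset))

def twu_style_bound(itemset, db):
    """Upper bound for the itemset and all its supersets: by monotonicity, no
    itemset inside a transaction can exceed the whole transaction's utility."""
    return sum(sm_utility(t.values())
               for t in db if all(i in t for i in itemset))

def mine(db, threshold):
    """Level-wise search that skips candidates whose bound is below the threshold."""
    items = sorted({i for t in db for i in t})
    result = {}
    frontier = [(i,) for i in items]
    while frontier:
        next_frontier = []
        for cand in frontier:
            if twu_style_bound(cand, db) < threshold:
                continue  # no superset of cand can reach the threshold
            u = utility(cand, db)
            if u >= threshold:
                result[cand] = u
            # extend only with lexicographically larger items to avoid duplicates
            next_frontier.extend(cand + (i,) for i in items if i > cand[-1])
        frontier = next_frontier
    return result

def brute_force(db, threshold):
    """Reference answer that enumerates every itemset without pruning."""
    items = sorted({i for t in db for i in t})
    return {c: utility(c, db)
            for k in range(1, len(items) + 1)
            for c in combinations(items, k)
            if utility(c, db) >= threshold}
```

On this toy data, `mine(DB, 7.0)` and `brute_force(DB, 7.0)` return the same itemsets, while the pruned search never evaluates the utility of candidates such as `("a", "c")` whose bound already falls below the threshold.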