55,814 research outputs found
Set-Oriented Mining for Association Rules in Relational Databases
Describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss the optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. SETM uses only simple database primitives, viz. sorting and merge-scan join. SETM is simple, fast and stable over the range of parameter values. The major contribution of this paper is that it shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black-box algorithms. The set-oriented nature of SETM facilitates the development of extension
Market basket analysis of library circulation data
“Market Basket Analysis” algorithms have recently seen widespread use in analyzing consumer purchasing patterns-specifically, in detecting products that are frequently purchased together. We apply the Apriori market basket analysis tool to the task of detecting subject classification categories that co-occur in transaction records of book borrowed form a university library. This information can be useful in directing users to additional portions of the collection that may contain documents relevant to their information need, and in determining a library’s physical layout. These results can also provide insight into the degree of “scatter” that the classification scheme induces in a particular collection of documents
Attribute oriented induction with star schema
This paper will propose a novel star schema attribute induction as a new
attribute induction paradigm and as improving from current attribute oriented
induction. A novel star schema attribute induction will be examined with
current attribute oriented induction based on characteristic rule and using non
rule based concept hierarchy by implementing both of approaches. In novel star
schema attribute induction some improvements have been implemented like
elimination threshold number as maximum tuples control for generalization
result, there is no ANY as the most general concept, replacement the role
concept hierarchy with concept tree, simplification for the generalization
strategy steps and elimination attribute oriented induction algorithm. Novel
star schema attribute induction is more powerful than the current attribute
oriented induction since can produce small number final generalization tuples
and there is no ANY in the results.Comment: 23 Pages, IJDM
- …