72 research outputs found

Integrating digital document acquisition into a university library : A case study of social and organizational challenges

Author: Michael Hahsler
Publication date: 01/01/2003

In this article we report on the effort of the university library of the Vienna University of Economics and Business Administration to integrate a digital library component for research documents authored at the university into the existing library infrastructure. Setting up a digital library has become a relatively easy task using current database technology and freely available components and tools. However, integrating such a digital library into existing library systems and adapting existing document acquisition workflows in the organization are non-trivial tasks. We use a research framework to identify the key players in this change process and to analyze their incentive structures. We then describe the lightweight integration approach employed by our university and show how it provides incentives to the key players while requiring only minimal changes to existing workflows. Our experience suggests that this lightweight integration offers a cost-efficient, low-risk intermediate step towards switching to exclusive digital document acquisition.

E-LIS

rEMM: Extensible Markov Model for Data Stream Clustering in R

Authors: Margaret H. Dunham, Michael Hahsler

Clustering streams of continuously arriving data has become an important application of data mining in recent years, and efficient algorithms have been proposed by several researchers. However, clustering alone neglects the fact that data in a data stream is characterized not only by the proximity of data points, which clustering uses, but also by a temporal component. The extensible Markov model (EMM) adds the temporal component to data stream clustering by superimposing a dynamically adapting Markov chain. In this paper we introduce the R extension package rEMM, which implements EMM, and we discuss some examples and applications.

Research Papers in Economics

arules - A Computational Environment for Mining Association Rules and Frequent Item Sets

Authors: Bettina Grün, Kurt Hornik, Michael Hahsler

Mining frequent itemsets and association rules is a popular and well-researched approach for discovering interesting relationships between variables in large databases. The R package arules presented in this paper provides a basic infrastructure for creating and manipulating input data sets and for analyzing the resulting itemsets and rules. The package also includes interfaces to two fast mining algorithms, the popular C implementations of Apriori and Eclat by Christian Borgelt. These algorithms can be used to mine frequent itemsets, maximal frequent itemsets, closed frequent itemsets, and association rules.

Research Papers in Economics

Visualizing association rules in hierarchical groups

Authors: Michael Hahsler, Radoslaw Karpienko
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2016

Association rule mining is one of the most popular data mining methods. However, mining association rules often produces a very large number of rules, leaving the analyst with the task of going through all of them to discover the interesting ones. Sifting manually through large sets of rules is time-consuming and strenuous. Although visualization has a long history of making large amounts of data more accessible using techniques like selecting and zooming, most association rule visualization techniques still fall short when it comes to large numbers of rules. In this paper we introduce a new interactive visualization method, the grouped matrix representation, which allows analysts to intuitively explore and interpret highly complex scenarios. We demonstrate how the method can be used to analyze large sets of association rules using the R software for statistical computing, and provide examples from the implementation in the R package arulesViz. (authors' abstract)

Springer - Publisher Connector

Elektronische Publikationen der Wirtschaftsuniversität Wien

rEMM: Extensible Markov Model for Data Stream Clustering in R

Authors: Margaret H. Dunham, Michael Hahsler
Publication venue: Foundation for Open Access Statistics
Publication date: 16/07/2010

Clustering streams of continuously arriving data has become an important application of data mining in recent years, and efficient algorithms have been proposed by several researchers. However, clustering alone neglects the fact that data in a data stream is characterized not only by the proximity of data points, which clustering uses, but also by a temporal component. The extensible Markov model (EMM) adds the temporal component to data stream clustering by superimposing a dynamically adapting Markov chain. In this paper we introduce the R extension package rEMM, which implements EMM, and we discuss some examples and applications.

CiteSeerX

Directory of Open Access Journals

Journal of Statistical Software