72 research outputs found

    Integrating digital document acquisition into a university library : A case study of social and organizational challenges

    Get PDF
    In this article we report on the effort of the university library of the Vienna University of Economics and Business Administration to integrate a digital library component for research documents authored at the university into the existing library infrastructure. Setting up a digital library has become a relatively easy task using the current data base technology and the components and tools freely available. However, to integrate such a digital library into existing library systems and to adapt existing document acquisition work-flows in the organization are non-trivial tasks. We use a research frame work to identify the key players in this change process and to analyze their incentive structures. Then we describe the light-weight integration approach employed by our university and show how it provides incentives to the key players and at the same time requires only minimal adaptation of the organization in terms of changing existing work-flows. Our experience suggests that this light-weight integration offers a cost efficient and low risk intermediate step towards switching to exclusive digital document acquisition

    rEMM: Extensible Markov Model for Data Stream Clustering in R

    Get PDF
    Clustering streams of continuously arriving data has become an important application of data mining in recent years and efficient algorithms have been proposed by several researchers. However, clustering alone neglects the fact that data in a data stream is not only characterized by the proximity of data points which is used by clustering, but also by a temporal component. The extensible Markov model (EMM) adds the temporal component to data stream clustering by superimposing a dynamically adapting Markov chain. In this paper we introduce the implementation of the R extension package rEMM which implements EMM and we discuss some examples and applications.

    arules - A Computational Environment for Mining Association Rules and Frequent Item Sets

    Get PDF
    Mining frequent itemsets and association rules is a popular and well researched approach for discovering interesting relationships between variables in large databases. The R package arules presented in this paper provides a basic infrastructure for creating and manipulating input data sets and for analyzing the resulting itemsets and rules. The package also includes interfaces to two fast mining algorithms, the popular C implementations of Apriori and Eclat by Christian Borgelt. These algorithms can be used to mine frequent itemsets, maximal frequent itemsets, closed frequent itemsets and association rules.

    Visualizing association rules in hierarchical groups

    Get PDF
    Association rule mining is one of the most popular data mining methods. However, mining association rules often results in a very large number of found rules, leaving the analyst with the task to go through all the rules and discover interesting ones. Sifting manually through large sets of rules is time consuming and strenuous. Although visualization has a long history of making large amounts of data better accessible using techniques like selecting and zooming, most association rule visualization techniques are still falling short when it comes to large numbers of rules. In this paper we introduce a new interactive visualization method, the grouped matrix representation, which allows to intuitively explore and interpret highly complex scenarios. We demonstrate how the method can be used to analyze large sets of association rules using the R software for statistical computing, and provide examples from the implementation in the R-package arulesViz. (authors' abstract

    rEMM: Extensible Markov Model for Data Stream Clustering in R

    Get PDF
    Clustering streams of continuously arriving data has become an important application of data mining in recent years and efficient algorithms have been proposed by several researchers. However, clustering alone neglects the fact that data in a data stream is not only characterized by the proximity of data points which is used by clustering, but also by a temporal component. The extensible Markov model (EMM) adds the temporal component to data stream clustering by superimposing a dynamically adapting Markov chain. In this paper we introduce the implementation of the <b>R</b> extension package <b>rEMM</b> which implements EMM and we discuss some examples and applications
    • …
    corecore