64 research outputs found

    A Three-phased Online Association Rule Mining Approach for Diverse Mining Requests

    In the past, most incremental and online mining algorithms focused on finding the set of association rules or patterns consistent with the entire set of data inserted so far. Users cannot easily obtain results for only the portion of the data they are interested in. To provide ad hoc, query-driven, online mining support, we first propose a relation called the multidimensional pattern relation, which stores the context information and the mining information in a structured, systematic way for later analysis. Each tuple in the relation comes from an inserted dataset in the database. This concept is similar to the construction of a data warehouse for OLAP. However, unlike the summarized information of fact attributes in a data warehouse, the mined patterns in the multidimensional pattern relation cannot be directly aggregated to satisfy users’ mining requests. We then develop an online mining approach called Three-phased Online Association Rule Mining (TOARM), based on the proposed multidimensional pattern relation, to support online generation of association rules under multidimensional considerations. Experiments on both homogeneous and heterogeneous datasets show the effectiveness of the proposed approach.

    Data mining with the SAP NetWeaver BI accelerator

    The new SAP NetWeaver Business Intelligence accelerator is an engine that supports online analytical processing. It performs aggregation in memory and at query runtime over large volumes of structured data. This paper first briefly describes the accelerator and its main architectural features, and cites test results that indicate its power. It then describes in detail how the accelerator may be used for data mining. The accelerator can perform data mining in the same large repositories of data, using the same compact index structures, that it uses for analytical processing. A first such implementation of data mining is described, and the results of a performance evaluation are presented. Association rule mining in a distributed architecture was implemented with a variant of the BUC iceberg cubing algorithm. Test results suggest that useful online mining should be possible with wait times of less than 60 seconds on business data that has not been preprocessed.

    Heterogeneous data source integration for smart grid ecosystems based on metadata mining

    The arrival of new technologies related to smart grids and the resulting ecosystem of applications and management systems pose many new problems. The databases of the traditional grid and the various initiatives related to new technologies have given rise to many different management systems with several formats and different architectures. A heterogeneous data source integration system is necessary to update these systems for the new smart grid reality. Additionally, it is necessary to take advantage of the information smart grids provide. In this paper, the authors propose a heterogeneous data source integration based on IEC standards and metadata mining. Additionally, an automatic data mining framework is applied to model the integrated information. Ministerio de Economía y Competitividad TEC2013-40767-

    BIG DATA MINING FOR INTERESTING PATTERNS WITH MAP REDUCE TECHNIQUE

    Many data mining algorithms search for interesting patterns in transactional databases of precise data. Frequent pattern mining is a technique for finding items that occur frequently in the data. Most of these techniques find all the interesting patterns in a collection of precise data, where the items occurring in each transaction are known with certainty to the system. In many real-time applications, however, users are interested in only a tiny portion of the large set of frequent patterns. The proposed user-constrained mining approach helps find the frequent patterns in which the user is interested. It does so efficiently by applying user constraints to collections of uncertain data. Users specify their interests in the form of constraints, and the approach uses the MapReduce model to find uncertain frequent patterns that satisfy the user-specified constraints.