64,835 research outputs found

    Data Mining in a Multidimensional Environment

    Data mining and data warehousing are two hot topics in database research. Until recently, conventional data mining algorithms were primarily developed for a relational environment, but a data warehouse database is based on a multidimensional model. In our paper we use this model as the basis for seamlessly integrating data mining into the multidimensional environment, taking the discovery of association rules as an example. Furthermore, we propose this method as a user-guided technique because of the clear structure of both model and data. We present both the theoretical basis and efficient algorithms for data mining in the multidimensional data model. Our approach directly uses the requirements of dimensions, classifications and sparsity of the cube. Additionally, we give heuristics for optimizing the search for rules.

    OLEMAR: An Online Environment for Mining Association Rules in Multidimensional Data

    Data warehouses and OLAP (online analytical processing) provide tools to explore and navigate through data cubes in order to extract interesting information under different perspectives and levels of granularity. Nevertheless, OLAP techniques do not allow the identification of relationships, groupings, or exceptions that could hold in a data cube. To that end, we propose to enrich OLAP techniques with data mining facilities to benefit from the capabilities they offer. In this chapter, we propose an online environment for mining association rules in data cubes. Our environment, called OLEMAR (online environment for mining association rules), is designed to extract associations from multidimensional data. It allows the extraction of inter-dimensional association rules from data cubes according to a sum-based aggregate measure, a more general indicator than the aggregate values provided by the traditional COUNT measure. In our approach, OLAP users are able to drive a mining process guided by a meta-rule that meets their analysis objectives. In addition, the environment is based on a formalization that exploits aggregate measures to revisit the definition of the support and the confidence of discovered rules. This formalization also helps evaluate the interestingness of association rules according to two additional quality measures: lift and Loevinger. Furthermore, in order to focus on the discovered associations and validate them, we provide a visual representation based on the principles of graphic semiology. Such a representation consists of a graphic encoding of frequent patterns and association rules in the same multidimensional space as the one associated with the mined data cube. We have developed our approach as a component of a general online analysis platform called Miningcubes, using an Apriori-like algorithm that helps extract inter-dimensional association rules directly from materialized multidimensional structures of data. In order to illustrate the effectiveness and the efficiency of our proposal, we analyze a real-life case study on breast cancer data and conduct performance experiments on the mining process.
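The sum-based support and confidence mentioned in this abstract can be illustrated with a small sketch. It is not the OLEMAR implementation: it assumes the textbook definitions of support, confidence, lift and Loevinger, with probabilities replaced by shares of a summed measure. The dimension names `age` and `stage`, the measure `patients`, and all figures are invented for illustration.

```python
from typing import Callable, Dict, List

Cell = Dict[str, object]          # one cube cell: dimension values plus a measure
Predicate = Callable[[Cell], bool]


def measure_sum(cells: List[Cell], predicate: Predicate, measure: str) -> float:
    """Sum the measure over all cells that satisfy the predicate."""
    return sum(float(c[measure]) for c in cells if predicate(c))


def rule_quality(cells: List[Cell], antecedent: Predicate,
                 consequent: Predicate, measure: str) -> Dict[str, float]:
    """Quality of the rule antecedent => consequent, using measure shares.

    Assumed definitions (standard ones, with counts replaced by sums):
      support    = sum(X and Y) / sum(all)
      confidence = sum(X and Y) / sum(X)
      lift       = confidence / share(Y)
      loevinger  = (confidence - share(Y)) / (1 - share(Y))
    """
    total = measure_sum(cells, lambda c: True, measure)
    s_x = measure_sum(cells, antecedent, measure)
    s_y = measure_sum(cells, consequent, measure)
    s_xy = measure_sum(cells, lambda c: antecedent(c) and consequent(c), measure)
    if total == 0 or s_x == 0 or s_y == 0 or s_y == total:
        raise ValueError("rule quality is undefined for this cube and rule")
    share_y = s_y / total
    confidence = s_xy / s_x
    return {
        "support": s_xy / total,
        "confidence": confidence,
        "lift": confidence / share_y,
        "loevinger": (confidence - share_y) / (1 - share_y),
    }


# Hypothetical two-dimensional cube with a summed measure "patients".
cube = [
    {"age": "young", "stage": "early", "patients": 30},
    {"age": "young", "stage": "late", "patients": 10},
    {"age": "old", "stage": "early", "patients": 20},
    {"age": "old", "stage": "late", "patients": 40},
]

# Inter-dimensional rule: age = old  =>  stage = late
quality = rule_quality(
    cube,
    antecedent=lambda c: c["age"] == "old",
    consequent=lambda c: c["stage"] == "late",
    measure="patients",
)
```

With a measure of 1 per fact, the same function reduces to the usual COUNT-based support and confidence, which is the sense in which a sum-based measure is the more general indicator.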

    On data integration workflows for an effective management of multidimensional petroleum digital ecosystems in Arabian Gulf Basins

    Integrating multiple heterogeneous datasets from multidimensional petroleum digital ecosystems is an effective way to extract information and add value to the knowledge domain across multiple producing onshore and offshore basins. At present, data from multiple basins are scattered and unusable for data integration because of differences in scale and format. Ontology-based warehousing and mining modeling are recommended for resolving the scaling and formatting issues of multidimensional datasets; here, seismic and well-domain datasets are described. Issues such as semantics among different data dimensions and their associated attributes are also addressed by ontology modeling. Intelligent relationships are built among several petroleum system domains (structure, reservoir, source and seal, for example) at a global scale, and they facilitate the integration process among multiple dimensions in a data warehouse environment. For this purpose, integrated workflows are designed for capturing and modeling unknown relationships among petroleum system data attributes in interpretable knowledge domains. This study is an effective approach to mining and interpreting data views drawn from warehoused exploration and production metadata, with special reference to Arabian onshore and offshore basins.

    Optimizing of the Balanced Scorecard method for management of mining companies with the use of factor analysis

    The managers of information age companies cannot rely merely on data derived from the company's past activities and focus only on improving existing processes. They need a framework for measuring values that result from the company's strategic goals: a tool that focuses on obtaining information about the company's current success as well as on finding new driving forces to ensure its future competitiveness. The Balanced Scorecard (BSC), a strategic business performance measurement system, is a suitable tool for improving the competitiveness of industrial companies. During its implementation, however, there is a conflict between the perceived importance of individual goals and measurable characteristics in the partial perspectives of the BSC and the actual enforcement of the various strategic objectives in companies. The aim of this article is to verify the accuracy of BSC settings in selected companies in the Moravian-Silesian Region (MSR), with emphasis on mining companies, using a multidimensional statistical method, factor analysis. The research took place in 2015 in cooperation with managers from the region and was divided into two kinds of research, quantitative and qualitative.

    Visualization Techniques For Malware Behavior Analysis

    Malware spread via the Internet is a major security threat, so studying its behavior is important for identifying and classifying it. Using SSDT hooking, we can obtain malware behavior by running a sample in a controlled environment and capturing its interactions with the target operating system regarding file, process, registry, network and mutex activities. This generates a chain of events that can be compared with those of other known malware. In this paper we present a simple approach to convert malware behavior into activity graphs and show some visualization techniques that can be used to analyze malware behavior, individually or grouped. © 2011 SPIE.
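To make the idea of turning a chain of events into an activity graph concrete, here is a minimal sketch. The abstract does not specify the graph model, so the choices below are assumptions: each node is a hypothetical (category, operation) pair, each edge links two consecutive events and is weighted by how often that transition occurs, and two samples are compared with Jaccard similarity over their edge sets, a stand-in comparison rather than the authors' method. The event traces are invented.

```python
from collections import Counter
from typing import List, Tuple

Event = Tuple[str, str]          # (activity category, operation), e.g. ("file", "write")
Edge = Tuple[Event, Event]       # transition between two consecutive events


def activity_graph(events: List[Event]) -> Counter:
    """Build a weighted directed graph as a Counter mapping edge -> occurrence count."""
    return Counter(zip(events, events[1:]))


def edge_jaccard(graph_1: Counter, graph_2: Counter) -> float:
    """Jaccard similarity of the two edge sets (edge weights are ignored)."""
    edges_1, edges_2 = set(graph_1), set(graph_2)
    if not edges_1 and not edges_2:
        return 1.0               # two empty graphs are treated as identical
    return len(edges_1 & edges_2) / len(edges_1 | edges_2)


# Invented traces for two hypothetical samples.
trace_a: List[Event] = [
    ("process", "create"),
    ("file", "write"),
    ("registry", "set"),
    ("file", "write"),
    ("registry", "set"),
    ("network", "connect"),
]
trace_b: List[Event] = [
    ("process", "create"),
    ("file", "write"),
    ("network", "connect"),
]

graph_a = activity_graph(trace_a)
graph_b = activity_graph(trace_b)
similarity = edge_jaccard(graph_a, graph_b)
```

The edge counts in `graph_a` could drive visual attributes such as line thickness when the graph is drawn, while `similarity` gives a rough score for grouping samples with similar behavior.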

    Does Firm Ownership Differentiate Environmental Compliance? Evidence from Indian Chromite Mining Industry

    This paper compares the environmental performance of public and private firms in the Indian chromite mining industry. It proposes a new methodology to measure firms' environmental performance in a multidimensional framework. Comparison of unidimensional and multidimensional environmental defiance indices reveals no significant difference between public and private firms.
    Keywords: Firm ownership, Multi Dimensional Environmental Compliance

    Perspectives in astrophysical databases

    Astrophysics has become a domain extremely rich in scientific data. Data mining tools are needed to extract information from such large datasets. This calls for an approach to data management that emphasizes efficient and simple data access: efficiency is obtained by using multidimensional access methods, and simplicity is achieved by handling metadata properly. Moreover, clustering and classification techniques on large datasets pose additional requirements in terms of computational and memory scalability and interpretability of results. In this study we review some possible solutions.