3,595 research outputs found

    Mining Event Logs to Support Workflow Resource Allocation

    Workflow technology is widely used to facilitate business processes in enterprise information systems (EIS), and it has the potential to reduce design time, enhance product quality and decrease product cost. However, significant limitations still exist: although resource allocation is an important task in workflow management, many resource allocation operations are still performed manually, which is time-consuming. This paper presents a data mining approach to address the resource allocation problem (RAP) and improve the productivity of workflow resource management. Specifically, an Apriori-like algorithm is used to find frequent patterns in the event log, and association rules are generated according to predefined resource allocation constraints. Subsequently, a correlation measure named lift is used to annotate the negatively correlated resource allocation rules for resource reservation. Finally, the rules are ranked by their confidence measures and used as resource allocation rules. Comparative experiments are performed using C4.5, SVM, ID3, Naïve Bayes and the presented approach, and the results show that the presented approach is effective in both accuracy and candidate resource recommendations. Comment: T. Liu et al., Mining event logs to support workflow resource allocation, Knowl. Based Syst. (2012), http://dx.doi.org/10.1016/j.knosys.2012.05.01
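
    The confidence and lift measures named in this abstract can be illustrated with a minimal Python sketch. It is not the paper's Apriori-like algorithm: it only counts single (activity, resource) pairs in a toy event log, and the log contents, the `allocation_rules` function and its field names are all hypothetical. Confidence is taken as P(resource | activity) and lift as confidence divided by P(resource), so a lift below 1 marks a negatively correlated rule.

```python
from collections import Counter

# Hypothetical toy event log: one (activity, resource) pair per executed task.
LOG = [
    ("review", "alice"), ("review", "alice"), ("review", "bob"),
    ("approve", "bob"), ("approve", "bob"), ("approve", "alice"),
    ("archive", "carol"), ("archive", "carol"),
]


def allocation_rules(event_log):
    """Build 'activity -> resource' rules with confidence and lift.

    Simplifying assumption: each rule has exactly one activity as antecedent
    and one resource as consequent, so plain counting is enough.
    """
    n = len(event_log)
    pair_counts = Counter(event_log)
    activity_counts = Counter(activity for activity, _ in event_log)
    resource_counts = Counter(resource for _, resource in event_log)

    rules = []
    for (activity, resource), count in pair_counts.items():
        confidence = count / activity_counts[activity]   # P(resource | activity)
        lift = confidence / (resource_counts[resource] / n)
        rules.append({
            "activity": activity,
            "resource": resource,
            "confidence": confidence,
            "lift": lift,
            "negatively_correlated": lift < 1.0,
        })
    # Rank candidate resources by confidence, highest first.
    rules.sort(key=lambda rule: rule["confidence"], reverse=True)
    return rules


rules = allocation_rules(LOG)
for rule in rules:
    print(rule)
```

    In this toy log, "bob" performs "review" less often than his overall share of the work would suggest, so the rule review -> bob gets a lift below 1 and is flagged as negatively correlated.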

    Graph BI & analytics: current state and future challenges

    In an increasingly competitive market, making well-informed decisions requires the analysis of a wide range of heterogeneous, large and complex data. This paper focuses on the emerging field of graph warehousing. Graphs are widespread structures that offer great expressive power. They are used for modeling highly complex and interconnected domains, and for efficiently handling emerging big data applications. This paper presents the current status and open challenges of graph BI and analytics, and motivates the need for new warehousing frameworks aware of the topological nature of graphs. We survey the topics of graph modeling, management, processing and analysis in graph warehouses. Then we conclude by discussing future research directions and positioning them within a unified architecture of a graph BI and analytics framework. Peer Reviewed. Postprint (author's final draft)

    Diamond Dicing

    In OLAP, analysts often select an interesting sample of the data. For example, an analyst might focus on products bringing revenues of at least 100 000 dollars, or on shops having sales greater than 400 000 dollars. However, current systems do not allow the application of both of these thresholds simultaneously, selecting products and shops satisfying both thresholds. For such purposes, we introduce the diamond cube operator, filling a gap among existing data warehouse operations. Because of the interaction between dimensions, the computation of diamond cubes is challenging. We compare and test various algorithms on large data sets of more than 100 million facts. We find that while it is possible to implement diamonds in SQL, it is inefficient. Indeed, our custom implementation can be a hundred times faster than popular database engines (including a row-store and a column-store). Comment: 29 pages
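
    The interaction between dimensions mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration of the idea under stated assumptions, not the authors' algorithm or their custom implementation: it assumes a two-dimensional cube of (product, shop) facts, uses "at least" for both thresholds, and simply repeats the pruning until nothing changes. The `diamond` function, the fact table and the thresholds are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact table: (product, shop) -> sales amount.
FACTS = {
    ("p1", "s1"): 400, ("p1", "s2"): 200,
    ("p2", "s1"): 60,  ("p2", "s3"): 50,
    ("p3", "s2"): 60,  ("p3", "s3"): 30,
}


def diamond(facts, min_product_total, min_shop_total):
    """Keep only facts whose product AND shop both meet their thresholds.

    Totals are recomputed over the surviving facts on every pass, because
    dropping a shop lowers product totals and dropping a product lowers
    shop totals. The loop stops once a pass removes nothing.
    """
    facts = dict(facts)
    while True:
        product_totals = defaultdict(int)
        shop_totals = defaultdict(int)
        for (product, shop), amount in facts.items():
            product_totals[product] += amount
            shop_totals[shop] += amount

        kept = {
            (product, shop): amount
            for (product, shop), amount in facts.items()
            if product_totals[product] >= min_product_total
            and shop_totals[shop] >= min_shop_total
        }
        if len(kept) == len(facts):
            return kept
        facts = kept


result = diamond(FACTS, min_product_total=100, min_shop_total=400)
print(result)
```

    In this toy data, product "p2" passes a one-shot filter (total 110), but once shop "s3" is dropped its remaining total falls to 60, so it is removed on the next pass. That cascade is why a single pass over each dimension is not enough.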

    Data Mining

    Data geo-Science Approach for Modelling Unconventional Petroleum Ecosystems and their Visual Analytics

    Storage, integration and interoperability are critical challenges in unconventional exploration data management. In a quest to explore unconventional hydrocarbons, in particular shale gas from fractured shales, we investigate new petroleum data geoscience approaches. Data geo-science describes the integration of geoscience-domain expertise with mathematical concepts, computing algorithms and machine learning tools, including data and business analytics. Further, to strengthen data-science services among producing companies, we propose an integrated multidimensional repository system, for which factual instances are acquired on gas shales, to store, process and deliver fractured-data views in new knowledge domains. Data dimensions are categorized to examine their suitability in the integrated prototype articulations that use fracture-network and attribute dimension model descriptions. The factual instances typically come from seismic attributes, seismically interpreted geological structures and reservoirs, and well logs, including production data entities. For designing and developing multidimensional repository systems, we create various artefacts describing conceptual, logical and physical models. To explore the connectivity between seismic and geology entities, multidimensional ontology models are construed using fracture network attribute dimensions and their instances. Data warehousing and mining support the management of ontologies that bring together the data instances of fractured shales, to unify and explore the associativity between high-density fractured shales and their orientations. The models depicting the collaboration of geology, geophysics, reservoir engineering and geo-mechanics entities and their dimensions can substantially reduce the risk and uncertainty involved in modelling and interpreting shale- and tight-gas reservoirs, including traps associated with Coal Bed Methane (CBM). Anisotropy, Poisson's ratio and Young's modulus properties corroborate the interpretation of stress images from the 3D acoustic characterization of shale reservoirs. Statistical analysis of data views, their correlations and patterns further helps us to visualize and interpret geoscientific metadata meticulously. The data geo-science guided integrated methodology can be applied in any basin, including frontier basins.