    Agent Miner: An Algorithm for Discovering Agent Systems from Event Data

    Full text link
    Process discovery studies ways to use event data generated by business processes and recorded by IT systems to construct models that describe the processes. Existing discovery algorithms are predominantly concerned with constructing process models that represent the control flow of the processes. Agent system mining argues that business processes often emerge from interactions of autonomous agents and uses event data to construct models of the agents and their interactions. This paper presents and evaluates Agent Miner, an algorithm for discovering, from event data, models of agents and their interactions that together compose the system which executed the processes that generated the input data. An evaluation using our open-source implementation of Agent Miner and publicly available industrial datasets confirms that our algorithm can provide insights into the process participants and their interaction patterns, and often discovers models that describe the business processes more faithfully than process models discovered using conventional process discovery algorithms.
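
    A minimal sketch of one building block that agent-style mining can rest on: deriving candidate agents and their interaction edges from an event log by counting handovers of work between the resources recorded on consecutive events of the same case. This is a generic process-mining construction, not the Agent Miner algorithm itself, and the column names are assumptions.

        from collections import Counter

        import pandas as pd

        def handover_graph(log: pd.DataFrame) -> Counter:
            """Count directed handovers (agent_a -> agent_b) across all cases."""
            edges = Counter()
            for _, case in log.sort_values("timestamp").groupby("case_id"):
                resources = case["resource"].tolist()
                for a, b in zip(resources, resources[1:]):
                    if a != b:  # keep only cross-agent handovers
                        edges[(a, b)] += 1
            return edges

        log = pd.DataFrame({
            "case_id": [1, 1, 1, 2, 2],
            "resource": ["clerk", "manager", "clerk", "clerk", "manager"],
            "timestamp": pd.to_datetime(
                ["2024-01-01 09:00", "2024-01-01 10:00", "2024-01-01 11:00",
                 "2024-01-02 09:00", "2024-01-02 09:30"]),
        })
        print(handover_graph(log))  # {('clerk', 'manager'): 2, ('manager', 'clerk'): 1}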

    Capacity Allocation for Clouds with Parallel Processing, Batch Arrivals, and Heterogeneous Service Requirements

    Full text link
    Problem Definition: Allocating sufficient capacity to cloud services is a challenging task, especially when demand is time-varying, heterogeneous, contains batches, and requires multiple types of resources for processing. In this setting, providers decide whether to reserve portions of their capacity for individual job classes or to offer it in a flexible manner. Methodology/results: In collaboration with Huawei Cloud, a worldwide provider of cloud services, we propose a heuristic policy that allocates multiple types of resources to jobs and also satisfies their pre-specified service level agreements (SLAs). We model the system as a multi-class queueing network with parallel processing and multiple types of resources, where arrivals (i.e., virtual machines and containers) follow time-varying patterns and require at least one unit of each resource for processing. While virtual machines leave if they are not served immediately, containers can join a queue. We introduce a diffusion approximation of the offered load of such a system and investigate its fidelity against the observed data. We then develop a heuristic approach that leverages this approximation to determine capacity levels that satisfy probabilistic SLAs in the system with fully flexible servers. Managerial Implications: Using a data set of cloud computing requests over a representative 8-day period from Huawei Cloud, we show that our heuristic policy results in a 20% capacity reduction and better service quality compared to a benchmark that reserves resources. In addition, we show that the system utilization induced by our policy is superior to the benchmark, i.e., it implies less idling of resources in most instances. Thus, our approach enables cloud operators to both reduce costs and achieve better performance.
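
    The offered-load-plus-safety-capacity idea can be illustrated with the classic square-root staffing rule, c(t) = R(t) + beta * sqrt(R(t)), where beta comes from the normal quantile of the SLA target. This is a textbook heuristic in the spirit of the approach, not the paper's actual policy, and the load values below are made up.

        import math
        from statistics import NormalDist

        def staffing_levels(offered_load, sla_prob):
            """Capacity per period so that demand exceeds capacity with
            probability of roughly 1 - sla_prob (normal approximation)."""
            beta = NormalDist().inv_cdf(sla_prob)
            return [math.ceil(r + beta * math.sqrt(r)) for r in offered_load]

        hourly_load = [120.0, 340.5, 510.2, 480.0, 200.3]  # illustrative R(t)
        print(staffing_levels(hourly_load, sla_prob=0.95))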

    Towards Higher Maturity for Machine Learning: A Conceptual Modelling Approach

    No full text
    Integrating machine learning (ML) applications into business settings presents challenges for many organizations despite rapid advances in ML technologies, and systematic guidance for integrating ML into business applications is lacking. Conceptual modelling techniques have been used widely to analyze and enhance information systems. This paper outlines how conceptual modelling can help organizations design their processes to fit their different needs when integrating ML into business applications. We identify the characteristics and challenges of ML and demonstrate how conceptual modelling can be applied to enhance ML integration processes and assist organizations in meeting these challenges.

    From knowledge-driven to data-driven inter-case feature encoding in predictive process monitoring

    No full text
    Predictive process monitoring (PPM) is a research area that focuses on predicting measures of interest (e.g., the completion time) for running cases based on event logs. State-of-the-art PPM techniques only consider intra-case information that comes from the case whose measures of interest one wishes to predict. However, in many systems, the outcome of a running case depends on the interplay of all cases that are being executed concurrently, or can be derived from the characteristics of cases that are executed in the same period of time. For example, in many situations, running cases compete over scarce resources, and the completion time of a running case can be derived from the number of similar cases running concurrently. In this work, we present a general framework for feature encoding that relies on a bi-dimensional state space representation. The first dimension corresponds to intra-case dependencies and utilizes existing feature encoding techniques. The second dimension encodes inter-case features using two approaches: (1) a knowledge-driven encoding (KDE), which assumes prior knowledge on case types, and (2) a data-driven encoding (DDE), which automatically identifies case types from data using case proximity metrics. Both approaches partition the event log into sets of cases that share common characteristics, and derive features according to these commonalities. We demonstrate the usefulness of the proposed framework with an empirical evaluation carried out against two real-life datasets coming from an outpatient hospital process and a manufacturing process.
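
    As an illustration of the second (inter-case) dimension, the sketch below computes one such feature: for every case, the number of concurrently open cases of the same type, the kind of "concurrent load" signal described above. The column names and the toy data are assumptions; a real encoding would combine many such features.

        import pandas as pd

        cases = pd.DataFrame({
            "case_id": [1, 2, 3, 4],
            "case_type": ["urgent", "urgent", "regular", "urgent"],
            "start": pd.to_datetime(["09:00", "09:10", "09:20", "09:30"]),
            "end": pd.to_datetime(["09:40", "09:50", "09:35", "10:00"]),
        })

        def concurrent_same_type(row, frame):
            overlap = (frame["start"] < row["end"]) & (frame["end"] > row["start"])
            same = frame["case_type"] == row["case_type"]
            return int((overlap & same).sum()) - 1  # exclude the case itself

        cases["n_concurrent"] = cases.apply(concurrent_same_type, axis=1, frame=cases)
        print(cases[["case_id", "case_type", "n_concurrent"]])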

    Queue Mining – Predicting Delays in Service Processes

    No full text
    Information systems have been widely adopted to support service processes in various domains, e.g., in the telecommunication, finance, and health sectors. Recently, work on process mining showed how management of these processes, and engineering of supporting systems, can be guided by models extracted from the event logs that are recorded during process operation. In this work, we establish a queueing perspective in operational process mining. We propose to consider queues as first-class citizens and use queueing theory as a basis for queue mining techniques. To demonstrate the value of queue mining, we revisit the specific operational problem of online delay prediction: using event data, we show that queue mining yields accurate online predictions of case delay.
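
    One snapshot-style predictor from the queue-mining literature, last-to-enter-service (LES), predicts the delay of a newly arriving case as the waiting time of the most recent case that already started service. A minimal sketch, with a made-up data layout:

        def les_prediction(history, now):
            """history: list of (arrival_time, service_start_time) pairs;
            service_start_time is None for cases still waiting."""
            started = [(a, s) for a, s in history if s is not None and s <= now]
            if not started:
                return None
            a, s = max(started, key=lambda pair: pair[1])  # latest service start
            return s - a  # the delay that case experienced

        history = [(0.0, 4.0), (1.5, 6.0), (3.0, None)]  # third case still waiting
        print(les_prediction(history, now=7.0))  # -> 4.5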

    Learning Scheduling Models from Event Data

    No full text
    A significant challenge in declarative approaches to scheduling is the creation of a model: the set of resources and their capacities and the types of activities and their temporal and resource requirements. In practice, such models are developed manually by skilled consultants and used repeatedly to solve different problem instances. For example, in a factory, the model may be used each day to schedule the current customer orders. In this work, we aim to automate the creation of such models by learning them from event data. We introduce a novel methodology that combines process mining, timed Petri nets (TPNs), and constraint programming (CP). The approach learns a sub-class of TPN from event logs of executions of past schedules and maps the TPN to a broad class of scheduling problems. We show how any problem of the scheduling class can be converted to a CP model. With new instance data (e.g., the day’s orders), the CP model can then be solved by an off-the-shelf solver. Our approach provides an end-to-end solution, going from event logs to model-based optimal schedules. To demonstrate the value of the methodology we conduct experiments in which we learn and solve scheduling models from two types of data: logs generated from job-shop scheduling benchmarks and real-world event logs from an outpatient hospital.
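
    To illustrate the kind of constraint program such learned models are mapped to, here is a minimal job-shop CP model in OR-Tools CP-SAT. This is the standard textbook formulation, not the paper's TPN-to-CP mapping, and the two-job instance is made up.

        from ortools.sat.python import cp_model

        jobs = [[(0, 3), (1, 2)],  # job 0: machine 0 for 3, then machine 1 for 2
                [(1, 4), (0, 1)]]  # job 1: machine 1 for 4, then machine 0 for 1
        horizon = sum(d for job in jobs for _, d in job)

        model = cp_model.CpModel()
        machine_tasks, job_ends = {}, []
        for j, job in enumerate(jobs):
            prev_end = None
            for t, (machine, dur) in enumerate(job):
                start = model.NewIntVar(0, horizon, f"s_{j}_{t}")
                end = model.NewIntVar(0, horizon, f"e_{j}_{t}")
                interval = model.NewIntervalVar(start, dur, end, f"i_{j}_{t}")
                machine_tasks.setdefault(machine, []).append(interval)
                if prev_end is not None:  # precedence inside a job
                    model.Add(start >= prev_end)
                prev_end = end
            job_ends.append(prev_end)

        for intervals in machine_tasks.values():  # one task per machine at a time
            model.AddNoOverlap(intervals)

        makespan = model.NewIntVar(0, horizon, "makespan")
        model.AddMaxEquality(makespan, job_ends)
        model.Minimize(makespan)

        solver = cp_model.CpSolver()
        if solver.Solve(model) == cp_model.OPTIMAL:
            print("optimal makespan:", solver.Value(makespan))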

    Exploiting Hardware and Software Advances for Quadratic Models of Wind Farm Layout Optimization

    No full text
    A key aspect of the design of a wind farm is the wind farm layout optimization (WFLO) problem: given a wind farm site and information about the wind patterns, the problem is to decide the location of individual wind turbines to maximize energy production subject to proximity restrictions and wake-based interference between turbines. Given the pairwise wake interactions, it is natural to model the energy objective as a quadratic function, as has indeed been done in some existing optimization approaches. However, state-of-the-art solutions often trade off speed in producing designs against quality in terms of finding and proving optimal solutions. In this work, we aim to find a balanced approach that obtains WFLO solutions quickly for interactive design and solves the problem to optimality when quality is more important. To this end, we exploit recent advances in optimization hardware and software that target quadratic constraints: commercial mixed integer linear solvers have been extended to address some quadratic problems, and nascent specialized hardware, including quantum computing systems, has focused on solving quadratic unconstrained binary optimization (QUBO) problems. We introduce two novel quadratic programming models for WFLO: a quadratic constrained optimization problem (QCOP) with binary decision variables and a QUBO. A thorough numerical evaluation using a commercial solver and specialized QUBO hardware shows that our quadratic framework achieves fast, high-quality solutions that improve the state of the art and strike a balance between speed and quality. In particular, the QUBO model delivers high-quality solutions in a few seconds, while the QCOP model can be used to find better solutions and provide quality guarantees over a longer run-time.
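
    The shape of such a QUBO can be sketched as follows: binary x_i marks whether a turbine is placed at candidate site i, diagonal entries of Q reward each site's standalone yield, off-diagonal entries penalize pairwise wake losses, and a large penalty term encodes the proximity restriction. A toy illustration with made-up numbers, solved by brute force (not the paper's exact model):

        from itertools import product

        energy = [10.0, 9.0, 8.0]          # standalone yield per candidate site
        wake = {(0, 1): 3.0, (1, 2): 2.5}  # pairwise interference losses
        too_close = {(0, 2)}               # proximity-violating site pairs
        PENALTY = 100.0

        n = len(energy)
        Q = [[0.0] * n for _ in range(n)]
        for i in range(n):
            Q[i][i] = -energy[i]           # minimization => negate the reward
        for (i, j), loss in wake.items():
            Q[i][j] += loss
        for i, j in too_close:
            Q[i][j] += PENALTY             # effectively forbids both placements

        best = min(product((0, 1), repeat=n),
                   key=lambda x: sum(Q[i][j] * x[i] * x[j]
                                     for i in range(n) for j in range(n)))
        print("best layout:", best)        # -> (1, 1, 0)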

    Interval-based queries over lossy IoT event streams

    Full text link
    Recognising patterns that correlate multiple events over time becomes increasingly important in applications that exploit the Internet of Things, reaching from urban transportation, through surveillance monitoring, to business workflows. In many real-world scenarios, however, timestamps of events may be erroneously recorded, and events may be dropped from a stream due to network failures or load shedding policies. In this work, we present SimpMatch, a novel simplex-based algorithm for probabilistic evaluation of event queries using constraints over event orderings in a stream. Our approach avoids learning probability distributions for time-points or occurrence intervals. Instead, we employ the abstraction of segmented intervals and compute the probability of a sequence of such segments using the notion of order statistics. The algorithm runs in time linear in the number of lost events and shows high accuracy, yielding exact results if event generation is based on a Poisson process and a good approximation otherwise. We demonstrate empirically that SimpMatch enables efficient and effective reasoning over event streams, outperforming state-of-the-art methods for probabilistic evaluation of event queries by up to two orders of magnitude.
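
    The order-statistics idea can be sketched as follows: a Poisson process conditioned on its event count yields i.i.d. uniform event times, so the j-th of n lost events (in time order) follows a Beta(j, n - j + 1) law on the normalized window, and the probability that it fell inside a given segment is a difference of Beta CDFs. This illustrates the principle only, not the SimpMatch algorithm itself.

        from scipy.stats import beta

        def prob_jth_in_segment(j, n, seg_start, seg_end, win_start, win_end):
            """P(the j-th of n uniform event times lies in [seg_start, seg_end])."""
            width = win_end - win_start
            lo = (seg_start - win_start) / width
            hi = (seg_end - win_start) / width
            dist = beta(j, n - j + 1)
            return dist.cdf(hi) - dist.cdf(lo)

        # Probability that the 2nd of 3 lost events fell in the middle third
        # of a 30-second gap in the stream:
        print(prob_jth_in_segment(2, 3, seg_start=10, seg_end=20,
                                  win_start=0, win_end=30))  # ~0.481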