As business processes become ever more complex, companies need to understand the processes they already have in place. Undertaking this manually would be time consuming. The practice of process mining attempts to automatically construct a correct representation of a process from a set of process execution logs. The aim of this research is to develop a genetic programming (GP) based approach for business process mining. The focus is on automated and semi-automated business processes within the service industry (semi-automated meaning that part of the process is manual and likely to be paper based). This is the first time a GP approach has been used in the practice of process mining. The graph based representation and fitness parsing used are also unique to the GP approach. A literature review and an industry survey have been undertaken as part of this research to establish the state of the art in the research and practice of business process modelling and mining. It is observed that process execution logs exist in most service sector companies but are not utilised for process mining. The development of a new GP approach is documented, along with a set of modifications required to enable accuracy in the mining of complex process constructs, semantics and noisy process execution logs. In the context of process mining, accuracy refers to the ability of the mined model to reflect the contents of the event log on which it is based, neither over-describing them (including features that are not recorded in the log) nor under-describing them (including only the most common features and leaving out low-frequency task edges). The complexity of processes, in terms of this thesis, involves the mining of parallel constructs, processes containing complex semantic constructs (AND/XOR split and join points) and processes containing 20 or more tasks.
The level of noise handled by the business process mining approach includes event logs in which a small number of randomly selected tasks are missing from a third of their structure. A novel graph representation for use with GP in the mining of business processes is presented, along with a new way of parsing graph based individuals against process execution logs. The GP process mining approach has been validated with a range of tests drawn from the literature and two case studies, provided by the industrial sponsor, utilising live process data. These tests and case studies provide a range of process constructs to fully test and stretch the GP process mining approach. An outlook is given on the future development of the GP process mining approach and of process mining as a practice.
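
The notions of noise and accuracy described above can be illustrated with a minimal Python sketch. It is not the fitness parsing or graph representation developed in this research. It assumes that an event log is a list of traces (each a list of task names), that a mined model can be summarised as a set of task edges, and that "a third of their structure" means roughly a third of the traces. The function names `directly_follows`, `add_noise` and `describe`, and the parameters `fraction`, `max_missing` and `seed`, are hypothetical and chosen only for illustration.

```python
import random


def directly_follows(log):
    """Return the set of (a, b) task pairs where b directly follows a in some trace."""
    edges = set()
    for trace in log:
        edges.update(zip(trace, trace[1:]))
    return edges


def add_noise(log, fraction=1 / 3, max_missing=2, seed=0):
    """Return a copy of the log with tasks removed from some traces.

    Assumption: noise means deleting up to `max_missing` randomly chosen
    tasks from roughly `fraction` of the traces. At least one task is
    always left in each affected trace.
    """
    rng = random.Random(seed)
    noisy = [list(trace) for trace in log]
    n_noisy = round(len(noisy) * fraction)
    for i in rng.sample(range(len(noisy)), n_noisy):
        for _ in range(min(max_missing, len(noisy[i]) - 1)):
            del noisy[i][rng.randrange(len(noisy[i]))]
    return noisy


def describe(model_edges, log):
    """Compare a model's edges with the edges observed in the log."""
    log_edges = directly_follows(log)
    return {
        "over": model_edges - log_edges,   # in the model, never recorded in the log
        "under": log_edges - model_edges,  # recorded in the log, missing from the model
    }
```

Under these assumptions, a model is accurate with respect to a log when both the `over` and `under` sets returned by `describe` are empty, and `add_noise` can be used to produce a degraded log against which a mining approach is tested.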