929 research outputs found

    Corporate Smart Content Evaluation

    Get PDF
    Nowadays, a wide range of information sources are available due to the evolution of web and collection of data. Plenty of these information are consumable and usable by humans but not understandable and processable by machines. Some data may be directly accessible in web pages or via data feeds, but most of the meaningful existing data is hidden within deep web databases and enterprise information systems. Besides the inability to access a wide range of data, manual processing by humans is effortful, error-prone and not contemporary any more. Semantic web technologies deliver capabilities for machine-readable, exchangeable content and metadata for automatic processing of content. The enrichment of heterogeneous data with background knowledge described in ontologies induces re-usability and supports automatic processing of data. The establishment of “Corporate Smart Content” (CSC) - semantically enriched data with high information content with sufficient benefits in economic areas - is the main focus of this study. We describe three actual research areas in the field of CSC concerning scenarios and datasets applicable for corporate applications, algorithms and research. Aspect- oriented Ontology Development advances modular ontology development and partial reuse of existing ontological knowledge. Complex Entity Recognition enhances traditional entity recognition techniques to recognize clusters of related textual information about entities. Semantic Pattern Mining combines semantic web technologies with pattern learning to mine for complex models by attaching background knowledge. This study introduces the afore-mentioned topics by analyzing applicable scenarios with economic and industrial focus, as well as research emphasis. Furthermore, a collection of existing datasets for the given areas of interest is presented and evaluated. The target audience includes researchers and developers of CSC technologies - people interested in semantic web features, ontology development, automation, extracting and mining valuable information in corporate environments. The aim of this study is to provide a comprehensive and broad overview over the three topics, give assistance for decision making in interesting scenarios and choosing practical datasets for evaluating custom problem statements. Detailed descriptions about attributes and metadata of the datasets should serve as starting point for individual ideas and approaches

    User Interfaces to the Web of Data based on Natural Language Generation

    Get PDF
    We explore how Virtual Research Environments based on Semantic Web technologies support research interactions with RDF data in various stages of corpus-based analysis, analyze the Web of Data in terms of human readability, derive labels from variables in SPARQL queries, apply Natural Language Generation to improve user interfaces to the Web of Data by verbalizing SPARQL queries and RDF graphs, and present a method to automatically induce RDF graph verbalization templates via distant supervision

    Crowd-Based Mining of Reusable Process Model Patterns

    Get PDF
    Process mining is a domain where computers undoubtedly outperform humans. It is a mathematically complex and computationally demanding problem, and event logs are at too low a level of abstraction to be intelligible in large scale to humans. We demonstrate that if instead the data to mine from are models (not logs), datasets are small (in the order of dozens rather than thousands or millions), and the knowledge to be discovered is complex (reusable model patterns), humans outperform computers. We design, implement, run, and test a crowd-based pattern mining approach and demonstrate its viability compared to automated mining. We specifically mine mashup model patterns (we use them to provide interactive recommendations inside a mashup tool) and explain the analogies with mining business process models. The problem is relevant in that reusable model patterns encode valuable modeling and domain knowledge, such as best practices or organizational conventions, from which modelers can learn and benefit when designing own models. © 2014 Springer International Publishing Switzerland

    Leveraging Large Language Models (LLMs) for Process Mining (Technical Report)

    Full text link
    This technical report describes the intersection of process mining and large language models (LLMs), specifically focusing on the abstraction of traditional and object-centric process mining artifacts into textual format. We introduce and explore various prompting strategies: direct answering, where the large language model directly addresses user queries; multi-prompt answering, which allows the model to incrementally build on the knowledge obtained through a series of prompts; and the generation of database queries, facilitating the validation of hypotheses against the original event log. Our assessment considers two large language models, GPT-4 and Google's Bard, under various contextual scenarios across all prompting strategies. Results indicate that these models exhibit a robust understanding of key process mining abstractions, with notable proficiency in interpreting both declarative and procedural process models. In addition, we find that both models demonstrate strong performance in the object-centric setting, which could significantly propel the advancement of the object-centric process mining discipline. Additionally, these models display a noteworthy capacity to evaluate various concepts of fairness in process mining. This opens the door to more rapid and efficient assessments of the fairness of process mining event logs, which has significant implications for the field. The integration of these large language models into process mining applications may open new avenues for exploration, innovation, and insight generation in the field

    Process Mining Workshops

    Get PDF
    This open access book constitutes revised selected papers from the International Workshops held at the Third International Conference on Process Mining, ICPM 2021, which took place in Eindhoven, The Netherlands, during October 31–November 4, 2021. The conference focuses on the area of process mining research and practice, including theory, algorithmic challenges, and applications. The co-located workshops provided a forum for novel research ideas. The 28 papers included in this volume were carefully reviewed and selected from 65 submissions. They stem from the following workshops: 2nd International Workshop on Event Data and Behavioral Analytics (EDBA) 2nd International Workshop on Leveraging Machine Learning in Process Mining (ML4PM) 2nd International Workshop on Streaming Analytics for Process Mining (SA4PM) 6th International Workshop on Process Querying, Manipulation, and Intelligence (PQMI) 4th International Workshop on Process-Oriented Data Science for Healthcare (PODS4H) 2nd International Workshop on Trust, Privacy, and Security in Process Analytics (TPSA) One survey paper on the results of the XES 2.0 Workshop is included

    Process mining : conformance and extension

    Get PDF
    Today’s business processes are realized by a complex sequence of tasks that are performed throughout an organization, often involving people from different departments and multiple IT systems. For example, an insurance company has a process to handle insurance claims for their clients, and a hospital has processes to diagnose and treat patients. Because there are many activities performed by different people throughout the organization, there is a lack of transparency about how exactly these processes are executed. However, understanding the process reality (the "as is" process) is the first necessary step to save cost, increase quality, or ensure compliance. The field of process mining aims to assist in creating process transparency by automatically analyzing processes based on existing IT data. Most processes are supported by IT systems nowadays. For example, Enterprise Resource Planning (ERP) systems such as SAP log all transaction information, and Customer Relationship Management (CRM) systems are used to keep track of all interactions with customers. Process mining techniques use these low-level log data (so-called event logs) to automatically generate process maps that visualize the process reality from different perspectives. For example, it is possible to automatically create process models that describe the causal dependencies between activities in the process. So far, process mining research has mostly focused on the discovery aspect (i.e., the extraction of models from event logs). This dissertation broadens the field of process mining to include the aspect of conformance and extension. Conformance aims at the detection of deviations from documented procedures by comparing the real process (as recorded in the event log) with an existing model that describes the assumed or intended process. Conformance is relevant for two reasons: 1. Most organizations document their processes in some form. For example, process models are created manually to understand and improve the process, comply with regulations, or for certification purposes. In the presence of existing models, it is often more important to point out the deviations from these existing models than to discover completely new models. Discrepancies emerge because business processes change, or because the models did not accurately reflect the real process in the first place (due to the manual and subjective creation of these models). If the existing models do not correspond to the actual processes, then they have little value. 2. Automatically discovered process models typically do not completely "fit" the event logs from which they were created. These discrepancies are due to noise and/or limitations of the used discovery techniques. Furthermore, in the context of complex and diverse process environments the discovered models often need to be simplified to obtain useful insights. Therefore, it is crucial to be able to check how much a discovered process model actually represents the real process. Conformance techniques can be used to quantify the representativeness of a mined model before drawing further conclusions. They thus constitute an important quality measurement to effectively use process discovery techniques in a practical setting. Once one is confident in the quality of an existing or discovered model, extension aims at the enrichment of these models by the integration of additional characteristics such as time, cost, or resource utilization. By extracting aditional information from an event log and projecting it onto an existing model, bottlenecks can be highlighted and correlations with other process perspectives can be identified. Such an integrated view on the process is needed to understand root causes for potential problems and actually make process improvements. Furthermore, extension techniques can be used to create integrated simulation models from event logs that resemble the real process more closely than manually created simulation models. In Part II of this thesis, we provide a comprehensive framework for the conformance checking of process models. First, we identify the evaluation dimensions fitness, decision/generalization, and structure as the relevant conformance dimensions.We develop several Petri-net based approaches to measure conformance in these dimensions and describe five case studies in which we successfully applied these conformance checking techniques to real and artificial examples. Furthermore, we provide a detailed literature review of related conformance measurement approaches (Chapter 4). Then, we study existing model evaluation approaches from the field of data mining. We develop three data mining-inspired evaluation approaches for discovered process models, one based on Cross Validation (CV), one based on the Minimal Description Length (MDL) principle, and one using methods based on Hidden Markov Models (HMMs). We conclude that process model evaluation faces similar yet different challenges compared to traditional data mining. Additional challenges emerge from the sequential nature of the data and the higher-level process models, which include concurrent dynamic behavior (Chapter 5). Finally, we point out current shortcomings and identify general challenges for conformance checking techniques. These challenges relate to the applicability of the conformance metric, the metric quality, and the bridging of different process modeling languages. We develop a flexible, language-independent conformance checking approach that provides a starting point to effectively address these challenges (Chapter 6). In Part III, we develop a concrete extension approach, provide a general model for process extensions, and apply our approach for the creation of simulation models. First, we develop a Petri-net based decision mining approach that aims at the discovery of decision rules at process choice points based on data attributes in the event log. While we leverage classification techniques from the data mining domain to actually infer the rules, we identify the challenges that relate to the initial formulation of the learning problem from a process perspective. We develop a simple approach to partially overcome these challenges, and we apply it in a case study (Chapter 7). Then, we develop a general model for process extensions to create integrated models including process, data, time, and resource perspective.We develop a concrete representation based on Coloured Petri-nets (CPNs) to implement and deploy this model for simulation purposes (Chapter 8). Finally, we evaluate the quality of automatically discovered simulation models in two case studies and extend our approach to allow for operational decision making by incorporating the current process state as a non-empty starting point in the simulation (Chapter 9). Chapter 10 concludes this thesis with a detailed summary of the contributions and a list of limitations and future challenges. The work presented in this dissertation is supported and accompanied by concrete implementations, which have been integrated in the ProM and ProMimport frameworks. Appendix A provides a comprehensive overview about the functionality of the developed software. The results presented in this dissertation have been presented in more than twenty peer-reviewed scientific publications, including several high-quality journals
    • …
    corecore