4,951 research outputs found

    Knowledge-Intensive Processes: Characteristics, Requirements and Analysis of Contemporary Approaches

    Engineering of knowledge-intensive processes (KiPs) is far from being mastered, since such processes are genuinely knowledge- and data-centric and require substantial flexibility at both design time and run time. In this work, starting from an analysis of the scientific literature on KiPs and from three real-world domains and application scenarios, we provide a precise characterization of KiPs. Furthermore, we devise some general requirements related to KiPs management and execution. These requirements contribute to the definition of an evaluation framework to assess current system support for KiPs. To this end, we present a critical analysis of a number of existing process-oriented approaches, discussing their efficacy against the requirements.

    Discovering Business Processes in CRM Systems by leveraging unstructured text data

    Recent research has proven the feasibility of using Process Mining algorithms to discover business processes from event logs of structured data. However, many IT systems also store a considerable amount of unstructured data. Customer Relationship Management (CRM) systems typically store information about interactions with customers, such as emails, phone calls, and meetings. These activities are characteristically made up of unstructured data, such as a free-text subject and description of the interaction, with only limited structured data available to classify them. This poses a problem for the traditional Process Mining approach, which relies on an event log made up of clearly categorised activities. This paper proposes an original framework to mine processes from CRM data by leveraging the unstructured part of the data. The method uses Latent Dirichlet Allocation (LDA), an unsupervised machine learning technique, to automatically detect and assign labels to activities, and does not require any human intervention. A case study with real-world CRM data validates the feasibility of our approach.
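
    The labelling step described in this abstract can be approximated with off-the-shelf tooling. The sketch below is not the authors' implementation: it uses scikit-learn's CountVectorizer and LatentDirichletAllocation to assign a dominant topic to each free-text CRM interaction, turning it into a categorical activity label for a process-mining event log. The column names (case_id, timestamp, text) and the number of topics are illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): label free-text CRM interactions
# with their dominant LDA topic so they can serve as activities in an event log.
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical CRM export: one row per customer interaction.
crm = pd.DataFrame({
    "case_id":   ["C1", "C1", "C2", "C2"],
    "timestamp": ["2023-01-02", "2023-01-05", "2023-01-03", "2023-01-09"],
    "text": [
        "Intro call to discuss pricing options",
        "Sent follow-up email with contract draft",
        "Support ticket about login issue",
        "Phone call to close the renewal deal",
    ],
})

# Bag-of-words representation of the unstructured descriptions.
vectorizer = CountVectorizer(stop_words="english")
dtm = vectorizer.fit_transform(crm["text"])

# Fit LDA with an assumed number of latent activity types (topics).
lda = LatentDirichletAllocation(n_components=3, random_state=0)
topic_dist = lda.fit_transform(dtm)

# Use the dominant topic of each interaction as its activity label.
crm["activity"] = topic_dist.argmax(axis=1)

# The resulting (case_id, activity, timestamp) triples form a conventional
# event log that standard process-discovery algorithms can consume.
event_log = crm[["case_id", "activity", "timestamp"]].sort_values(["case_id", "timestamp"])
print(event_log)
```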

    A unified view of data-intensive flows in business intelligence systems: a survey

    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.
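
    To make the contrast between batched ETL and more operational, runtime data flows concrete, here is a minimal, hypothetical sketch: a batch function that extracts from a source, transforms, and loads into a warehouse table, next to a record-at-a-time function that integrates data as it arrives. The source rows, schema, and in-memory "warehouse" are assumptions for illustration only.

```python
# Minimal sketch contrasting a batched ETL flow with an operational,
# record-at-a-time flow; data sources and schemas are hypothetical.
from datetime import datetime, timezone

def extract(source_rows):
    """Extract: read raw rows from a (hypothetical) operational source."""
    return list(source_rows)

def transform(row):
    """Transform: clean and conform a raw row to the warehouse schema."""
    return {
        "customer": row["customer"].strip().title(),
        "amount_eur": round(float(row["amount"]), 2),
        "loaded_at": datetime.now(timezone.utc).isoformat(),
    }

def load(warehouse, rows):
    """Load: append conformed rows to the (in-memory) warehouse table."""
    warehouse.extend(rows)

# Traditional batched ETL: periodically process the whole extract at once.
def run_batch_etl(source_rows, warehouse):
    load(warehouse, [transform(r) for r in extract(source_rows)])

# Operational flow: integrate each source record at runtime as it arrives.
def on_new_record(record, warehouse):
    load(warehouse, [transform(record)])

warehouse_table = []
run_batch_etl([{"customer": " acme ", "amount": "10.5"}], warehouse_table)
on_new_record({"customer": "globex", "amount": "7"}, warehouse_table)
print(warehouse_table)
```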

    Building Data-Driven Pathways From Routinely Collected Hospital Data: A Case Study on Prostate Cancer

    Background: Routinely collected data in hospitals is complex, typically heterogeneous, and scattered across multiple Hospital Information Systems (HIS). This big data, created as a byproduct of health care activities, has the potential to provide a better understanding of diseases, unearth hidden patterns, and improve services and cost. The extent and uses of such data rely on its quality, which is not consistently checked nor fully understood. Nevertheless, using routine data for the construction of data-driven clinical pathways, describing processes and trends, is a key topic receiving increasing attention in the literature. Traditional algorithms do not cope well with unstructured processes or data, and do not produce clinically meaningful visualizations. Supporting systems that provide additional information, context, and quality assurance inspection are needed. Objective: The objective of the study is to explore how routine hospital data can be used to develop data-driven pathways that describe the journeys that patients take through care, and their potential uses in biomedical research; it proposes a framework for the construction, quality assessment, and visualization of patient pathways for clinical studies and decision support, using a case study on prostate cancer. Methods: Data pertaining to prostate cancer patients were extracted from eight different HIS at a large UK hospital, validated, and complemented with information from the local cancer registry. Data-driven pathways were built for each of the 1904 patients, and an expert knowledge base, containing rules on the prostate cancer biomarker, was used to assess the completeness and utility of the pathways for a specific clinical study. Software components were built to provide meaningful visualizations for the constructed pathways. Results: The proposed framework and pathway formalism enable the summarization, visualization, and querying of complex patient-centric clinical information, as well as the computation of quality indicators and dimensions. A novel graphical representation of the pathways allows the synthesis of such information. Conclusions: Clinical pathways built from routinely collected hospital data can unearth information about patients and diseases that may otherwise be unavailable or overlooked in hospitals. Data-driven clinical pathways allow for heterogeneous data (i.e., semistructured and unstructured data) to be collated over a unified data model and for data quality dimensions to be assessed. This work has enabled further research on prostate cancer and its biomarkers, and on the development and application of methods to mine, compare, analyze, and visualize pathways constructed from routine data. This is an important development for the reuse of big data in hospitals.
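
    As a loose illustration of the pathway-construction step (not the authors' framework), the sketch below groups hypothetical hospital events by patient identifier, orders them chronologically, and renders each patient's journey as a simple sequence, then applies a toy completeness rule in the spirit of the biomarker knowledge base. Field names, event names, and the rule itself are assumptions.

```python
# Illustrative sketch: build per-patient pathways from event records scattered
# across hospital systems and apply a simple completeness rule.
# Field names and the biomarker rule are hypothetical.
from collections import defaultdict

events = [
    {"patient_id": "P1", "timestamp": "2018-03-01", "event": "PSA test"},
    {"patient_id": "P1", "timestamp": "2018-03-20", "event": "Biopsy"},
    {"patient_id": "P2", "timestamp": "2018-04-02", "event": "Outpatient visit"},
    {"patient_id": "P1", "timestamp": "2018-05-10", "event": "Radiotherapy"},
]

# Group events by patient and order them chronologically.
pathways = defaultdict(list)
for e in sorted(events, key=lambda e: e["timestamp"]):
    pathways[e["patient_id"]].append(e["event"])

# Toy quality rule: a pathway used for a biomarker study should contain
# at least one PSA measurement.
def is_complete_for_psa_study(pathway):
    return any("PSA" in step for step in pathway)

for patient, pathway in pathways.items():
    status = "complete" if is_complete_for_psa_study(pathway) else "incomplete"
    print(patient, " -> ".join(pathway), f"({status})")
```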

    Mining Disease Courses across Organizations: A Methodology Based on Process Mining of Diagnosis Events Datasets

    Berlin (Germany), 23-27 July 2019. This work was supported in part by grants TRA2015-63708-R and TRA2016-78886-C3-1-R (Spanish Government) and Topus (Madrid Regional Government).

    Case and Activity Identification for Mining Process Models from Middleware

    Process monitoring aims to provide transparency over operational aspects of a business process. In practice, it is a challenge that traces of business process executions span a number of diverse systems. It is cumbersome manual engineering work to identify which attributes in unstructured event data can serve as case and activity identifiers for extracting and monitoring the business process. Approaches from the literature assume that these identifiers are known a priori and that data is readily available in formats like eXtensible Event Stream (XES). However, in practice this is hardly the case, specifically when event data from different sources are pooled together in event stores. In this paper, we address this research gap by inferring potential case and activity identifiers in a provenance-agnostic way. More specifically, we propose a semi-automatic technique for discovering event relations that are semantically relevant for business process monitoring. The results are evaluated in an industry case study with an international telecommunication provider.
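
    A rough, hypothetical way to think about the identifier-inference problem is to score each attribute of the pooled event data with simple statistics: a plausible case identifier tends to group a handful of events per value, while a plausible activity identifier has low cardinality overall. The heuristic below is only a sketch of that intuition, not the semi-automatic technique proposed in the paper; the attribute names and thresholds are invented.

```python
# Hypothetical heuristic: score attributes of pooled event data as candidate
# case or activity identifiers using simple cardinality statistics.
# This illustrates the problem, not the paper's technique.
import pandas as pd

events = pd.DataFrame({
    "ticket_no":  ["T1", "T1", "T2", "T2", "T3"],
    "operation":  ["open", "assign", "open", "close", "open"],
    "payload_id": ["a9", "b7", "c3", "d1", "e5"],
})

def candidate_scores(df):
    scores = {}
    for col in df.columns:
        n_unique = df[col].nunique()
        events_per_value = len(df) / n_unique
        scores[col] = {
            # Case IDs: each value should group more than one but not too many events.
            "case_score": events_per_value if 1 < events_per_value < 100 else 0.0,
            # Activity IDs: few distinct values repeated across many events.
            "activity_score": 1.0 / n_unique if n_unique < len(df) else 0.0,
        }
    return scores

for col, s in candidate_scores(events).items():
    print(col, s)
```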

    Valuable Business Knowledge Asset Discovery by Processing Unstructured Data.

    Modern organizations are challenged to enact a digital transformation and improve their competitiveness while contributing to the ninth Sustainable Development Goal (SDG), “Build resilient infrastructure, promote sustainable industrialization and foster innovation”. Discovering the knowledge assets hidden in process data may help to digitalize processes. Working on a valuable knowledge asset discovery process, we found a major challenge: organizational data and knowledge are likely to be unstructured and undigitized, constraining the power of today’s process mining methodologies (PMM). Whereas PMM has proved its worth in digitally mature companies, its scope becomes wider with the complement proposed in this paper, embracing organizations that are still improving their digital maturity based on available data. We propose the C4PM method, which integrates agile principles, systems thinking and natural language processing techniques to analyze the behavioral patterns of organizational semi-structured or unstructured data from a holistic perspective, in order to discover valuable hidden information and uncover the related knowledge assets aligned with the organization’s strategic or business goals. Those assets are the key to pointing out potential processes that can be handled using PMM, empowering a sustainable organizational digital transformation. A case study analysis of a dataset containing information on employees’ emails in a multinational company was conducted.
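
    As a loose illustration of mining semi-structured email data for candidate knowledge assets (not the C4PM method itself), the sketch below clusters hypothetical email subjects with TF-IDF and k-means and prints the top terms of each cluster, which an analyst could then map to strategic or business goals. The corpus and the number of clusters are assumptions.

```python
# Illustrative sketch (not C4PM): cluster email subjects to surface candidate
# knowledge assets; the corpus and the number of clusters are assumptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

subjects = [
    "Invoice approval for supplier contract",
    "Supplier contract renewal terms",
    "Onboarding checklist for new hires",
    "New hire laptop and account setup",
    "Invoice payment overdue reminder",
    "Welcome session schedule for new hires",
]

# TF-IDF representation of the unstructured subjects.
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(subjects)

# Cluster into an assumed number of themes.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Print the highest-weighted terms per cluster as candidate knowledge assets.
terms = vectorizer.get_feature_names_out()
for c in range(kmeans.n_clusters):
    top = kmeans.cluster_centers_[c].argsort()[::-1][:4]
    print(f"cluster {c}:", ", ".join(terms[i] for i in top))
```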

    Security and Privacy Issues of Big Data

    This chapter reviews the most important aspects of how computing infrastructures should be configured and intelligently managed to fulfill the most notable security requirements of Big Data applications. One of them is privacy, a pertinent aspect to address because users share more and more personal data and content from their devices and computers with social networks and public clouds; secure frameworks for social networks are therefore a very active research topic. This topic is addressed in one of the two sections of the current chapter with case studies. In addition, traditional mechanisms to support security, such as firewalls and demilitarized zones, are not suitable for computing systems that support Big Data. Software-Defined Networking (SDN) is an emerging management solution that could become a convenient mechanism to implement security in Big Data systems, as we show through a second case study at the end of the chapter, which also discusses current relevant work and identifies open issues.