112 research outputs found

    Measuring the impact of COVID-19 on hospital care pathways

    Care pathways in hospitals around the world reported significant disruption during the recent COVID-19 pandemic, but measuring the actual impact is more problematic. Process mining can be useful for hospital management to measure the conformance of real-life care to what might be considered normal operations. In this study, we aim to demonstrate that process mining can be used to investigate process changes associated with complex disruptive events. We studied perturbations to accident and emergency (A&E) and maternity pathways in a UK public hospital during the COVID-19 pandemic. Coincidentally, the hospital had implemented a Command Centre approach for patient-flow management, affording an opportunity to study both the planned improvement and the disruption due to the pandemic. Our study proposes and demonstrates a method for measuring and investigating the impact of such planned and unplanned disruptions affecting hospital care pathways. We found that during the pandemic, both A&E and maternity pathways had measurable reductions in the mean length of stay and a measurable drop in the percentage of pathways conforming to normative models. There were no distinctive patterns in the monthly mean values of length of stay or conformance throughout the phases of the installation of the hospital’s new Command Centre approach. Due to a deficit in the available A&E data, the findings for A&E pathways could not be interpreted.
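
    A minimal sketch of the two measures used in the study, assuming a pandas event log with hypothetical case_id, activity, and timestamp columns; it is an illustration, not the authors' code:

        import pandas as pd

        # Hypothetical event log: one row per recorded activity in a care pathway.
        events = pd.read_csv("pathway_events.csv", parse_dates=["timestamp"])

        # Mean length of stay per calendar month (first to last event of each case).
        per_case = events.groupby("case_id")["timestamp"].agg(start="min", end="max")
        per_case["los_hours"] = (per_case["end"] - per_case["start"]).dt.total_seconds() / 3600
        per_case["month"] = per_case["start"].dt.to_period("M")
        monthly_los = per_case.groupby("month")["los_hours"].mean()

        # Conformance, simplified here to the fraction of cases whose activity sequence
        # exactly matches one illustrative normative pathway.
        NORMATIVE = ("Arrival", "Triage", "Treatment", "Discharge")
        variants = events.sort_values("timestamp").groupby("case_id")["activity"].apply(tuple)
        conforms = variants.apply(lambda seq: seq == NORMATIVE)
        monthly_conformance = conforms.groupby(per_case["month"]).mean()

        print(monthly_los, monthly_conformance, sep="\n")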

    Leveraging metaheuristics with artificial intelligence for customer churn prediction in telecom industries

    Customer churn prediction (CCP) is among the greatest challenges faced in the telecommunication sector. With progress in the fields of machine learning (ML) and artificial intelligence (AI), the feasibility of CCP has dramatically increased. Therefore, this study presents an artificial intelligence with Jaya optimization algorithm based churn prediction for data exploration (AIJOA-CPDE) technique for human-computer interaction (HCI) applications. The major aim of the AIJOA-CPDE technique is the determination of churned and non-churned customers. In the AIJOA-CPDE technique, an initial stage of feature selection using the JOA, named the JOA-FS technique, is presented to choose feature subsets. For churn prediction, the AIJOA-CPDE technique employs a bidirectional long short-term memory (BDLSTM) model. Lastly, the chicken swarm optimization (CSO) algorithm is applied as a hyperparameter optimizer for the BDLSTM model. A detailed experimental validation of the AIJOA-CPDE technique confirmed its superior performance over existing approaches.
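
    A minimal Keras sketch of the BDLSTM classifier at the core of such a pipeline, with random stand-in data; the JOA feature selection and CSO hyperparameter tuning described above are replaced by fixed illustrative choices, so this is an assumption-laden illustration rather than the paper's implementation:

        import numpy as np
        import tensorflow as tf

        # Stand-in data: 12 monthly usage snapshots of 8 selected features per customer.
        n_customers, timesteps, n_features = 1000, 12, 8
        X = np.random.rand(n_customers, timesteps, n_features).astype("float32")
        y = np.random.randint(0, 2, size=n_customers)  # 1 = churned, 0 = retained

        model = tf.keras.Sequential([
            tf.keras.layers.Input(shape=(timesteps, n_features)),
            tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
            tf.keras.layers.Dense(1, activation="sigmoid"),
        ])
        model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
        model.fit(X, y, epochs=3, batch_size=32, verbose=0)
        print(model.predict(X[:5]).ravel())  # predicted churn probabilities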

    Beyond Flatland: exploring graphs in many dimensions

    Societies, technologies, economies, ecosystems, organisms... Our world is composed of complex networks: systems with many elements that interact in nontrivial ways. Graphs are natural models of these systems, and scientists have made tremendous progress in developing tools for their analysis. However, research has long focused on relatively simple graph representations and problem specifications, often discarding valuable real-world information in the process. In recent years, the limitations of this approach have become increasingly apparent, but we are just starting to comprehend how more intricate data representations and problem formulations might benefit our understanding of relational phenomena. Against this background, our thesis sets out to explore graphs in five dimensions: descriptivity, multiplicity, complexity, expressivity, and responsibility. Leveraging tools from graph theory, information theory, probability theory, geometry, and topology, we develop methods to (1) descriptively compare individual graphs, (2) characterize similarities and differences between groups of multiple graphs, (3) critically assess the complexity of relational data representations and their associated scientific culture, (4) extract expressive features from and for hypergraphs, and (5) responsibly mitigate the risks induced by graph-structured content recommendations. Thus, our thesis is naturally situated at the intersection of graph mining, graph learning, and network analysis.
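
    A hedged illustration of dimension (1), descriptively comparing two graphs by a simple summary statistic; the measure and the random example graphs are assumptions, not taken from the thesis:

        import networkx as nx
        import numpy as np

        def degree_distribution(g, max_deg):
            """Empirical degree distribution padded to a common support."""
            hist = np.zeros(max_deg + 1)
            for _, d in g.degree():
                hist[d] += 1
            return hist / hist.sum()

        g1 = nx.erdos_renyi_graph(200, 0.05, seed=1)
        g2 = nx.barabasi_albert_graph(200, 5, seed=1)
        max_deg = max(d for g in (g1, g2) for _, d in g.degree())
        p, q = degree_distribution(g1, max_deg), degree_distribution(g2, max_deg)
        print("total variation distance:", 0.5 * np.abs(p - q).sum())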

    Research Paper: Process Mining and Synthetic Health Data: Reflections and Lessons Learnt

    Analysing the treatment pathways in real-world health data can provide valuable insight for clinicians and decision-makers. However, the procedures for acquiring real-world data for research can be restrictive and time-consuming, and risk disclosing identifiable information. Synthetic data might enable representative analysis without direct access to sensitive data. In the first part of our paper, we propose an approach for grading synthetic data for process analysis based on its fidelity to relationships found in real-world data. In the second part, we apply our grading approach by assessing cancer patient pathways in a synthetic healthcare dataset (the Simulacrum, provided by the English National Cancer Registration and Analysis Service) using process mining. Visualisations of the patient pathways within the synthetic data appear plausible, showing relationships between events confirmed in the underlying non-synthetic data. Data quality issues are also present within the synthetic data, reflecting both real-world problems and artefacts from the synthetic dataset’s creation. Process mining of synthetic data in healthcare is an emerging field with novel challenges. We conclude that researchers should be aware of the risks when extrapolating results produced from research on synthetic data to real-world scenarios, and should assess findings with analysts who are able to view the underlying data.
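
    One way to operationalize the grading idea, sketched under the assumption of pandas event logs with case_id, activity, and timestamp columns: compare the relative frequencies of directly-follows pairs in the real and synthetic logs. This is an illustration, not the authors' grading scheme:

        import pandas as pd

        def directly_follows(events):
            """Relative frequency of each (activity -> next activity) pair within cases."""
            ordered = events.sort_values(["case_id", "timestamp"])
            pairs = pd.DataFrame({
                "src": ordered["activity"],
                "dst": ordered.groupby("case_id")["activity"].shift(-1),
            }).dropna()
            return pairs.value_counts(normalize=True)

        real = pd.read_csv("real_log.csv", parse_dates=["timestamp"])            # hypothetical files
        synthetic = pd.read_csv("synthetic_log.csv", parse_dates=["timestamp"])
        gap = directly_follows(real).subtract(directly_follows(synthetic), fill_value=0)
        print(gap.abs().sort_values(ascending=False).head(10))  # largest fidelity gaps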

    Exploring the Existing and Unknown Side Effects of Privacy Preserving Data Mining Algorithms

    The data mining sanitization process involves converting data by masking its sensitive parts before releasing it to the public domain. During sanitization, side effects such as hiding failure, missing cost, and artificial cost have been observed. Privacy Preserving Data Mining (PPDM) algorithms were developed for the sanitization process to overcome information loss while maintaining data integrity, and many PPDM algorithms, based on different PPDM techniques, have been developed to reduce these side effects. However, previous studies have not explored or justified why non-traditional side effects were given little importance. This study reports the findings on the side effects of PPDM algorithms in a newly created web repository. The research methodology adopted was Design Science Research (DSR), and the work was conducted in four phases. The first phase addressed the characteristics, similarities, differences, and relationships of existing side effects. The second phase identified the characteristics of non-traditional side effects. The third phase used the Privacy Preservation and Security Framework (PPSF) tool to test whether non-traditional side effects occur in PPDM algorithms; this phase also attempted to find additional, previously unreported side effects. The PPDM algorithms considered were Greedy, POS2DT, SIF_IDF, cpGA2DT, pGA2DT, and sGA2DT, and the associated PPDM techniques were anonymization, perturbation, randomization, condensation, heuristic, reconstruction, and cryptography. The final phase involved creating a new online web repository to report all the side effects found for the PPDM algorithms; it was built with full-stack web development using the AngularJS, Spring, Spring Boot, and Hibernate frameworks. The results catalogue the various PPDM algorithms and their side effects, clarify the relationship and mutual impact of hiding failure, missing cost, and artificial cost, and relate the side effects to the type of data involved (sensitive, non-sensitive, or new). The web repository acts as a quick reference for PPDM algorithms; developing, improving, and reporting PPDM algorithms remains necessary, and this study should encourage researchers and organizations to report, use, reuse, or develop better PPDM algorithms.
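
    The three classic side effects named above can be made concrete with a small set-based sketch over the frequent patterns mined before and after sanitization; this formulation is a common textbook one and is not taken from the study:

        def side_effects(original_patterns, sanitized_patterns, sensitive_patterns):
            """Return (hiding failure, missing cost, artificial cost) as fractions."""
            original, sanitized = set(original_patterns), set(sanitized_patterns)
            sensitive = set(sensitive_patterns)
            nonsensitive = original - sensitive
            hiding_failure = len(sensitive & sanitized) / len(sensitive) if sensitive else 0.0
            missing_cost = len(nonsensitive - sanitized) / len(nonsensitive) if nonsensitive else 0.0
            artificial_cost = len(sanitized - original) / len(sanitized) if sanitized else 0.0
            return hiding_failure, missing_cost, artificial_cost

        # Toy example: one sensitive pattern survives, one legitimate pattern is lost,
        # and one spurious pattern appears after sanitization.
        original = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}
        sensitive = {("a", "b"), ("c", "d")}
        sanitized = {("b", "c"), ("c", "d"), ("d", "e")}
        print(side_effects(original, sanitized, sensitive))  # (0.5, 0.5, 0.33...)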

    Edge/Fog Computing Technologies for IoT Infrastructure

    The prevalence of smart devices and cloud computing has led to an explosion in the amount of data generated by IoT devices. Moreover, emerging IoT applications, such as augmented and virtual reality (AR/VR), intelligent transportation systems, and smart factories, require ultra-low latency for data communication and processing. Fog/edge computing is a new computing paradigm in which fully distributed fog/edge nodes located near end devices provide computing resources. By analyzing, filtering, and processing data at local fog/edge resources instead of transferring enormous volumes of data to centralized cloud servers, fog/edge computing can significantly reduce processing delay and network traffic. With these advantages, fog/edge computing is expected to be one of the key enabling technologies for building the IoT infrastructure. Aiming to explore recent research and development on fog/edge computing technologies for building an IoT infrastructure, this book collects ten articles. The selected articles cover diverse topics such as resource management, service provisioning, task offloading and scheduling, container orchestration, and security on edge/fog computing infrastructure, helping readers grasp recent trends as well as state-of-the-art algorithms in fog/edge computing technologies.
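
    The latency argument can be illustrated with a toy offloading decision under assumed link speeds and node capacities; the numbers and the simple cost model are made up for illustration and are not from the book:

        def completion_time(data_mb, link_mbps, node_gops, task_gop):
            """Transfer time plus compute time, in seconds."""
            return (data_mb * 8) / link_mbps + task_gop / node_gops

        data_mb, task_gop = 50, 20  # payload size and compute demand of one task
        edge = completion_time(data_mb, link_mbps=100, node_gops=50, task_gop=task_gop)
        cloud = completion_time(data_mb, link_mbps=20, node_gops=500, task_gop=task_gop)
        print(f"edge: {edge:.2f}s, cloud: {cloud:.2f}s ->",
              "process at edge" if edge < cloud else "offload to cloud")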

    Big Data and Artificial Intelligence in Digital Finance

    This open access book presents how cutting-edge digital technologies like Big Data, Machine Learning, Artificial Intelligence (AI), and Blockchain are set to disrupt the financial sector. The book illustrates how recent advances in these technologies enable banks, FinTechs, and financial institutions to collect, process, analyze, and fully leverage the very large amounts of data that are nowadays produced and exchanged in the sector. To this end, the book also describes some of the most popular Big Data, AI, and Blockchain applications in the sector, including novel applications in the areas of Know Your Customer (KYC), Personalized Wealth Management and Asset Management, and Portfolio Risk Assessment, as well as a variety of novel Usage-based Insurance applications based on Internet-of-Things data. Most of the presented applications have been developed, deployed, and validated in real-life digital finance settings in the context of the European Commission funded INFINITECH project, a flagship innovation initiative for Big Data and AI in digital finance. This book is ideal for researchers and practitioners in Big Data, AI, banking, and digital finance.

    Bioinformatics Applications Based On Machine Learning

    The great advances in information technology (IT) have implications for many sectors, such as bioinformatics, and have considerably increased their possibilities. This book presents a collection of 11 original research papers, all related to the application of IT techniques within the bioinformatics sector: from new applications created by adapting and applying existing techniques, to new methodologies developed to solve existing problems.

    On Privacy-Enhanced Distributed Analytics in Online Social Networks

    More than half of the world's population benefits from online social network (OSN) services. A considerable part of these services is mainly based on applying analytics to user data to infer their preferences and enrich their experience accordingly. At the same time, user data is monetized by service providers to run their business models. Therefore, providers tend to collect (personal) data about users extensively. However, this data is oftentimes used for various purposes without the informed consent of the users. Providers share this data in different forms with third parties (e.g., data brokers). Moreover, users' sensitive data has repeatedly been subject to unauthorized access by malicious parties. These issues have demonstrated the insufficient commitment of providers to user privacy and, consequently, raised users' concerns. Despite the emergence of privacy regulations (e.g., GDPR and CCPA), recent studies have shown that the collection of personal data and the sharing of sensitive data are still continuously increasing. A number of privacy-friendly OSNs have been proposed to enhance user privacy by reducing the need for central service providers. However, this improvement in privacy protection usually comes at the cost of losing the social connectivity and many analytics-based services of the widespread OSNs. This dissertation addresses this issue by first proposing an approach to privacy-friendly OSNs that maintains established social connections. Second, approaches that allow users to collaboratively apply distributed analytics while preserving their privacy are presented. Finally, the dissertation contributes to better assessment and mitigation of the risks associated with distributed analytics. These three research directions are treated through the following six contributions.
    Conceptualizing Hybrid Online Social Networks: We conceptualize a hybrid approach to privacy-friendly OSNs, HOSN. This approach combines the benefits of using COSNs and DOSNs. Users can maintain their social experience in their preferred COSN while being provided with additional means to enhance their privacy. Users can seamlessly post public content or private content that is accessible only by authorized users (friends), beyond the reach of the service providers.
    Improving the Trustworthiness of HOSNs: We conceptualize software features to address users' privacy concerns in OSNs. We prototype these features in our HOSN approach and evaluate their impact on privacy concerns and the trustworthiness of the approach. We also analyze the relationships between four important aspects that influence users' behavior in OSNs: privacy concerns, trust beliefs, risk beliefs, and the willingness to use.
    Privacy-Enhanced Association Rule Mining: We present an approach that enables users to efficiently apply privacy-enhanced association rule mining on distributed data. This approach can be employed in DOSNs and HOSNs to generate recommendations. We leverage a privacy-enhanced distributed graph sampling method to reduce the data required for the mining and to lower the communication and computational overhead. We then apply a distributed frequent itemset mining algorithm in a privacy-friendly manner.
    Privacy Enhancements on Federated Learning (FL): We identify several privacy-related issues in the emerging distributed machine learning technique, FL. These issues are mainly due to the centralized nature of this technique. We discuss tackling them by applying FL in a hierarchical architecture. The benefits of this approach include a reduction in the centralization of control and the ability to place defense and verification methods more flexibly and efficiently within the hierarchy.
    Systematic Analysis of Threats in Federated Learning: We conduct a critical study of the existing attacks on FL to better understand the actual risk of these attacks under real-world scenarios. First, we structure the literature in this field and show the research foci and gaps. Then, we highlight a number of issues in (1) the assumptions commonly made by researchers and (2) the evaluation practices. Finally, we discuss the implications of these issues for the applicability of the proposed attacks and recommend several remedies.
    Label Leakage from Gradients: We identify a risk of information leakage when sharing gradients in FL. We demonstrate the severity of this risk by proposing a novel attack that extracts the user annotations that describe the data (i.e., ground-truth labels) from gradients. We show the high effectiveness of the attack under different settings, such as different datasets and model architectures. We also test several defense mechanisms to mitigate this attack and identify the effective ones.
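
    A small numpy sketch of why ground-truth labels can leak from shared gradients: for softmax cross-entropy, the last-layer weight gradient is (p - onehot(y)) outer h, and with non-negative activations h only the row belonging to the true label has a negative sum. Shapes and values are illustrative and not taken from the dissertation's attack:

        import numpy as np

        rng = np.random.default_rng(0)
        n_classes, hidden = 5, 16
        h = np.abs(rng.normal(size=hidden))   # non-negative activations (e.g. after ReLU)
        W = rng.normal(size=(n_classes, hidden))
        y = 3                                  # ground-truth label known only to the client

        logits = W @ h
        p = np.exp(logits - logits.max())
        p /= p.sum()
        grad_W = np.outer(p - np.eye(n_classes)[y], h)   # gradient a client would share

        recovered = int(np.argmin(grad_W.sum(axis=1)))   # the only non-positive row sum
        print(recovered == y)                             # True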

    Methoden und Werkzeuge für eine datengetriebene Entwicklung digitaler Gesundheitsanwendungen

    Following the paradigm of precision medicine, digital health applications lay the foundation for personalized care and thereby aim to increase the efficiency and effectiveness of healthcare systems. In the context of digital health ecosystems emerging worldwide, data is the driving factor at the centre of the development process. This thesis investigates which methods and tools are needed to support the resulting interplay between data-driven and knowledge-based development of digital health applications, and describes them in a framework. Applying the Design Science Research method, the corresponding artifacts are designed following a problem-initiated approach, implemented, and evaluated with quantitative and qualitative methods. To this end, a process model is first derived that addresses the questions to be answered in the phases of digitization, automation, and optimization, through to translation into medical care. Taking relevant standards into account, interdisciplinary methods, requirements, and technological approaches are combined into a knowledge base, which lays the groundwork for the tools to be developed. These are explored in the application context of dementia syndromes, demonstrated for each artifact, and validated in detail from multiple perspectives with n participants. In cooperation with a gerontopsychiatric clinic, domain-specific requirements for digital health applications are determined; as an example, an outpatient system for measuring cognitive performance parameters is developed exploratively. A field study conducted within this collaboration (n=55) with cognitively impaired persons shows the potentials and challenges arising from the digital collection, networking, and analysis of neuropsychological data. Requirements regarding the target-group-specific design of a usable user interface (n=91) are also collected, consolidated into a guideline, and iteratively implemented in a graphical user interface. From the perspective of data subjects (n=238), it is additionally examined how important self-determined handling of this kind of personal data is to them and for which purposes such data should, in their view, be used. In the course of this development process, approaches to automating and optimizing data analysis for deriving health status are also required. Besides results comparing different machine learning algorithms, these steps deliver as artifacts the identification of suitable performance and optimization measures as well as feature selection procedures. Compared with threshold-based procedures for operationalizing assessment metrics (maximum Cohen's kappa κ = 0.67), the machine-learning-supported software application achieves a higher average sensitivity of 83% at a specificity of 93% (maximum Cohen's kappa κ = 0.79) for detecting cognitive impairment. The automated acquisition of the necessary features is realized through newly developed approaches and points to future research activities addressing the associated challenges. Indicators are identified that reveal the potential of computer-assisted models. These provide additional insights into the tension between reliably fulfilling clinical guidelines and regulatory implications, particularly regarding the explainability of data-driven optimization and automation approaches. An examination of the transfer potential into standard care in Germany from the perspective of different stakeholders underlines these points. Tools and methods designed for this purpose enable, on the one hand, the empirical investigation of adherence to such digital solutions in terms of willingness to use (n=29) and its development over time (n=18). On the other hand, they are used to survey the acceptance criteria of care providers organized under the statutory health insurance scheme in the German healthcare system (n=301) and to show how these influence market entry strategies. Building on this, paths are defined to help relieve the burden on the healthcare system. The collected findings are bundled into a holistic platform concept for developing personalized prevention and treatment programs.
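
    A small sketch of the reported evaluation measures (sensitivity, specificity, Cohen's kappa) for a binary screening model that flags cognitive impairment; the labels and predictions below are made up and only illustrate how such figures are computed:

        from sklearn.metrics import cohen_kappa_score, confusion_matrix

        y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]  # 1 = cognitive impairment present
        y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}, "
              f"kappa={cohen_kappa_score(y_true, y_pred):.2f}")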