
    A computational framework for unsupervised analysis of everyday human activities

    In order to make computers proactive and assistive, we must enable them to perceive, learn, and predict what is happening in their surroundings. This presents us with the challenge of formalizing computational models of everyday human activities. For a majority of environments, the structure of the in situ activities is generally not known a priori. This thesis therefore investigates knowledge representations and manipulation techniques that can facilitate learning of such everyday human activities in a minimally supervised manner. A key step towards this end is finding appropriate representations for human activities. We posit that if we choose to describe activities as finite sequences of an appropriate set of events, then the global structure of these activities can be uniquely encoded using their local event subsequences. With this perspective at hand, we particularly investigate representations that characterize activities in terms of their fixed- and variable-length event subsequences. We comparatively analyze these representations in terms of their representational scope, feature cardinality, and noise sensitivity. Exploiting such representations, we propose a computational framework to discover the various activity-classes taking place in an environment. We model these activity-classes as maximally similar activity-cliques in a completely connected graph of activities, and describe how to discover them efficiently. Moreover, we propose methods for finding concise characterizations of these discovered activity-classes, both from a holistic as well as a by-parts perspective. Using such characterizations, we present an incremental method to classify a new activity instance to one of the discovered activity-classes, and to automatically detect if it is anomalous with respect to the general characteristics of its membership class.
    Our results show the efficacy of our framework in a variety of everyday environments.
    Ph.D. Committee Chair: Aaron Bobick; Committee Member: Charles Isbell; Committee Member: David Hogg; Committee Member: Irfan Essa; Committee Member: James Reh
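    The fixed-length event-subsequence representation described above can be sketched as an n-gram count over an event sequence. This is an illustrative reconstruction under stated assumptions, not the thesis code; the `ngram_features` helper and the event names are hypothetical:

    ```python
    from collections import Counter

    def ngram_features(activity, n=2):
        """Represent an activity (a finite sequence of events) by the counts
        of its fixed-length event subsequences (n-grams).

        Hypothetical helper illustrating the representation; the thesis
        also studies variable-length subsequences, which this omits.
        """
        return Counter(
            tuple(activity[i:i + n]) for i in range(len(activity) - n + 1)
        )

    # A toy activity as an event sequence (event names are made up).
    activity = ["enter", "sit", "type", "sit", "type", "leave"]
    feats = ngram_features(activity, n=2)
    ```

    Larger n widens the representational scope but grows feature cardinality, which is the trade-off the abstract's comparative analysis addresses.
    
    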

    Reservoir Computing for Learning in Structured Domains

    The study of learning models for directly processing complex data structures has gained increasing interest within the Machine Learning (ML) community over the last decades. In this regard, the efficiency, effectiveness, and adaptivity of ML models on large classes of data structures represent challenging and open research issues. The paradigm under consideration is Reservoir Computing (RC), a novel and extremely efficient methodology for modeling Recurrent Neural Networks (RNNs) for adaptive sequence processing. RC comprises a number of different neural models, among which the Echo State Network (ESN) is probably the most popular, widely used, and studied one. Another research area of interest is represented by Recursive Neural Networks (RecNNs), a class of neural network models recently proposed for dealing directly with hierarchical data structures. In this thesis the RC paradigm is investigated and suitably generalized in order to approach the problems arising from learning in structured domains. The research studies described in this thesis cover classes of data structures of increasing complexity, from sequences, to trees, to graphs. Accordingly, the research focus moves progressively from the analysis of standard ESNs for sequence processing to the development of new models for tree- and graph-structured domains. The analysis of ESNs for sequence processing addresses the interesting problem of identifying and characterizing the relevant factors that influence the reservoir dynamics and ESN performance. Promising applications of ESNs in the emerging field of Ambient Assisted Living are also presented and discussed.
    Moving towards highly structured data representations, the ESN model is extended to deal with complex structures directly, resulting in the proposed TreeESN, which is suitable for domains comprising hierarchical structures, and GraphESN, which generalizes the approach to a large class of cyclic/acyclic directed/undirected labeled graphs. TreeESNs and GraphESNs represent both novel RC models for structured data and extremely efficient approaches for modeling RecNNs, ultimately contributing to the definition of an RC framework for learning in structured domains. The problem of adaptively exploiting the state space in GraphESNs is also investigated, with specific regard to tasks in which input graphs must be mapped into flat vectorial outputs, resulting in the GraphESN-wnn and GraphESN-NG models. Finally, the generalization performance of the proposed models is evaluated on both artificial and complex real-world tasks from different application domains, including Chemistry, Toxicology, and Document Processing.
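    The standard ESN dynamics these models build on can be sketched with the usual state update x(t) = tanh(W_in u(t) + W x(t-1)), where the reservoir W is fixed and randomly initialized with spectral radius below 1 (a common sufficient-in-practice condition for the echo state property). The dimensions and scaling constants below are illustrative assumptions, not those of the thesis:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_res = 3, 50

    # Fixed random input weights and reservoir; the reservoir is rescaled
    # so its spectral radius is 0.9 (below 1).
    W_in = rng.uniform(-0.1, 0.1, size=(n_res, n_in))
    W = rng.uniform(-1.0, 1.0, size=(n_res, n_res))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

    def esn_states(inputs):
        """Drive the untrained reservoir with an input sequence and
        collect the state trajectory; only a readout on top of these
        states would be trained in an ESN."""
        x = np.zeros(n_res)
        states = []
        for u in inputs:
            x = np.tanh(W_in @ u + W @ x)
            states.append(x)
        return np.array(states)

    seq = rng.normal(size=(20, n_in))  # a toy input sequence
    X = esn_states(seq)
    ```

    TreeESN and GraphESN generalize this same fixed-reservoir encoding from sequences to vertices of trees and graphs, which is why they inherit the ESN's training efficiency.
    
    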

    Process Mining Workshops

    This open access book constitutes revised selected papers from the International Workshops held at the Third International Conference on Process Mining, ICPM 2021, which took place in Eindhoven, The Netherlands, during October 31–November 4, 2021. The conference focuses on the area of process mining research and practice, including theory, algorithmic challenges, and applications. The co-located workshops provided a forum for novel research ideas. The 28 papers included in this volume were carefully reviewed and selected from 65 submissions. They stem from the following workshops:
    - 2nd International Workshop on Event Data and Behavioral Analytics (EDBA)
    - 2nd International Workshop on Leveraging Machine Learning in Process Mining (ML4PM)
    - 2nd International Workshop on Streaming Analytics for Process Mining (SA4PM)
    - 6th International Workshop on Process Querying, Manipulation, and Intelligence (PQMI)
    - 4th International Workshop on Process-Oriented Data Science for Healthcare (PODS4H)
    - 2nd International Workshop on Trust, Privacy, and Security in Process Analytics (TPSA)
    One survey paper on the results of the XES 2.0 Workshop is also included.

    Process Mining Handbook

    This open access book comprises all the individual courses given as part of the First Summer School on Process Mining, PMSS 2022, which was held in Aachen, Germany, during July 4-8, 2022. This volume contains 17 chapters organized into the following topical sections: introduction; process discovery; conformance checking; data preprocessing; process enhancement and monitoring; assorted process mining topics; industrial perspective and applications; and closing.

    Human-aware application of data science techniques

    In recent years there has been an increase in the use of artificial intelligence and other data-based techniques to automate decision-making in companies and to discover new knowledge in research. In many cases, this has been done using very complex algorithms (so-called black-box algorithms), which are capable of detecting very complex patterns but unfortunately remain nearly uninterpretable. Recently, many researchers and regulatory institutions have begun to raise awareness about their use. On the one hand, the subjects affected by these decisions increasingly question their use, as they may be victims of biases or erroneous predictions. On the other hand, the companies and institutions that use these algorithms want to understand what their algorithms do, extract new knowledge, prevent errors, and improve their predictions in general. As a result, researchers have started to focus on the interpretability of their algorithms (for example, through explainable algorithms), and regulatory institutions have started to regulate the use of data to ensure ethical aspects such as accountability and fairness. This thesis brings together three data science projects in which black-box predictive machine learning has been implemented to make predictions:
    - The development of an NTL (non-technical loss) detection system for an international utility company from Spain (Naturgy). We combine a black-box algorithm and an explanatory algorithm to guarantee our system's accuracy, transparency, and robustness. Moreover, we focus our efforts on empowering the stakeholder to play an active role in the model training process.
    - A collaboration with the University of Padova to provide explainability to a Deep Learning-based KPI system currently implemented by the MyInvenio company.
    - A collaboration between the author of the thesis and the Universitat de Barcelona to implement an AI solution (a black-box algorithm combined with an explanatory algorithm) for a social science problem.
    The unique characteristics of each project allow us to offer in this thesis a comprehensive analysis of the challenges and problems that must be overcome to achieve a fair, transparent, unbiased, and generalizable use of data in a data science project. With the feedback arising from the research carried out to provide satisfactory solutions to these three projects, we aim to:
    - Understand the reasons why a prediction model can be regarded as unfair or untruthful, making the model not generalizable, and the consequences from a technical point of view in terms of low model accuracy, but also how this can affect us as a society.
    - Determine and correct (or at least mitigate) the situations that cause problems in the robustness and fairness of our data.
    - Assess the difference between interpretable algorithms and black-box algorithms, and evaluate how well explanatory algorithms can explain the predictions made by the predictive algorithms.
    - Highlight the stakeholder's role in guaranteeing a robust model, and show how to convert a data-driven approach to solving a predictive problem into a data-informed approach, where data patterns and human knowledge are combined to maximize profit.
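    A minimal sketch of the black-box-plus-explainer pattern described above, using scikit-learn's permutation importance as a generic stand-in for the thesis's explanatory algorithm (the synthetic dataset and model choice are illustrative assumptions, not the thesis pipeline):

    ```python
    from sklearn.datasets import make_classification
    from sklearn.ensemble import GradientBoostingClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    # Toy tabular task standing in for e.g. NTL detection data.
    X, y = make_classification(
        n_samples=400, n_features=8, n_informative=3, random_state=0
    )
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Black-box predictive model.
    model = GradientBoostingClassifier(random_state=0).fit(X_tr, y_tr)

    # Model-agnostic explanation step: shuffle each feature on held-out
    # data and measure the drop in score it causes.
    result = permutation_importance(
        model, X_te, y_te, n_repeats=10, random_state=0
    )
    ranking = result.importances_mean.argsort()[::-1]  # most influential first
    ```

    Surfacing such a ranking to stakeholders is one concrete way to let them sanity-check what the black-box model relies on, in the spirit of the active stakeholder role the thesis advocates.
    
    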

    Advances in Robotics, Automation and Control

    The book presents an excellent overview of recent developments in the different areas of Robotics, Automation and Control. Through its 24 chapters, this book presents topics related to control and robot design; it also introduces new mathematical tools and techniques devoted to improving system modeling and control. An important point is the use of rational agents and heuristic techniques to cope with the computational complexity required for controlling complex systems. The book also covers navigation and vision algorithms, automatic handwriting comprehension, and speech recognition systems that will be included in the next generation of production systems.

    Research Paper: Process Mining and Synthetic Health Data: Reflections and Lessons Learnt

    Analysing the treatment pathways in real-world health data can provide valuable insight for clinicians and decision-makers. However, the procedures for acquiring real-world data for research can be restrictive and time-consuming, and risk disclosing identifiable information. Synthetic data might enable representative analysis without direct access to sensitive data. In the first part of our paper, we propose an approach for grading synthetic data for process analysis based on its fidelity to relationships found in real-world data. In the second part, we apply our grading approach by assessing cancer patient pathways in a synthetic healthcare dataset (The Simulacrum, provided by the English National Cancer Registration and Analysis Service) using process mining. Visualisations of the patient pathways within the synthetic data appear plausible, showing relationships between events confirmed in the underlying non-synthetic data. Data quality issues are also present within the synthetic data, reflecting both real-world problems and artefacts of the synthetic dataset's creation. Process mining of synthetic data in healthcare is an emerging field with novel challenges. We conclude that researchers should be aware of the risks when extrapolating results produced from research on synthetic data to real-world scenarios, and should assess findings with analysts who are able to view the underlying data.
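    One basic relationship such a grading approach can compare between a synthetic and a real-world event log is the directly-follows relation used throughout process mining. The sketch below is illustrative only; the `directly_follows` helper and the toy cases are hypothetical, not the paper's code:

    ```python
    from collections import Counter

    def directly_follows(log):
        """Count how often activity b directly follows activity a across
        the traces of an event log (`log` maps case id -> ordered list
        of activities). Comparing these counts between synthetic and
        real-world logs is one fidelity check."""
        dfg = Counter()
        for trace in log.values():
            for a, b in zip(trace, trace[1:]):
                dfg[(a, b)] += 1
        return dfg

    # Toy cancer-pathway log (activity names are made up).
    log = {
        "case1": ["diagnosis", "surgery", "chemo"],
        "case2": ["diagnosis", "chemo"],
    }
    dfg = directly_follows(log)
    ```

    If the synthetic log's directly-follows counts diverge sharply from the real-world log's, the synthetic data is a poor basis for pathway analysis on that relation.
    
    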