A multilabel classification approach for complex human activities using a combination of emerging patterns and fuzzy sets
In our daily lives, humans perform different Activities of Daily Living (ADL), such as cooking and studying. By nature, humans perform these activities either in a sequential/simple or an overlapping/complex scenario. Many research attempts have addressed simple activity recognition, but complex activity recognition remains a challenging issue. Recognition of complex activities is a multilabel classification problem, in which a test instance is assigned to multiple overlapping activities. Existing data-driven techniques for complex activity recognition can recognize at most two overlapping activities and require a training dataset of complex (i.e. multilabel) activities. In this paper, we propose a multilabel classification approach for complex activity recognition using a combination of Emerging Patterns and Fuzzy Sets. Our approach requires a training dataset of only simple (i.e. single-label) activities. First, we use a pattern mining technique to extract discriminative features called Strong Jumping Emerging Patterns (SJEPs) that exclusively represent each activity. Then, our scoring function takes SJEPs and fuzzy membership values of incoming sensor data and outputs the activity label(s). We validate our approach using two different datasets. Experimental results demonstrate the efficiency and superiority of our approach against other approaches.
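The scoring step can be pictured with a small sketch; this is a minimal illustration, not the authors' implementation. Each activity is represented by a few hypothetical SJEPs (sets of sensor events), each incoming reading contributes a fuzzy membership value, and every activity whose best-matching pattern scores above a threshold is reported, which allows overlapping labels. The pattern sets, membership function, and threshold below are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical SJEPs per activity: each pattern is a set of sensor events
# that (ideally) appears only in traces of that activity.
sjeps = {
    "cooking":  [{"stove_on", "fridge_open"}, {"stove_on", "cupboard_open"}],
    "studying": [{"desk_lamp_on", "laptop_on"}],
}

def fuzzy_membership(signal_strength):
    """Toy membership function: how strongly a sensor event is 'active', in [0, 1]."""
    return float(np.clip(signal_strength, 0.0, 1.0))

def score_activities(window, threshold=0.5):
    """Return every activity whose best-matching SJEP scores above the threshold.

    window: dict mapping sensor event -> raw signal strength for one time window.
    Overlapping (multilabel) activities can therefore be reported together.
    """
    memberships = {event: fuzzy_membership(s) for event, s in window.items()}
    labels = {}
    for activity, patterns in sjeps.items():
        best = 0.0
        for pattern in patterns:
            # A pattern contributes only if all of its events were observed;
            # its score is the weakest membership among them (fuzzy AND as min).
            if pattern <= memberships.keys():
                best = max(best, min(memberships[e] for e in pattern))
        labels[activity] = best
    return [activity for activity, score in labels.items() if score >= threshold]

# Overlapping scenario: cooking while studying.
window = {"stove_on": 0.9, "fridge_open": 0.7, "desk_lamp_on": 0.8, "laptop_on": 0.6}
print(score_activities(window))  # ['cooking', 'studying']
```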
Improving Localization Accuracy: Successive Measurements Error Modeling
Vehicle self-localization is an essential requirement for many of the safety applications envisioned for vehicular networks. The mathematical models used in current vehicular localization schemes focus on modeling the localization error itself and overlook the potential correlation between successive localization measurement errors. In this paper, we first investigate the existence of correlation between successive positioning measurements, and then incorporate this correlation into the modeling of the positioning error. We use the Yule-Walker equations to determine the degree of correlation between a vehicle’s future position and its past positions, and then propose a p-order Gauss–Markov model to predict the future position of a vehicle from its past p positions. We investigate the existence of correlation for two datasets representing the mobility traces of two vehicles over a period of time. We prove the existence of correlation between successive measurements in the two datasets, and show that the time correlation between measurements can have a value of up to four minutes. Through simulations, we validate the robustness of our model and show that it is possible to use the first-order Gauss–Markov model, which has the least complexity, and still maintain an accurate estimation of a vehicle’s future location over time using only its current position. Our model can assist in providing better modeling of positioning errors and can be used as a prediction tool to improve the performance of classical localization algorithms such as the Kalman filter.
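The general idea of fitting the correlation structure and then predicting from past positions can be sketched as follows; this is a minimal illustration on synthetic data, not the authors' model. The Yule-Walker equations estimate autoregressive coefficients from the sample autocovariances of a coordinate trace, and a p-order linear predictor then forecasts the next value from the p most recent ones. The synthetic trace, the choice p = 3, and working on one coordinate only are assumptions.

```python
import numpy as np

def yule_walker(x, order):
    """Estimate the coefficients of an order-p autoregressive model via the Yule-Walker equations."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased sample autocovariances r[0], ..., r[order].
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    # Toeplitz system R * phi = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def predict_next(history, coeffs):
    """Predict the next value from the p most recent ones (p = len(coeffs))."""
    p = len(coeffs)
    mean = float(np.mean(history))
    recent = np.asarray(history[-p:], dtype=float) - mean
    return mean + float(np.dot(coeffs, recent[::-1]))  # coeffs[0] weights lag 1

# Synthetic 1-D position trace (stand-in for one GPS coordinate of a vehicle).
rng = np.random.default_rng(0)
steps = 0.8 * np.sin(np.linspace(0, 20, 300)) + rng.normal(0, 0.05, 300)
positions = np.cumsum(steps)

p = 3                                        # assumed model order
increments = np.diff(positions)              # work on (roughly stationary) position increments
phi = yule_walker(increments, order=p)       # correlation structure of the motion
next_position = positions[-1] + predict_next(increments, phi)
print(f"predicted next position: {next_position:.3f}")
```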
Data Management for the Internet of Things: Design Primitives and Solution
The Internet of Things (IoT) is a networking paradigm where interconnected, smart objects continuously generate data and transmit it over the Internet. Many of the IoT initiatives are geared towards manufacturing low-cost and energy-efficient hardware for these objects, as well as the communication technologies that provide object interconnectivity. However, the solutions to manage and utilize the massive volume of data produced by these objects are yet to mature. Traditional database management solutions fall short of satisfying the sophisticated application needs of an IoT network that has a truly global scale. Current solutions for IoT data management address partial aspects of the IoT environment, with a special focus on sensor networks. In this paper, we survey the data management solutions that have been proposed for the IoT or its subsystems. We highlight the distinctive design primitives that we believe should be addressed in an IoT data management solution, and discuss how they are approached by the proposed solutions. We finally propose a data management framework for the IoT that takes the discussed design elements into consideration and acts as a seed for a comprehensive IoT data management solution. The framework we propose adopts a federated, data- and source-centric approach to link the diverse Things, with their abundance of data, to the potential applications and services envisioned for the IoT.
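One way to picture a federated, source-centric layer is a thin registry that keeps heterogeneous sources (a live sensor feed, a local database, a cloud archive) behind a single query interface. The sketch below is purely illustrative and is not the framework proposed in the paper; all class and method names are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Source:
    """A registered data source: name, declared capabilities, and a query callable."""
    name: str
    capabilities: set
    query: Callable[[dict], List[dict]]

@dataclass
class FederatedCatalog:
    """Source-centric registry that fans a request out to every capable source."""
    sources: Dict[str, Source] = field(default_factory=dict)

    def register(self, source: Source) -> None:
        self.sources[source.name] = source

    def query(self, request: dict) -> List[dict]:
        needed = set(request.get("capabilities", []))
        results: List[dict] = []
        for source in self.sources.values():
            if needed <= source.capabilities:   # only ask sources that can answer
                results.extend(source.query(request))
        return results

# Two toy sources: a live sensor feed and an archival database.
catalog = FederatedCatalog()
catalog.register(Source("room_sensors", {"temperature", "live"},
                        lambda req: [{"source": "room_sensors", "temperature": 22.4}]))
catalog.register(Source("archive_db", {"temperature", "history"},
                        lambda req: [{"source": "archive_db", "temperature": 21.1}]))

print(catalog.query({"capabilities": ["temperature", "live"]}))
```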
XAI in the Context of Predictive Process Monitoring: An Empirical Analysis Framework
Predictive Process Monitoring (PPM) has been integrated into process mining use cases as a value-adding task. PPM provides useful predictions on the future of running business processes with respect to different perspectives, such as the upcoming activities to be executed next, the final execution outcome, and performance indicators. In the context of PPM, Machine Learning (ML) techniques are widely employed. In order to gain the trust of stakeholders regarding the reliability of PPM predictions, eXplainable Artificial Intelligence (XAI) methods have been increasingly used to compensate for the lack of transparency of most predictive models. Multiple XAI methods exist that provide explanations for almost all types of ML models. However, for the same data, as well as under the same preprocessing settings or the same ML models, the generated explanations often vary significantly. Such variations might jeopardize the consistency and robustness of the explanations and, subsequently, the utility of the corresponding model and pipeline settings. This paper introduces a framework that enables the analysis of the impact PPM-related settings and ML-model-related choices may have on the characteristics and expressiveness of the generated explanations. Our framework provides a means to examine explanations generated either for the whole reasoning process of an ML model or for the predictions made on the future of a certain business process instance. Using well-defined experiments with different settings, we uncover how choices made throughout a PPM workflow affect, and can be reflected through, the generated explanations. The framework further provides the means to compare how different characteristics of explainability methods can shape the resulting explanations and reflect on the underlying model reasoning process.
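The kind of comparison such a framework enables can be sketched in a few lines; this is an illustrative stand-in, not the authors' framework. The same model is trained on two preprocessing variants of one dataset, a global explanation is derived for each (here, permutation importance serves as a simple proxy for an XAI method), and the agreement between the two resulting feature rankings is measured. The synthetic data and the rank-correlation measure are assumptions.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an encoded event log.
X, y = make_classification(n_samples=600, n_features=8, n_informative=4, random_state=0)

def global_explanation(X_variant, y):
    """Fit a model and return a global importance vector (proxy for an XAI method)."""
    model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_variant, y)
    imp = permutation_importance(model, X_variant, y, n_repeats=10, random_state=0)
    return imp.importances_mean

# Two pipeline variants: raw features vs. standardized features.
imp_raw = global_explanation(X, y)
imp_scaled = global_explanation(StandardScaler().fit_transform(X), y)

rho, _ = spearmanr(imp_raw, imp_scaled)
print(f"rank agreement between the two explanations: {rho:.2f}")
```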
Missing values imputation using Fuzzy K-Top Matching Value
Missing data arise when values of variables or whole observations are absent from a dataset; researchers then either exclude or impute the affected variables and records. This study proposes Fuzzy K-Top Matching Value (FKTM) for missing value imputation. It imputes missing numerical and categorical data with intelligent estimates based on similar records, decreasing bias. Expectation-maximization is used together with fuzzy clustering to find a group of similar records and estimate the missing values from them. We compare the FKTM-imputed datasets with the original Immunotherapy and Cryotherapy datasets. Multiple classification techniques are applied to the imputed datasets; Random Forest achieved the best accuracy, with 93.3% for Cryotherapy and 85.6% for Immunotherapy. The proposed approach is also compared with Multivariate Imputation by Chained Equations (MICE) using a Support Vector Machine, where it outperforms MICE with 82.2% accuracy. On the Cryotherapy dataset, the proposed approach surpasses existing strategies with 86.6% accuracy. Levene's and Shapiro-Wilk tests were used to examine the homoscedasticity and normality of the data after imputation, showing that the proposed imputation procedure has no detrimental influence on the dataset. Finally, the execution time and RMSE of imputed values are determined for three datasets with varied sample sizes and data dimensions. The proposed system exhibits a fast execution time and low RMSE. Overall, FKTM performs well in our experiments and looks promising.
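The core idea, estimating a missing entry from the most similar complete records weighted by fuzzy membership, can be illustrated with a small sketch. This is a simplified stand-in for FKTM, not the proposed algorithm; the distance measure, membership weighting, and k are assumptions.

```python
import numpy as np

def impute_fuzzy_top_k(data, row, col, k=3, fuzziness=2.0):
    """Impute data[row, col] from the k most similar rows that have a value in that column.

    Similarity ignores the missing column; each neighbour's contribution is weighted
    by a fuzzy membership derived from its distance (closer rows weigh more).
    """
    mask = ~np.isnan(data[:, col])
    candidates = data[mask]
    other_cols = [c for c in range(data.shape[1]) if c != col]

    target = data[row, other_cols]
    dists = np.linalg.norm(candidates[:, other_cols] - target, axis=1)
    top = np.argsort(dists)[:k]

    # Fuzzy memberships, softened by the fuzziness exponent, then normalized to sum to 1.
    weights = 1.0 / (dists[top] ** (2.0 / (fuzziness - 1.0)) + 1e-9)
    weights /= weights.sum()
    return float(np.dot(weights, candidates[top, col]))

# Toy numeric dataset with one missing value (np.nan).
data = np.array([
    [1.0, 2.0, 3.0],
    [1.1, 2.1, np.nan],
    [0.9, 1.9, 2.9],
    [5.0, 6.0, 7.0],
])
data[1, 2] = impute_fuzzy_top_k(data, row=1, col=2)
print(data[1])  # missing value replaced by an estimate close to ~3
```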
Why should I trust your explanation? An evaluation approach for XAI methods applied to predictive process monitoring results
As a use case of process mining, predictive process monitoring (PPM) aims to provide information on the future course of running business process instances. A large number of available PPM approaches adopt predictive models based on machine learning (ML). As the improved efficiency and accuracy of ML models usually come with increased complexity, their understandability becomes compromised. With the user at the center of attention, various eXplainable artificial intelligence (XAI) methods have emerged to provide users with explanations of the reasoning process of an ML model. With the growing interest in applying XAI methods to PPM results, various proposals have been made to evaluate explanations according to different criteria. In this article, we propose an approach to quantitatively evaluate XAI methods with respect to their ability to reflect the facts learned from the underlying stores of business-related data, i.e., event logs. Our approach includes procedures to extract features that are crucial for generating predictions. Moreover, it computes ratios that have proven useful for differentiating XAI methods. We conduct experiments that produce useful insights into the effects of the various choices made throughout a PPM workflow. We show that underlying data and model issues can be highlighted using the applied XAI methods. Furthermore, we can penalize and reward XAI methods for achieving certain levels of consistency with the facts learned about the underlying data. Our approach has been applied to different real-life event logs using different configurations of the PPM workflow.
Impact Statement: As ML models are used to generate predictions for running business process instances, the outcomes of these models should be justifiable to users. To achieve this, explanation methods are applied on top of ML models. However, explanations need to be evaluated with respect to the valuable information they convey about the predictive model and their ability to encode underlying data facts. In other words, an explainability method should be evaluated with respect to its ability to match model inputs to model outputs. Our approach provides a means to evaluate and compare explainability methods concerned with the global explainability of the entire reasoning process of an ML model. Based on experimental settings in which each step of the PPM workflow is changeable, we study the ability of our approach to evaluate different combinations of data, preprocessing configurations, models, and explanation methods. This allows an understanding of which PPM workflow configurations increase the ability of an explanation method to make the prediction process transparent to users.
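The kind of consistency ratio such an evaluation computes can be illustrated with a small stand-in: take features known to drive the target (here, the informative columns of a synthetic dataset), obtain an explanation-based ranking, and report which fraction of the truly relevant features appears among the top-ranked ones. The data, the proxy explanation method, and the ratio definition below are assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance

# Synthetic data where, by construction, the first 3 features carry the signal.
X, y = make_classification(n_samples=500, n_features=10, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=1)
relevant = {0, 1, 2}

model = GradientBoostingClassifier(random_state=1).fit(X, y)
imp = permutation_importance(model, X, y, n_repeats=10, random_state=1).importances_mean

# Fraction of the known-relevant features recovered among the top-ranked features.
top_k = set(np.argsort(imp)[::-1][:len(relevant)])
consistency_ratio = len(top_k & relevant) / len(relevant)
print(f"explanation/data consistency ratio: {consistency_ratio:.2f}")
```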
Explainability of Predictive Process Monitoring Results: Can You See My Data Issues?
Predictive process monitoring (PPM) has been discussed as a use case of process mining for several years. PPM enables foreseeing the future of an ongoing business process by predicting, for example, relevant information on the way in which running processes terminate or on related process performance indicators. A large share of PPM approaches adopt Machine Learning (ML), taking advantage of the accuracy and precision of ML models. Consequently, PPM inherits the challenges of traditional ML approaches. One of these challenges concerns the need to gain user trust in the generated predictions. This issue is addressed by explainable artificial intelligence (XAI). However, in addition to ML characteristics, the choices made and the techniques applied in the context of PPM influence the resulting explanations. This necessitates a study of the effects that the different choices made in a PPM task have on the explainability of the generated predictions. To address this gap, we systematically investigate the effects of different PPM settings on the data fed into an ML model and subsequently into the employed XAI method. We study how differences between the resulting explanations indicate several issues in the underlying data, such as collinearity and high dimensionality of the input data. We construct a framework for performing a series of experiments to examine different choices of PPM dimensions (i.e., event logs, preprocessing configurations, and ML models), integrating XAI as a fundamental component. In addition to agreements, the experiments highlight several inconsistencies between data characteristics and important predictors used by the ML model on the one hand, and explanations of the predictions of the investigated ML model on the other.
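How collinearity in the encoded input can surface in, or be masked by, explanations may be sketched as follows: add a near-duplicate of one feature, check the pairwise correlation, and observe how the importance credit gets split between the two correlated columns. The synthetic data and the impurity-based importances used as a proxy for a global explanation are assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
# shuffle=False keeps the informative features in the first columns.
X, y = make_regression(n_samples=400, n_features=5, n_informative=3,
                       shuffle=False, random_state=2)

# Introduce collinearity: feature 5 is a noisy copy of feature 0.
X = np.column_stack([X, X[:, 0] + rng.normal(0, 0.01, size=len(X))])
print("corr(feature 0, feature 5) =", round(np.corrcoef(X[:, 0], X[:, 5])[0, 1], 3))

model = RandomForestRegressor(n_estimators=200, random_state=2).fit(X, y)
# Impurity-based importances (proxy for a global explanation): the credit for the
# underlying signal is shared between the two collinear columns.
print("importances:", np.round(model.feature_importances_, 3))
```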
Probabilistic Forecasting for Oil Producing Wells Using Seq2seq Augmented Model
Time series forecasting is a challenging problem in the field of data mining. Deterministic forecasting has shown limitations in this field; therefore, researchers are now more inclined towards probabilistic forecasting, which has shown a clear advantage by providing more reliable models. In this paper, we utilize seq2seq machine learning models to estimate prediction intervals (PIs) for a large oil production dataset. To evaluate the proposed models, the Prediction Interval Coverage Probability (PICP), Prediction Interval Normalized Average Width (PINAW), and Coverage Width-based Criterion (CWC) metrics are used. Our results show that the proposed model can reliably estimate PIs for production forecasting.
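The first two evaluation metrics have simple closed forms that can be sketched directly: PICP is the fraction of observations falling inside their predicted interval, and PINAW is the average interval width normalized by the range of the target. CWC combines both, but its penalty form and parameters vary across papers, so only PICP and PINAW are shown; the production values and intervals below are made up for illustration.

```python
import numpy as np

def picp(y, lower, upper):
    """Prediction Interval Coverage Probability: share of y inside [lower, upper]."""
    return float(np.mean((y >= lower) & (y <= upper)))

def pinaw(y, lower, upper):
    """Prediction Interval Normalized Average Width: mean interval width / range of y."""
    return float(np.mean(upper - lower) / (np.max(y) - np.min(y)))

# Made-up production values and predicted intervals.
y     = np.array([100.0, 120.0, 90.0, 110.0,  95.0])
lower = np.array([ 92.0, 105.0, 80.0, 100.0,  97.0])
upper = np.array([108.0, 135.0, 98.0, 125.0, 112.0])

print("PICP :", picp(y, lower, upper))    # 0.8 -> one observation falls outside its interval
print("PINAW:", round(pinaw(y, lower, upper), 3))  # average width relative to the observed range
```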