Exploring Interpretability for Predictive Process Analytics

Author: C Rudin
I Teinemaa
I Verenich
J Evermann
J Rehse
JH Friedman
R Guidotti
ZC Lipton
Publication date: 01/01/2020

Modern predictive analytics underpinned by machine learning techniques has become a key enabler of the automation of data-driven decision making. In the context of business process management, predictive analytics has been applied to making predictions about the future state of an ongoing business process instance, for example, when the process instance will complete and what the outcome will be upon completion. Machine learning models can be trained on event log data recording historical process execution to build the underlying predictive models. Multiple techniques have been proposed so far which encode the information available in an event log and construct the input features required to train a predictive model. While accuracy has been a dominant criterion in the choice of various techniques, they are often applied as black boxes in building predictive models. In this paper, we derive explanations using interpretable machine learning techniques to compare and contrast the suitability of multiple predictive models of high accuracy. The explanations allow us to gain an understanding of the underlying reasons for a prediction and highlight scenarios where accuracy alone may not be sufficient in assessing the suitability of techniques used to encode event log data into features used by a predictive model. Findings from this study motivate the need for, and the importance of, incorporating interpretability in predictive process analytics.

Comment: 15 pages, 7 figures
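To make the idea of encoding an event log into input features concrete, here is a minimal sketch of one simple possibility: counting how often each activity occurs in the first few events of a case. The abstract does not say which encodings the paper compares, so the toy log, the function name `frequency_encode`, and the `prefix_length` parameter are all illustrative assumptions rather than the authors' method.

```python
from collections import Counter

# Hypothetical toy event log: (case_id, activity) pairs in execution order.
EVENT_LOG = [
    ("c1", "register"), ("c1", "check"), ("c1", "check"), ("c1", "approve"),
    ("c2", "register"), ("c2", "reject"),
]


def frequency_encode(events, prefix_length):
    """Encode each case as activity counts over its first `prefix_length` events."""
    # Collect the prefix (the first few activities) of every case.
    prefixes = {}
    for case_id, activity in events:
        trace = prefixes.setdefault(case_id, [])
        if len(trace) < prefix_length:
            trace.append(activity)

    # A fixed, sorted vocabulary gives every case a vector of the same length.
    vocabulary = sorted({activity for _, activity in events})

    features = {}
    for case_id, trace in prefixes.items():
        counts = Counter(trace)
        features[case_id] = [counts[activity] for activity in vocabulary]
    return vocabulary, features


vocabulary, features = frequency_encode(EVENT_LOG, prefix_length=3)
print(vocabulary)  # column order of the feature vectors
print(features)    # one count vector per case
```

An encoding like this discards the order of events inside the prefix, which is one reason two models with similar accuracy may rely on quite different evidence.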

arXiv.org e-Print Archive

PMLB: A Large Benchmark Suite for Machine Learning Evaluation and Comparison

Author: La Cava William
Moore Jason H.
Olson Randal S.
Orzechowski Patryk
Urbanowicz Ryan J.
Publication date: 01/03/2017

The selection, development, or comparison of machine learning methods in data mining can be a difficult task based on the target problem and goals of a particular study. Numerous publicly available real-world and simulated benchmark datasets have emerged from different sources, but their organization and adoption as standards have been inconsistent. As such, selecting and curating specific benchmarks remains an unnecessary burden on machine learning practitioners and data scientists. The present study introduces an accessible, curated, and developing public benchmark resource to facilitate identification of the strengths and weaknesses of different machine learning methodologies. We compare meta-features among the current set of benchmark datasets in this resource to characterize the diversity of available data. Finally, we apply a number of established machine learning methods to the entire benchmark suite and analyze how datasets and algorithms cluster in terms of performance. This work is an important first step towards understanding the limitations of popular benchmarking suites and developing a resource that connects existing benchmarking standards to more diverse and efficient standards in the future.

Comment: 14 pages, 5 figures, submitted for review to JML

arXiv.org e-Print Archive

Directory of Open Access Journals

Expected public and private benefits of embedding farm business performance systems in the Australian and New Zealand dairy industries

Author: Ronan Glenn

Dairy industry organizations, universities and government agencies are variously involved in embedding web-based, standardized farm business performance systems in the Australian and New Zealand industries. The spectrum of involvement prompts an exploration of demand drivers and expectations of benefits, public and private. Inclusion of South Australian dairy businesses in a web data system as part of implementing the South Australian dairy industry strategic plan is discussed as an example where public and private benefits are expected. To the extent that adoption of the web as a data management platform is an aid to dialogue in the public-private partnership of industry development, more detailed research about the systems and their benefits to stakeholders is merited.

Farm Management,

On predictability of rare events leveraging social media: a machine learning perspective

Author: Bakliwal A.
Gayo-Avello D.
Go A.
Saif H.
Tumasjan A.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/02/2015

Information extracted from social media streams has been leveraged to forecast the outcome of a large number of real-world events, from political elections to stock market fluctuations. An increasing number of studies demonstrate how the analysis of social media conversations provides cheap access to the wisdom of the crowd. However, the extent to which, and the contexts in which, such forecasting power can be effectively leveraged are still unverified, at least in a systematic way. It is also unclear how social-media-based predictions compare to those based on alternative information sources. To address these issues, here we develop a machine learning framework that leverages social media streams to automatically identify and predict the outcomes of soccer matches. We focus in particular on matches in which at least one of the possible outcomes is deemed highly unlikely by professional bookmakers. We argue that sport events offer a systematic approach for testing the predictive power of social media, and allow such power to be compared against the rigorous baselines set by external sources. Despite such strict baselines, our framework yields above 8% marginal profit when used to inform simple betting strategies. The system is based on real-time sentiment analysis and exploits data collected immediately before the games, allowing for informed bets. We discuss the rationale behind our approach, describe the learning framework, its prediction performance and the return it provides as compared to a set of betting strategies. To test our framework we use both historical Twitter data from the 2014 FIFA World Cup games, and real-time Twitter data collected by monitoring the conversations about all soccer matches of four major European tournaments (FA Premier League, Serie A, La Liga, and Bundesliga), and the 2014 UEFA Champions League, during the period between Oct. 25th 2014 and Nov. 26th 2014.

Comment: 10 pages, 10 tables, 8 figures
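As a rough illustration of how the return of a simple betting strategy might be measured, the sketch below computes net profit per unit staked for flat-stake bets at decimal odds. The abstract does not define "marginal profit" or describe the strategies used, so this definition, the function name `flat_stake_return`, and the sample bets are assumptions made purely for illustration; the figures are invented and are not results from the paper.

```python
def flat_stake_return(bets, stake=1.0):
    """Return net profit divided by the total amount staked.

    `bets` is an iterable of (decimal_odds, won) pairs. A winning bet pays
    stake * (decimal_odds - 1) in profit; a losing bet forfeits the stake.
    """
    total_staked = 0.0
    net_profit = 0.0
    for decimal_odds, won in bets:
        total_staked += stake
        if won:
            net_profit += stake * (decimal_odds - 1.0)
        else:
            net_profit -= stake
    # Guard against an empty list of bets.
    return net_profit / total_staked if total_staked else 0.0


# Hypothetical bets on unlikely outcomes: long odds, few wins.
BETS = [(5.0, True), (4.0, False), (6.0, False), (3.5, False)]
print(f"return per unit staked: {flat_stake_return(BETS):.2%}")
```

With long odds, a single correct call can offset several losses, which is why matches with an outcome bookmakers consider unlikely are a demanding test for any predictor.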

arXiv.org e-Print Archive

Care for the Mind Amid Chronic Diseases: An Interpretable AI Approach Using IoT

Author: Fang Xiao
Liu Xiang
Xie Jiaheng
Zhao Xiaohang
Publication date: 08/11/2022

Health sensing for chronic disease management creates immense benefits for social welfare. Existing health sensing studies primarily focus on the prediction of physical chronic diseases. Depression, a widespread complication of chronic diseases, is however understudied. We draw on the medical literature to support depression prediction using motion sensor data. To involve human expertise in the decision-making, safeguard trust for this high-stakes prediction, and ensure algorithm transparency, we develop an interpretable deep learning model: Temporal Prototype Network (TempPNet). TempPNet is built upon the emergent prototype learning models. To accommodate the temporal characteristic of sensor data and the progressive property of depression, TempPNet differs from existing prototype learning models in its capability of capturing the temporal progression of depression. Extensive empirical analyses using real-world motion sensor data show that TempPNet outperforms state-of-the-art benchmarks in depression prediction. Moreover, TempPNet interprets its predictions by visualizing the temporal progression of depression and its corresponding symptoms detected from sensor data. We further conduct a user study to demonstrate its superiority over the benchmarks in interpretability. This study offers an algorithmic solution for impactful social good: collaborative care of chronic diseases and depression in health sensing. Methodologically, it contributes to the extant literature with a novel interpretable deep learning model for depression prediction from sensor data. Patients, doctors, and caregivers can deploy our model on mobile devices to monitor patients' depression risks in real time. Our model's interpretability also allows human experts to participate in the decision-making by reviewing the interpretation of prediction outcomes and making informed interventions.

Comment: 39 pages, 12 figures

arXiv.org e-Print Archive

Evaluation of crime prevention initiatives

Author: Klima Noel
O'Duill Barra
Vanhauwaert Rosita
Wijckmans Belinda
Publication venue: European Crime Prevention Network
Publication date: 01/01/2013

This third toolbox in the series published by the EUCPN Secretariat focuses on the main theme of the Irish Presidency, which is the evaluation of crime prevention initiatives. The theme is explored and elaborated in various ways through: a literature review; two workshops with international experts and practitioners during which the strengths and weaknesses of programme evaluation were discussed in detail; a screening of existing guidelines and manuals on evaluation; and finally, a call which was launched by the EUCPN Secretariat to the Member States to collect some practices on the evaluation of crime prevention initiatives.