Ontology-Guided Principal Component Analysis: Reaching the limits of the doctor-in-the-loop

Author: Girardi Dominic
Holzinger Andreas
Kleiser Raimund
Trenkler Johannes
Wartner Sandra
Wiesinger-Widi Manuela
Publication date: 01/01/2016

Assembling Algorithmic Decision-Making under Uncertainty: The Case of 'Edge Cases' in an Open Data Environment

Author: Grønsund Tor
Publication venue: AIS Electronic Library (AISeL)
Publication date: 04/01/2021

Algorithmic decision-making is rapidly evolving as a source of data-driven competitive advantage with important implications for analytical practices in multiple settings. Despite the ambitions for algorithmic and intelligent technologies, however, the requirement for quality data input to the algorithm poses a significant challenge for its actual adoption. The trend towards open data might bring additional challenges such as strategic gaming and distortion of meaning. To address this problem, we draw on a two-year-long qualitative case study of a firm in international maritime trade to understand the role of uncertainty associated with open data upon the uptake of a novel algorithm. We combine an uncertainty and assemblage perspective to unpack the arrangements by which the organization configures relations of humans and machines to mitigate this problem. We highlight the phenomenon of edge cases as a key challenge for automation and propose that an assemblage of augmentation and automation allows a dynamic arrangement that supports the introduction and organization of algorithmic decision-making under uncertainty.

AIS Electronic Library (AISeL)

Enabling the human in the loop: Linked data and knowledge in industrial cyber-physical systems

Author: Bertoncelj Luka
Emmanouilidis Christos
Fournaris Apostolos
Katsouros Vassilis
Koulamas Christos
Pistofidis Petros
Ruiz-Carcel Cristobal
Publication venue: Elsevier BV
Publication date: 19/03/2019

Industrial Cyber-Physical Systems have benefitted substantially from the introduction of a range of technology enablers. These include web-based and semantic computing, ubiquitous sensing, internet of things (IoT) with multi-connectivity, advanced computing architectures and digital platforms, coupled with edge or cloud side data management and analytics, and have contributed to shaping up enhanced or new data value chains in manufacturing. While parts of such data flows are increasingly automated, there is now a greater demand for more effectively integrating, rather than eliminating, human cognitive capabilities in the loop of production-related processes. Human integration in Cyber-Physical environments can already be digitally supported in various ways. However, incorporating human skills and tangible knowledge requires approaches and technological solutions that facilitate the engagement of personnel within technical systems in ways that take advantage of or amplify their cognitive capabilities to achieve more effective sociotechnical systems. After analysing related research, this paper introduces a novel viewpoint for enabling human-in-the-loop engagement linked to cognitive capabilities and highlighting the role of context information management in industrial systems. Furthermore, it presents examples of technology enablers for placing the human in the loop at selected application cases relevant to production environments. Such placement benefits from the joint management of linked maintenance data and knowledge, expands the power of machine learning for asset awareness with embedded event detection, and facilitates IoT-driven analytics for product lifecycle management.

Cranfield CERES

Speculative Approximations for Terascale Analytics

Author: Qin Chengjie
Rusu Florin
Publication date: 31/12/2014

Model calibration is a major challenge faced by the plethora of statistical analytics packages that are increasingly used in Big Data applications. Identifying the optimal model parameters is a time-consuming process that has to be executed from scratch for every dataset/model combination even by experienced data scientists. We argue that the incapacity to evaluate multiple parameter configurations simultaneously and the lack of support to quickly identify sub-optimal configurations are the principal causes. In this paper, we develop two database-inspired techniques for efficient model calibration. Speculative parameter testing applies advanced parallel multi-query processing methods to evaluate several configurations concurrently. The number of configurations is determined adaptively at runtime, while the configurations themselves are extracted from a distribution that is continuously learned following a Bayesian process. Online aggregation is applied to identify sub-optimal configurations early in the processing by incrementally sampling the training dataset and estimating the objective function corresponding to each configuration. We design concurrent online aggregation estimators and define halting conditions to accurately and timely stop the execution. We apply the proposed techniques to distributed gradient descent optimization -- batch and incremental -- for support vector machines and logistic regression models. We implement the resulting solutions in GLADE PF-OLA -- a state-of-the-art Big Data analytics system -- and evaluate their performance over terascale-size synthetic and real datasets. The results confirm that as many as 32 configurations can be evaluated concurrently almost as fast as one, while sub-optimal configurations are detected accurately in as little as a $1/20^{\text{th}}$ fraction of the time.
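The idea of halting sub-optimal configurations early can be illustrated with a minimal Python sketch. It is not the GLADE PF-OLA implementation and not the authors' estimators: it assumes a hinge loss for a linear model, a plain normal-approximation confidence interval, and hypothetical names (`hinge_loss`, `prune_configs`, `batch`, `z`). All surviving configurations are evaluated on the same incrementally read chunk, and a configuration is dropped once its interval lies entirely above that of the current best.

```python
import math
import random


def hinge_loss(w, x, y):
    """Hinge loss of a linear model w on one example (x, y), with y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    return max(0.0, 1.0 - margin)


def prune_configs(configs, data, batch=50, z=1.96):
    """Estimate each configuration's mean loss from a growing sample.

    A configuration is dropped as soon as the lower bound of its confidence
    interval exceeds the smallest upper bound among surviving configurations.
    Returns the surviving names and the per-configuration running statistics.
    """
    stats = {name: [0, 0.0, 0.0] for name in configs}  # count, sum, sum of squares
    alive = set(configs)
    for start in range(0, len(data), batch):
        chunk = data[start:start + batch]
        # Every surviving configuration is evaluated on the same chunk.
        for name in alive:
            s = stats[name]
            for x, y in chunk:
                loss = hinge_loss(configs[name], x, y)
                s[0] += 1
                s[1] += loss
                s[2] += loss * loss
        # Normal-approximation interval around each running mean.
        bounds = {}
        for name in alive:
            n, total, squares = stats[name]
            mean = total / n
            variance = max(squares / n - mean * mean, 0.0)
            half_width = z * math.sqrt(variance / n)
            bounds[name] = (mean - half_width, mean + half_width)
        best_upper = min(upper for _, upper in bounds.values())
        alive = {name for name in alive if bounds[name][0] <= best_upper}
        if len(alive) == 1:
            break  # halting condition: a single configuration remains
    return alive, stats


# Illustrative usage on synthetic data (all values are made up).
random.seed(0)
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(1000)]
examples = [(p, 1 if p[0] > p[1] else -1) for p in points]
survivors, running = prune_configs(
    {"good": [2.0, -2.0], "bad": [-2.0, 2.0]}, examples
)
```

In this toy setting the poorly chosen weight vector is typically discarded after reading only a small prefix of the data, which is the effect the abstract describes at much larger scale.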

arXiv.org e-Print Archive

eScholarship - University of California

An interactive human centered data science approach towards crime pattern analysis

Author: Qazi N.
Wong B.
Publication venue: Elsevier
Publication date: 01/01/2019

Traditional machine learning systems lack a pathway for humans to integrate their domain knowledge into the underlying machine learning algorithms. The utilization of such systems, for domains where decisions can have serious consequences (e.g. medical decision-making and crime analysis), requires the incorporation of human experts' domain knowledge. The challenge, however, is how to effectively incorporate domain expert knowledge with machine learning algorithms to develop effective models for better decision making. In crime analysis, the key challenge is to identify plausible linkages in unstructured crime reports for hypothesis formulation. Crime analysts painstakingly perform time-consuming searches of many different structured and unstructured databases to collate these associations without any proper visualization. To tackle these challenges and facilitate crime analysis, in this paper we examine unstructured crime reports through text mining to extract plausible associations. Specifically, we present an associative-questioning-based search model to elicit multi-level associations among crime entities. We couple this model with partition clustering to develop an interactive, human-assisted knowledge discovery and data mining scheme. The proposed human-centered knowledge discovery and data mining scheme for crime text mining is able to extract plausible associations between crimes, identify crime patterns, group similar crimes, and elicit a co-offender network and suspect list based on spatial-temporal and behavioral similarity. These similarities are quantified by calculating cosine, Jaccard, and Euclidean distances. Additionally, each suspect is also ranked by a similarity score in the plausible suspect list. These associations are then visualized by creating a two-dimensional re-configurable crime cluster space along with a bipartite knowledge graph.
The proposed scheme also examines the grand challenge of integrating effective human interaction with the machine learning algorithms through a visualization feedback loop. It allows analysts to feed in their domain knowledge, including the choice of similarity functions for identifying associations, dynamic feature selection for interactive clustering of crimes, and the assignment of weights to each component of the crime pattern to rank suspects for an unsolved crime. We demonstrate the proposed scheme through a case study using an anonymized burglary dataset. The scheme is found to facilitate human reasoning and analytic discourse for intelligence analysis.
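As a rough illustration of the similarity measures named in this abstract, the Python sketch below computes cosine similarity, Jaccard similarity, and Euclidean distance, then combines two of them into an analyst-weighted suspect score. The record layout (`location`, `mo`), the weight keys, and the `rank_suspects` helper are hypothetical and chosen for illustration only; the abstract does not specify how the authors combine the measures.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard_similarity(a, b):
    """Size of the intersection over size of the union of two collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_suspects(crime, suspects, weights):
    """Rank suspects by a weighted sum of component similarities.

    Assumed layout (hypothetical): each record has a 2-D "location" and a set
    of behavioural tags under "mo". Distance is mapped to a similarity in
    (0, 1] so that both components point in the same direction.
    """
    scored = []
    for name, record in suspects.items():
        spatial = 1.0 / (1.0 + euclidean_distance(crime["location"], record["location"]))
        behaviour = jaccard_similarity(crime["mo"], record["mo"])
        score = weights["spatial"] * spatial + weights["mo"] * behaviour
        scored.append((name, score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Illustrative usage with made-up records and analyst-chosen weights.
unsolved = {"location": (0.0, 0.0), "mo": {"night", "window"}}
candidates = {
    "S1": {"location": (3.0, 4.0), "mo": {"night", "window"}},
    "S2": {"location": (0.0, 0.0), "mo": {"day"}},
}
ranking = rank_suspects(unsolved, candidates, {"spatial": 0.5, "mo": 0.5})
```

Changing the weights re-orders the list, which loosely mirrors the feedback loop in which the analyst adjusts the importance of each crime-pattern component.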

Middlesex University Research Repository

Welcome to Sigmod 2019 - The 2019 ACM SIGMOD International Conference on the Management of Data!

Author: Ailamaki A. (Anastasia)
Boncz P.A. (Peter)
Manegold S. (Stefan)
Publication date: 30/06/2019

CWI's Institutional Repository