    Interactive Decision making using Dissimilarity to visually represented Prototypes

    To make informed decisions, an expert has to reason with multidimensional, heterogeneous data and the results of their analysis. Items in such datasets are typically represented by features. However, as argued in cognitive science, features do not yield an optimal space for human reasoning. In fact, humans tend to organize complex information in terms of prototypes or known cases rather than in absolute terms. When confronted with unknown data items, humans assess them in terms of similarity to these prototypical elements. Interestingly, an analogous similarity-to-prototype approach, where prototypes are taken from the data, has been successfully applied in machine learning. Combining such a machine learning approach with human prototypical reasoning in a Visual Analytics context requires integrating similarity-based classification with interactive visualizations. To that end, the data prototypes should be visually represented to trigger direct associations to cases familiar to the domain experts. In this paper, we propose a set of highly interactive visualizations to explore data and classification results in terms of dissimilarities to visually represented prototypes. We argue that this approach not only supports human reasoning processes, but is also suitable to enhance understanding of heterogeneous data. The proposed framework is applied to a risk assessment case study in Forensic Psychiatry.

    ProtoExplorer: Interpretable Forensic Analysis of Deepfake Videos using Prototype Exploration and Refinement

    In high-stakes settings, Machine Learning models that can provide predictions that are interpretable for humans are crucial. This is even more true with the advent of complex deep-learning-based models with a huge number of tunable parameters. Recently, prototype-based methods have emerged as a promising approach to make deep learning interpretable. We particularly focus on the analysis of deepfake videos in a forensics context. Although prototype-based methods have been introduced for the detection of deepfake videos, their use in real-world scenarios still presents major challenges, in that prototypes tend to be overly similar and interpretability varies between prototypes. This paper proposes a Visual Analytics process model for prototype learning and, based on this, presents ProtoExplorer, a Visual Analytics system for the exploration and refinement of prototype-based deepfake detection models. ProtoExplorer offers tools for visualizing and temporally filtering prototype-based predictions when working with video data. It disentangles the complexity of working with spatio-temporal prototypes, facilitating their visualization. It further enables the refinement of models by interactively deleting and replacing prototypes, with the aim of achieving more interpretable and less biased predictions while preserving detection accuracy. The system was designed with forensic experts and evaluated in a number of rounds based on both open-ended think-aloud evaluation and interviews. These sessions confirmed the strength of our prototype-based exploration of deepfake videos while providing the feedback needed to continuously improve the system.
    Comment: 15 pages, 6 figures

    Human-assisted self-supervised labeling of large data sets

    There is a severe demand for, and shortage of, large accurately labeled datasets to train supervised computational intelligence (CI) algorithms in domains like unmanned aerial systems (UAS) and autonomous vehicles. This has hindered our ability to develop and deploy various computer vision algorithms in and across environments and niche domains for tasks like detection, localization, and tracking. Herein, I propose a new human-in-the-loop (HITL) based growing neural gas (GNG) algorithm to minimize human intervention during the labeling of large UAS data collections over a shared geospatial area. Specifically, I address human-driven events like new class identification and mistake correction. I also address algorithm-centric operations like new pattern discovery and self-supervised labeling. Pattern discovery and identification through self-supervised labeling is made possible through open set recognition (OSR). Herein, I propose a classifier with the ability to say "I don't know" to identify outliers in the data and bootstrap deep learning (DL) models, specifically convolutional neural networks (CNNs), with the ability to classify on N+1 classes. The effectiveness of the algorithms is demonstrated using simulated realistic ray-traced low-altitude UAS data from the Unreal Engine. The results show that it is possible to increase speed and reduce mental fatigue compared with hand labeling large image datasets.
    Includes bibliographical references.

    Visual analytics and rendering for tunnel crack analysis

    Social Identity Enactment Through Linguistic Style: Using Naturally Occurring Online Data to Study Behavioural Prototypicality

    Social identity prototypes refer to the quintessential representation of a particular social identity; prototypes define and prescribe the characteristics, behaviours and attitudes of a particular group, as distinguished from other groups (Hogg, 2001). For the most part, identity prototypicality is studied using self-reported methods used to assess perceptions of the prototypicality of self and others. However, in this thesis we provide behavioural evidence to demonstrate how linguistic style data can be used to measure identity-prototypical behaviour in real world contexts. Combining naturally-occurring online data with experimental data, the first chapter demonstrates that individuals behave in an identity-prototypical way regardless of the context in which they are communicating. Further, we show that this identity-prototypical style of communication is robust to topic, demographics, personality and platform, and moreover that the same identity-prototypical communication style can be detected in experimentally controlled conditions. In the second chapter, we demonstrate the small but statistically significant link between identity-prototypical communication and influence in real-world forum data. This finding provides insight into how group members respond to other ingroup members based on their prototypical communication style in real-world situations. Finally, in the third chapter, we use the group prototypical behaviour observed in naturally occurring online forum data to construct a typology of social identities, demonstrating the existence of five different types of social identity in line with the research of Deaux et al. (1995). We also demonstrate that it is possible to use this measurement of behavioural prototypicality to observe identity change over time. 
Using eight years’ worth of forum data, we illustrate the slow movement of the transgender identity from a stigmatised identity in 2012 towards a collective action identity in 2019. In sum, the findings outlined in this thesis provide evidence to support the idea that it is possible to use machine learning algorithms and naturally occurring online data to study behavioural prototypicality in real-world environments. Moreover, this methodology enables us to study identities ‘in the wild’, thus transcending the limitations associated with using self-reported methodologies or experimental approaches to study how individuals express and enact their group memberships. Further, we also demonstrate the value in using naturally occurring online behavioural data to test and extend the key components of social identity theory.
Engineering and Physical Sciences Research Council (EPSRC)

    Interpretable Models Capable of Handling Systematic Missingness in Imbalanced Classes and Heterogeneous Datasets

    Application of interpretable machine learning techniques to medical datasets facilitates early and fast diagnoses, along with deeper insight into the data. Furthermore, the transparency of these models increases trust among application domain experts. Medical datasets face common issues such as heterogeneous measurements, imbalanced classes with limited sample size, and missing data, which hinder the straightforward application of machine learning techniques. In this paper we present a family of prototype-based (PB) interpretable models which are capable of handling these issues. The models introduced in this contribution show comparable or superior performance to alternative techniques applicable in such situations. However, unlike ensemble-based models, which have to compromise on easy interpretation, the PB models here do not. Moreover, we propose a strategy of harnessing the power of ensembles while maintaining the intrinsic interpretability of the PB models, by averaging the model parameter manifolds. All the models were evaluated on a synthetic (publicly available) dataset, in addition to detailed analyses of two real-world medical datasets (one publicly available). Results indicated that the models and strategies we introduced addressed the challenges of real-world medical data, while remaining computationally inexpensive and transparent, as well as similar or superior in performance compared to their alternatives.