25 research outputs found

    Human-AI Interaction in the Presence of Ambiguity: From Deliberation-based Labeling to Ambiguity-aware AI

    Get PDF
    Ambiguity, the quality of being open to more than one interpretation, permeates our lives. It comes in different forms including linguistic and visual ambiguity, arises for various reasons and gives rise to disagreements among human observers that can be hard or impossible to resolve. As artificial intelligence (AI) is increasingly infused into complex domains of human decision making it is crucial that the underlying AI mechanisms also support a notion of ambiguity. Yet, existing AI approaches typically assume that there is a single correct answer for any given input, lacking mechanisms to incorporate diverse human perspectives in various parts of the AI pipeline, including data labeling, model development and user interface design. This dissertation aims to shed light on the question of how humans and AI can be effective partners in the presence of ambiguous problems. To address this question, we begin by studying group deliberation as a tool to detect and analyze ambiguous cases in data labeling. We present three case studies that investigate group deliberation in the context of different labeling tasks, data modalities and types of human labeling expertise. First, we present CrowdDeliberation, an online platform for synchronous group deliberation in novice crowd work, and show how worker deliberation affects resolvability and accuracy in text classification tasks of varying subjectivity. We then translate our findings to the expert domain of medical image classification to demonstrate how imposing additional structure on deliberation arguments can improve the efficiency of the deliberation process without compromising its reliability. Finally, we present CrowdEEG, an online platform for collaborative annotation and deliberation of medical time series data, implementing an asynchronous and highly structured deliberation process. Our findings from an observational study with 36 sleep health professionals help explain how disagreements arise and when they can be resolved through group deliberation. Beyond investigating group deliberation within data labeling, we also demonstrate how the resulting deliberation data can be used to support both human and artificial intelligence. To this end, we first present results from a controlled experiment with ten medical generalists, suggesting that reading deliberation data from medical specialists significantly improves generalists' comprehension and diagnostic accuracy on difficult patient cases. Second, we leverage deliberation data to simulate and investigate AI assistants that not only highlight ambiguous cases, but also explain the underlying sources of ambiguity to end users in human-interpretable terms. We provide evidence suggesting that this form of ambiguity-aware AI can help end users to triage and trust AI-provided data classifications. We conclude by outlining the main contributions of this dissertation and directions for future research

    Game Theory Solutions in Sensor-Based Human Activity Recognition: A Review

    Full text link
    The Human Activity Recognition (HAR) tasks automatically identify human activities using the sensor data, which has numerous applications in healthcare, sports, security, and human-computer interaction. Despite significant advances in HAR, critical challenges still exist. Game theory has emerged as a promising solution to address these challenges in machine learning problems including HAR. However, there is a lack of research work on applying game theory solutions to the HAR problems. This review paper explores the potential of game theory as a solution for HAR tasks, and bridges the gap between game theory and HAR research work by suggesting novel game-theoretic approaches for HAR problems. The contributions of this work include exploring how game theory can improve the accuracy and robustness of HAR models, investigating how game-theoretic concepts can optimize recognition algorithms, and discussing the game-theoretic approaches against the existing HAR methods. The objective is to provide insights into the potential of game theory as a solution for sensor-based HAR, and contribute to develop a more accurate and efficient recognition system in the future research directions

    Intelligent Vehicle Development through Scalable Data Collection Processes and Simulation

    Get PDF
    With current automotive trends in both vehicle electrification and intelligent features such as Advanced Driver-Assistance Systems (ADAS), there is a significant need for a modern vehicle development process which makes use of big data. In the following report, a scalable, phone-based, driving data collection system is developed and applied to powertrain design through a motivating example. Initial project efforts are directed towards the development of both a data collection platform and a system which is capable of interpreting and storing the collected drive data. The developed UWAFT Innovation Platform (UIP) and Monocular Vision Pipeline (MVP) are a functional system which attempt to precipitate crowdsourcing of data collection through a low system cost and open software approach. In an application of this platform data is collected by a test driver for a month in the form of a pilot project, with results evaluated in terms of geographical coverage and with the development of a statistical event profile detailing events of simulation value. The data collected contains over 6 million data points, and over 7.45hrs of driving. In evaluating MVP performance, the You Only Look Once (YOLO) multi-object detector and Markov Decision Process (MDP) multi-object tracker are implemented, with results demonstrating robustness to occlusions and the capability to detect far-away pedestrians and vehicles. With this data collection system functional, and the data from the pilot project experiment, a powertrain simulation environment for University of Waterloo Alternative Fuels Team (UWAFT) is developed. Given the Advanced Vehicle Technology Competition (AVTC) process, it is crucial to continue to explore and design novel powertrain configurations in an environment which is conducive to flexible configuration and with acceptable ease-of-use. Of the environments available, Simscape is selected and a novel Metal-Air Extended Range Electric Vehicle (MA-EREV) powertrain model is developed as a validation of the simulation tool. Upon validating simulated VTS against existing work, results are consistent excluding a 15% reduction in estimated range and a 41% decrease in 50-70 mph acceleration time. To provide an example of the data-driven approach, a winter-driving scenario where the pilot project driver demonstrated slipping is imported as a drive cycle in the MA-EREV model and simulated in an experiment. In analyzing results traction performance of the MA-EREV is evaluated. The MA-EREV weighs 677kg more than the pilot project vehicle, and has increased starting torque due to electrification. In analyzing the results of this scenario replication, the longitudinal slip on the tires reached a maximum of 41% slip (94% of available traction) during stopping and 84% slip (55% of available traction) during acceleration from stop, with more slipping overall during acceleration than stopping. This result indicates that the MA-EREV may need additional traction considerations for safe performance in winter conditions

    Learning from disagreement: a survey

    Get PDF
    Many tasks in Natural Language Processing (nlp) and Computer Vision (cv) offer evidence that humans disagree, from objective tasks such as part-of-speech tagging to more subjective tasks such as classifying an image or deciding whether a proposition follows from certain premises. While most learning in artificial intelligence (ai) still relies on the assumption that a single (gold) interpretation exists for each item, a growing body of research aims to develop learning methods that do not rely on this assumption. In this survey, we review the evidence for disagreements on nlp and cv tasks, focusing on tasks for which substantial datasets containing this information have been created. We discuss the most popular approaches to training models from datasets containing multiple judgments potentially in disagreement. We systematically compare these different approaches by training them with each of the available datasets, considering several ways to evaluate the resulting models. Finally, we discuss the results in depth, focusing on four key research questions, and assess how the type of evaluation and the characteristics of a dataset determine the answers to these questions. Our results suggest, first of all, that even if we abandon the assumption of a gold standard, it is still essential to reach a consensus on how to evaluate models. This is because the relative performance of the various training methods is critically affected by the chosen form of evaluation. Secondly, we observed a strong dataset effect. With substantial datasets, providing many judgments by high-quality coders for each item, training directly with soft labels achieved better results than training from aggregated or even gold labels. This result holds for both hard and soft evaluation. But when the above conditions do not hold, leveraging both gold and soft labels generally achieved the best results in the hard evaluation. All datasets and models employed in this paper are freely available as supplementary materials

    Biomedical Information Extraction Pipelines for Public Health in the Age of Deep Learning

    Get PDF
    abstract: Unstructured texts containing biomedical information from sources such as electronic health records, scientific literature, discussion forums, and social media offer an opportunity to extract information for a wide range of applications in biomedical informatics. Building scalable and efficient pipelines for natural language processing and extraction of biomedical information plays an important role in the implementation and adoption of applications in areas such as public health. Advancements in machine learning and deep learning techniques have enabled rapid development of such pipelines. This dissertation presents entity extraction pipelines for two public health applications: virus phylogeography and pharmacovigilance. For virus phylogeography, geographical locations are extracted from biomedical scientific texts for metadata enrichment in the GenBank database containing 2.9 million virus nucleotide sequences. For pharmacovigilance, tools are developed to extract adverse drug reactions from social media posts to open avenues for post-market drug surveillance from non-traditional sources. Across these pipelines, high variance is observed in extraction performance among the entities of interest while using state-of-the-art neural network architectures. To explain the variation, linguistic measures are proposed to serve as indicators for entity extraction performance and to provide deeper insight into the domain complexity and the challenges associated with entity extraction. For both the phylogeography and pharmacovigilance pipelines presented in this work the annotated datasets and applications are open source and freely available to the public to foster further research in public health.Dissertation/ThesisDoctoral Dissertation Biomedical Informatics 201

    Crowdsourced intuitive visual design feedback

    Get PDF
    For many people images are a medium preferable to text and yet, with the exception of star ratings, most formats for conventional computer mediated feedback focus on text. This thesis develops a new method of crowd feedback for designers based on images. Visual summaries are generated from a crowd’s feedback images chosen in response to a design. The summaries provide the designer with impressionistic and inspiring visual feedback. The thesis sets out the motivation for this new method, describes the development of perceptually organised image sets and a summarisation algorithm to implement it. Evaluation studies are reported which, through a mixed methods approach, provide evidence of the validity and potential of the new image-based feedback method. It is concluded that the visual feedback method would be more appealing than text for that section of the population who may be of a visual cognitive style. Indeed the evaluation studies are evidence that such users believe images are as good as text when communicating their emotional reaction about a design. Designer participants reported being inspired by the visual feedback where, comparably, they were not inspired by text. They also reported that the feedback can represent the perceived mood in their designs, and that they would be enthusiastic users of a service offering this new form of visual design feedback

    Selected Papers from the 5th International Electronic Conference on Sensors and Applications

    Get PDF
    This Special Issue comprises selected papers from the proceedings of the 5th International Electronic Conference on Sensors and Applications, held on 15–30 November 2018, on sciforum.net, an online platform for hosting scholarly e-conferences and discussion groups. In this 5th edition of the electronic conference, contributors were invited to provide papers and presentations from the field of sensors and applications at large, resulting in a wide variety of excellent submissions and topic areas. Papers which attracted the most interest on the web or that provided a particularly innovative contribution were selected for publication in this collection. These peer-reviewed papers are published with the aim of rapid and wide dissemination of research results, developments, and applications. We hope this conference series will grow rapidly in the future and become recognized as a new way and venue by which to (electronically) present new developments related to the field of sensors and their applications

    Design and semantics of form and movement:DeSForM 2010, November 3-5, 2010, Lucerne

    Get PDF
    corecore