14 research outputs found

    The unbearable (technical) unreliability of automated facial emotion recognition

    Emotion recognition, and in particular facial emotion recognition (FER), is among the most controversial applications of machine learning, not least because of its ethical implications for human subjects. In this article, we address the controversial conjecture that machines can read emotions from our facial expressions by asking whether this task can be performed reliably. That is, rather than considering the potential harms or scientific soundness of facial emotion recognition systems, we focus on the reliability of the ground truths used to develop them, assessing how well different human observers agree on the emotions they detect in subjects' faces. Additionally, we discuss the extent to which sharing context can help observers agree on the emotions they perceive on subjects' faces. Briefly, we demonstrate that when large and heterogeneous samples of observers are involved, the task of emotion detection from static images crumbles into inconsistency. We thus reveal that any endeavour to understand human behaviour from large sets of labelled patterns is over-ambitious, even if it were technically feasible. We conclude that we cannot speak of actual accuracy for facial emotion recognition systems for any practical purposes.
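
To make the idea of observer agreement concrete, here is a minimal sketch of one common chance-corrected agreement statistic, Fleiss' kappa. The abstract does not say which statistic the authors used, so the choice of kappa, the function name `fleiss_kappa`, and the emotion labels are all assumptions made for illustration only.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each labelled by the same number of raters.

    ratings: list of items; each item is the list of labels given by the raters.
    categories: all labels that raters could choose from.
    Assumes at least two raters per item and more than one category in use
    (otherwise the denominator below is zero).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]

    # Observed agreement: for each item, the share of rater pairs that agree.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    observed = sum(per_item) / n_items

    # Expected agreement if raters picked labels at the overall category rates.
    rates = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    expected = sum(r ** 2 for r in rates)

    return (observed - expected) / (1 - expected)


# Hypothetical labels from three observers on two face images.
unanimous = [["happy"] * 3, ["sad"] * 3]
split = [["happy", "happy", "sad"], ["sad", "sad", "happy"]]
kappa_unanimous = fleiss_kappa(unanimous, ["happy", "sad"])
kappa_split = fleiss_kappa(split, ["happy", "sad"])
```

Values near 1 indicate near-unanimous labelling, while values near or below 0 indicate agreement no better than chance, which is the kind of inconsistency the abstract describes.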

    The Impact of Gender and Personality in Human-AI Teaming: The Case of Collaborative Question Answering

    This paper discusses the results of an exploratory study aimed at investigating the impact of conversational agents (CAs), and specifically their agential characteristics, on collaborative decision-making processes. The study involved 29 participants divided into 8 small teams engaged in a question-and-answer trivia-style game with the support of a text-based CA, characterized by two independent binary variables: personality (gentle and cooperative vs blunt and uncooperative) and gender (female vs male). A semi-structured group interview was conducted at the end of the experimental sessions to investigate the perceived utility and level of satisfaction with the CAs. Our results show that when users interact with a gentle and cooperative CA, their satisfaction is higher. Furthermore, female CAs are perceived as more useful and satisfying to interact with than male CAs. We show that group performance improves through interaction with the CAs, confirming that a stereotype favoring the female with a gentle and cooperative personality combination exists in regard to perceived satisfaction, even though this does not lead to greater perceived utility. Our study extends the current debate about the possible correlation between CA characteristics and human acceptance and suggests future research to investigate the role of gender bias and related biases in human-AI teaming.

    Painting the black box white: experimental findings from applying XAI to an ECG reading setting

    The shift from symbolic AI systems to black-box, sub-symbolic, and statistical ones has motivated a rapid increase in the interest toward explainable AI (XAI), i.e. approaches to make black-box AI systems explainable to human decision makers with the aim of making these systems more acceptable and more usable tools and supports. However, we make the point that, rather than always making black boxes transparent, these approaches are at risk of "painting the black boxes white", thus failing to provide a level of transparency that would increase the system's usability and comprehensibility; or, even, at risk of generating new errors, in what we termed the "white-box paradox". To address these usability-related issues, in this work we focus on the cognitive dimension of users' perception of explanations and XAI systems. To this aim, we designed and conducted a questionnaire-based experiment by which we involved 44 cardiology residents and specialists in an AI-supported ECG reading task. In doing so, we investigated different research questions concerning the relationship between users' characteristics (e.g. expertise) and their perception of AI and XAI systems, including their trust, the perceived explanations' quality and their tendency to defer the decision process to automation (i.e. technology dominance), as well as the mutual relationships among these different dimensions. Our findings provide a contribution to the evaluation of AI-based support systems from a Human-AI interaction-oriented perspective and lay the ground for further investigation of XAI and its effects on decision making and user experience. Comment: 15 pages, 7 figures

    The multicenter European Biological Variation Study (EuBIVAS): a new glance provided by the Principal Component Analysis (PCA), a machine learning unsupervised algorithms, based on the basic metabolic panel linked measurands

    Objectives: The European Biological Variation Study (EuBIVAS), which includes 91 healthy volunteers from five European countries, estimated high-quality biological variation (BV) data for several measurands. Previous EuBIVAS papers reported no significant differences among laboratories/populations; however, they focused on specific sets of measurands, without a comprehensive general look. The aim of this paper is to evaluate the homogeneity of EuBIVAS data considering multivariate information by applying Principal Component Analysis (PCA), an unsupervised machine learning algorithm. Methods: The EuBIVAS data for 13 basic metabolic panel linked measurands (glucose, albumin, total protein, electrolytes, urea, total bilirubin, creatinine, alkaline phosphatase, aminotransferases), age, sex, menopause, body mass index (BMI), country, alcohol, smoking habits, and physical activity were used to generate three databases, developed using the traditional univariate and the multivariate Elliptic Envelope approaches to detect outliers, and different missing-value imputations. Two data matrices for each database, reporting both mean values and "within-person BV" (CVP) values for each measurand/subject, were analyzed using PCA. Results: A clear clustering between male and female mean values has been identified, where the menopausal females are closer to the males. Data interpretations for the three databases are similar. No significant differences in either mean or CVP values have been found for countries, alcohol, smoking habits, BMI and physical activity. Conclusions: The absence of meaningful differences among countries confirms the EuBIVAS sample homogeneity and that the obtained data are widely applicable to deliver APS. Our data suggest that PCA and the multivariate approach may be used to detect outliers, although further studies are required.

    Comparing Handcrafted Features and Deep Neural Representations for Domain Generalization in Human Activity Recognition

    Human Activity Recognition (HAR) has been studied extensively, yet current approaches are not capable of generalizing across different domains (i.e., subjects, devices, or datasets) with acceptable performance. This lack of generalization hinders the applicability of these models in real-world environments. As deep neural networks are becoming increasingly popular in recent work, there is a need for an explicit comparison between handcrafted and deep representations in Out-of-Distribution (OOD) settings. This paper compares both approaches in multiple domains using homogenized public datasets. First, we compare several metrics to validate three different OOD settings. In our main experiments, we then verify that even though deep learning initially outperforms models with handcrafted features, the situation is reversed as the distance from the training distribution increases. These findings support the hypothesis that handcrafted features may generalize better across specific domains.

    Prediction of Choice from Competing Mechanosensory and Choice-Memory Cues during Active Tactile Decision Making

    Perceptual decision making is an active process where animals move their sense organs to extract task-relevant information. To investigate how the brain translates sensory input into decisions during active sensation, we developed a mouse active touch task where the mechanosensory input can be precisely measured and that challenges animals to use multiple mechanosensory cues. Male mice were trained to localize a pole using a single whisker and to report their decision by selecting one of three choices. Using high-speed imaging and machine vision, we estimated whisker–object mechanical forces at millisecond resolution. Mice solved the task by a sensory-motor strategy where both the strength and direction of whisker bending were informative cues to pole location. We found competing influences of immediate sensory input and choice memory on mouse choice. On correct trials, choice could be predicted from the direction and strength of whisker bending, but not from previous choice. In contrast, on error trials, choice could be predicted from previous choice but not from whisker bending. This study shows that animal choices during active tactile decision making can be predicted from mechanosensory and choice-memory signals, and provides a new task well suited for the future study of the neural basis of active perceptual decisions.

    Toward a Perspectivist Turn in Ground Truthing for Predictive Computing

    Most current Artificial Intelligence applications are based on supervised Machine Learning (ML), which ultimately rests on data annotated by small teams of experts or large ensembles of volunteers. The annotation process is often performed in terms of a majority vote; however, recent evaluation studies have shown this to be often problematic. In this article, we describe and advocate for a different paradigm, which we call perspectivism: this counters the removal of disagreement and, consequently, the assumption of correctness of traditionally aggregated gold-standard datasets, and proposes the adoption of methods that preserve divergence of opinions and integrate multiple perspectives in the ground truthing process of ML development. Drawing on previous works which inspired it, mainly from the crowdsourcing and multi-rater labeling settings, we survey the state of the art and describe the potential of our proposal not only for the more subjective tasks (e.g. those related to human language) but also for those tasks commonly understood as objective (e.g. medical decision making). We present the main benefits of adopting a perspectivist stance in ML, as well as possible disadvantages, and various ways in which such a stance can be implemented in practice. Finally, we share a set of recommendations and outline a research agenda to advance the perspectivist stance in ML.
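
The contrast between majority-vote aggregation and a disagreement-preserving alternative can be sketched as follows. This is only one possible reading of "preserving divergence of opinions" (keeping the empirical label distribution per item); the function names and the example annotations are hypothetical and not taken from the article.

```python
from collections import Counter


def majority_vote(labels):
    """Collapse all annotations for one item into the single most frequent label.

    Ties are resolved in favour of the label encountered first.
    """
    return Counter(labels).most_common(1)[0][0]


def soft_label(labels):
    """Keep the disagreement: return the share of annotators behind each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


# Hypothetical annotations from three raters for a single item.
annotations = ["benign", "benign", "malignant"]
hard = majority_vote(annotations)   # the minority opinion is discarded
soft = soft_label(annotations)      # the minority opinion is retained
```

With the hard label, the dissenting annotation disappears from the training data; with the soft label it remains available to any downstream method that can learn from label distributions.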

    Belief Functions and Rough Sets: Survey and New Insights

    Rough set theory and belief function theory, two popular mathematical frameworks for uncertainty representation, have been widely applied in different settings and contexts. Despite different origins and mathematical foundations, the fundamental concepts of the two formalisms (i.e., approximations in rough set theory, belief and plausibility functions in belief function theory) are closely related. In this survey article, we review the most relevant contributions studying the links between these two uncertainty representation formalisms. In particular, we discuss the theoretical relationships connecting the two approaches, as well as their applications in knowledge representation and machine learning. Special attention is paid to the combined use of these formalisms as a way of dealing with imprecise and uncertain information. The aim of this work is, thus, to provide a focused picture of these two important fields, discuss some known results and point to relevant future research directions.
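
One commonly cited link between the two formalisms can be illustrated with a small sketch: on a finite universe partitioned into equivalence classes, if each class receives a mass proportional to its size, the relative size of the lower approximation of a set behaves as its belief and the relative size of the upper approximation as its plausibility. The abstract does not single out this construction, so treat it, the function names, and the toy partition as assumptions for illustration.

```python
from fractions import Fraction


def approximations(partition, target):
    """Rough-set lower and upper approximations of `target`.

    partition: list of disjoint sets (equivalence classes) covering the universe.
    """
    lower, upper = set(), set()
    for block in partition:
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper


def belief_plausibility(partition, target):
    """Belief and plausibility of `target`, assuming mass(block) = |block| / |U|."""
    universe_size = sum(len(block) for block in partition)
    lower, upper = approximations(partition, target)
    return (
        Fraction(len(lower), universe_size),
        Fraction(len(upper), universe_size),
    )


# Toy universe {1, ..., 6} with three equivalence classes.
partition = [{1, 2}, {3}, {4, 5, 6}]
target = {1, 2, 4}
lower, upper = approximations(partition, target)
bel, pl = belief_plausibility(partition, target)
```

Here the class {1, 2} is certainly inside the target, the class {4, 5, 6} only possibly so, and the gap between the two resulting values reflects that imprecision.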

    Prediction of ICU admission for COVID-19 patients: a Machine Learning approach based on Complete Blood Count data

    This is the dataset associated with the publication titled "Prediction of ICU admission for COVID-19 patients: a Machine Learning approach based on Complete Blood Count data", accepted for publication at Computer-Based Medical Systems (CBMS) 2021. The dataset encompasses 4995 unique observations and 22 features (20 features from the Complete Blood Count and 2 demographic features), along with two possible targets: ICU admission (column "Severity") and death (column "Dead"). All data is de-identified; an anonymous ID field is available to associate patients with observations. As regards the features: Sex is encoded as a binary variable where 1 represents "Male" and 0 represents "Female"; similarly, the two target variables are binary encoded, and they both refer to a 5-day horizon (that is, the value of the target variable is equal to 1 if the adverse event occurred within 5 days from the observation date). Full information about dataset features, processing methods, et cetera is available in the accompanying paper. For any question or comment please contact: [email protected]
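
The 5-day-horizon encoding described above can be sketched as a small helper. The function name, the inclusive treatment of day 5, and the example dates are assumptions for illustration; the accompanying paper is the authority on how the targets were actually derived.

```python
from datetime import date


def binary_target(observation_date, event_date, horizon_days=5):
    """Return 1 if the adverse event falls within the horizon after the observation.

    event_date is None when the event never occurred.
    Assumption: the horizon is inclusive (an event exactly 5 days later counts).
    """
    if event_date is None:
        return 0
    delta = (event_date - observation_date).days
    return 1 if 0 <= delta <= horizon_days else 0


# Hypothetical dates, not taken from the dataset.
within = binary_target(date(2020, 3, 1), date(2020, 3, 4))
outside = binary_target(date(2020, 3, 1), date(2020, 3, 10))
no_event = binary_target(date(2020, 3, 1), None)
```

The same helper would apply to either target column, since the description states that both refer to the same horizon.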