71 research outputs found

    Fairea: A Model Behaviour Mutation Approach to Benchmarking Bias Mitigation Methods

    Get PDF
    The increasingly wide uptake of Machine Learning (ML) has raised the significance of the problem of tackling bias (i.e., unfairness), making it a primary software engineering concern. In this paper, we introduce Fairea, a model behaviour mutation approach to benchmarking ML bias mitigation methods. We also report on a large-scale empirical study of the effectiveness of 12 widely-studied bias mitigation methods. Our results reveal that, surprisingly, bias mitigation methods are poorly effective in 49% of the cases. In particular, 15% of the mitigation cases have worse fairness-accuracy trade-offs than the baseline established by Fairea, and 34% of the cases decrease accuracy while increasing bias. Fairea has been made publicly available for software engineers and researchers to evaluate their bias mitigation methods.
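    As a rough illustration of the mutation idea, the sketch below builds a Fairea-style baseline by replacing a growing fraction of a model's predictions with the majority class and recording the resulting accuracy and bias; the statistical-parity metric, the function names, and the simple dominance check are illustrative assumptions, not Fairea's exact implementation.

    ```python
    import numpy as np

    def statistical_parity_difference(y_pred, protected):
        """Absolute difference in positive-prediction rates between the two protected groups."""
        g0, g1 = y_pred[protected == 0], y_pred[protected == 1]
        return abs(g0.mean() - g1.mean())

    def mutation_baseline(y_pred, y_true, protected, steps=11, seed=0):
        """Trace a fairness-accuracy baseline by mutating a growing fraction of the
        original predictions to the majority class (a Fairea-style behaviour mutation)."""
        rng = np.random.default_rng(seed)
        majority = np.bincount(y_pred).argmax()
        baseline = []
        for frac in np.linspace(0.0, 1.0, steps):
            mutated = y_pred.copy()
            idx = rng.choice(len(y_pred), size=int(frac * len(y_pred)), replace=False)
            mutated[idx] = majority
            acc = (mutated == y_true).mean()
            bias = statistical_parity_difference(mutated, protected)
            baseline.append((frac, acc, bias))
        return baseline

    def beats_baseline(acc, bias, baseline):
        """Crude check: the mitigation method should dominate at least one baseline point,
        i.e. match its accuracy without exceeding its bias (a simplification of Fairea's
        full trade-off classification)."""
        return any(acc >= b_acc and bias <= b_bias for _, b_acc, b_bias in baseline)
    ```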

    Fairness Testing: A Comprehensive Survey and Analysis of Trends

    Full text link
    Unfair behaviors of Machine Learning (ML) software have garnered increasing attention and concern among software engineers. To tackle this issue, extensive research has been dedicated to fairness testing of ML software, and this paper offers a comprehensive survey of existing studies in this field. We collect 100 papers and organize them based on the testing workflow (i.e., how to test) and testing components (i.e., what to test). Furthermore, we analyze the research focus, trends, and promising directions in the realm of fairness testing. We also identify widely-adopted datasets and open-source tools for fairness testing.
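    To make the notion of a fairness test concrete, here is a minimal sketch of one widely used style of test in this line of work: an individual-fairness (metamorphic) check that flips a binary protected attribute and measures how often the prediction changes. The model interface and column encoding are assumptions for illustration, not taken from the survey.

    ```python
    import numpy as np

    def flip_protected(X, protected_col):
        """Return a copy of the feature matrix with a binary protected attribute flipped."""
        X_flipped = X.copy()
        X_flipped[:, protected_col] = 1 - X_flipped[:, protected_col]
        return X_flipped

    def individual_fairness_violations(model, X, protected_col):
        """Fraction of inputs whose prediction changes when only the protected attribute changes."""
        original = model.predict(X)
        counterfactual = model.predict(flip_protected(X, protected_col))
        return float(np.mean(original != counterfactual))
    ```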

    In Crowd Veritas: Leveraging Human Intelligence To Fight Misinformation

    Get PDF
    The spread of online misinformation has important effects on the stability of democracy. The sheer size of digital content on the web and social media, and the ability to immediately access and share it, has made it difficult to perform timely fact-checking at scale. Truthfulness judgments are usually made by experts, such as journalists for political statements. A different approach is to rely on a (non-expert) crowd of human judges to perform fact-checking. This leads to the following research question: can such human judges detect and objectively categorize online (mis)information? Several extensive crowdsourcing studies are performed to answer it. Thousands of truthfulness judgments over two datasets are collected by recruiting a crowd of workers from crowdsourcing platforms, and the expert judgments are compared with the crowd ones. The results allow for concluding that the workers are indeed able to do so. There is a limited understanding of the factors that influence worker participation in longitudinal studies across different crowdsourcing marketplaces. A large-scale survey aimed at understanding how these studies are performed using crowdsourcing is run across multiple platforms. The answers collected are analyzed from both a quantitative and a qualitative point of view. A list of recommendations for task requesters to conduct these studies effectively is provided, together with a list of best practices for crowdsourcing platforms. Truthfulness is a subtle matter: statements can be merely biased, imprecise, wrong, and so on, and a unidimensional truth scale cannot account for such differences. The crowd workers are therefore asked to judge seven different dimensions of truthfulness selected from the existing literature. The newly collected crowdsourced judgments show that the workers are indeed reliable when compared to an expert-provided gold standard. Cognitive biases are human processes that often help minimize the cost of making mistakes but keep assessors away from an objective judgment of information. A review of the cognitive biases that might manifest during the fact-checking process is presented, together with a list of countermeasures that can be adopted. An exploratory study on the previously collected dataset is then performed. The findings are used to formulate hypotheses concerning which individual characteristics of statements or judges, and which cognitive biases, may affect crowd workers' truthfulness judgments. The findings suggest that crowd workers' degree of belief in science has an impact, that they generally overestimate truthfulness, and that their judgments are indeed affected by various cognitive biases. Automated fact-checking systems to combat the spread of misinformation exist; however, their complexity usually makes them opaque to the end user, making it difficult to foster trust in the system. The E-BART model is introduced with the hope of making progress on this front. E-BART can provide a truthfulness prediction for a statement and jointly generate a human-readable explanation. An extensive human evaluation of the impact of the explanations generated by the model is conducted, showing that the explanations increase the human ability to spot misinformation. The whole set of data collected and analyzed in this thesis is publicly released to the research community at: https://doi.org/10.17605/OSF.IO/JR6VC.

    The spread of online misinformation has important effects on the stability of democracy. The information that is consumed every day influences human decision-making processes. The sheer size of digital content on the web and social media and the ability to immediately access and share it has made it difficult to perform timely fact-checking at scale. Indeed, fact-checking is a complex process that involves several activities. A long-term goal can be to build a so-called human-in-the-loop system to cope with (mis)information by measuring truthfulness in real time (e.g., as items appear on social media, news outlets, and so on) using a combination of crowd-powered data, human intelligence, and machine learning techniques. In recent years, crowdsourcing has become a popular method to collect reliable truthfulness judgments in order to scale up and help study the manual fact-checking effort. This thesis first investigates whether human judges can detect and objectively categorize online (mis)information, and which environment allows obtaining the best results. Then, the impact of cognitive biases on human assessors while judging information truthfulness is addressed: a categorization of cognitive biases is proposed, together with countermeasures to combat their effects and a bias-aware judgment pipeline for fact-checking. Lastly, an approach that can predict information truthfulness and, at the same time, generate a natural language explanation supporting the prediction itself is proposed. The machine-generated explanations are evaluated to understand whether they help human assessors better judge the truthfulness of information items. A collaborative process between systems, crowd workers, and expert fact-checkers would provide a scalable and decentralized hybrid mechanism to cope with the increasing volume of online misinformation.
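    As a small, hedged illustration of the crowd-versus-expert comparison described above, the sketch below aggregates per-statement worker judgments and correlates the aggregate with expert labels. The data layout, the mean aggregation, and the choice of Spearman correlation are assumptions for illustration, not the thesis's exact analysis.

    ```python
    from collections import defaultdict
    from scipy.stats import spearmanr

    def aggregate_crowd(judgments):
        """judgments: iterable of (statement_id, worker_score) pairs on some truthfulness scale.
        Returns the mean crowd score per statement."""
        scores = defaultdict(list)
        for statement_id, score in judgments:
            scores[statement_id].append(score)
        return {sid: sum(vals) / len(vals) for sid, vals in scores.items()}

    def crowd_expert_agreement(judgments, expert_labels):
        """Spearman correlation between aggregated crowd scores and expert truthfulness labels."""
        crowd = aggregate_crowd(judgments)
        common = sorted(set(crowd) & set(expert_labels))
        rho, _ = spearmanr([crowd[s] for s in common], [expert_labels[s] for s in common])
        return rho
    ```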

    Characterization and Detection of Malicious Behavior on the Web

    Get PDF
    Web platforms enable unprecedented speed and ease in the transmission of knowledge, and allow users to communicate and shape opinions. However, the safety, usability, and reliability of these platforms are compromised by the prevalence of online malicious behavior -- for example, 40% of users have experienced online harassment. This behavior is present in the form of malicious users, such as trolls, sockpuppets, and vandals, and of misinformation, such as hoaxes and fraudulent reviews. This thesis presents research spanning two aspects of malicious behavior: characterization of its behavioral properties, and development of algorithms and models for detecting it. We characterize the behavior of malicious users and misinformation in terms of their activity, the temporal frequency of their actions, their network connections to other entities, the linguistic properties of how they write, and the community feedback they receive from others. We find several striking characteristics of malicious behavior that are very distinct from those of benign behavior. For instance, we find that vandals and fraudulent reviewers are faster in their actions than benign editors and reviewers, respectively. Hoax articles are long pieces of plain text that are less coherent and created by more recent editors, compared to non-hoax articles. We find that sockpuppets vary in their deceptiveness (i.e., whether they pretend to be different users) and their supportiveness (i.e., whether they support arguments of other sockpuppets controlled by the same user). We create a suite of feature-based and graph-based algorithms to efficiently distinguish malicious from benign behavior. First, we create a vandal early-warning system -- the first of its kind -- that accurately predicts vandals using very few edits. Next, based on the properties of Wikipedia articles, we develop a supervised machine learning classifier to predict whether an article is a hoax, and another that predicts whether a pair of accounts belongs to the same user, both with very high accuracy. We develop a graph-based decluttering algorithm that iteratively removes the suspicious edges that malicious users use to masquerade as benign users, and that outperforms existing graph algorithms at detecting trolls. Finally, we develop an efficient graph-based algorithm to assess the fairness of all reviewers, the reliability of all ratings, and the goodness of all products simultaneously in a rating network, incorporating penalties for suspicious behavior. Overall, in this thesis, we develop a suite of five models and algorithms to accurately identify and predict several distinct types of malicious behavior -- namely, vandals, hoaxes, sockpuppets, trolls, and fraudulent reviewers -- on multiple web platforms. The analysis leading to these algorithms develops an interpretable understanding of malicious behavior on the web.
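    The last algorithm mentioned above jointly scores reviewers, ratings, and products in a rating network. The sketch below shows the kind of mutually recursive fixed-point iteration such a formulation suggests: product goodness is a reliability-weighted average of ratings, rating reliability grows with reviewer fairness and agreement with goodness, and reviewer fairness is the average reliability of that reviewer's ratings. The specific update rules, the [-1, 1] rating scale, and the fixed iteration count are illustrative assumptions, not the thesis's published formulation.

    ```python
    def score_rating_network(ratings, iterations=50):
        """ratings: list of (reviewer, product, rating) triples with ratings scaled to [-1, 1].
        Iteratively estimates reviewer fairness, rating reliability, and product goodness."""
        reviewers = {rev for rev, _, _ in ratings}
        products = {prod for _, prod, _ in ratings}
        fairness = {rev: 1.0 for rev in reviewers}
        goodness = {prod: 0.0 for prod in products}
        reliability = {(rev, prod): 1.0 for rev, prod, _ in ratings}

        for _ in range(iterations):
            # Goodness: reliability-weighted average of the ratings each product received.
            for p in products:
                received = [(rev, rat) for rev, prod, rat in ratings if prod == p]
                weight = sum(reliability[(rev, p)] for rev, _ in received) or 1.0
                goodness[p] = sum(reliability[(rev, p)] * rat for rev, rat in received) / weight
            # Reliability: high when a fair reviewer's rating agrees with the product's goodness.
            for rev, prod, rat in ratings:
                reliability[(rev, prod)] = 0.5 * (fairness[rev] + 1.0 - abs(rat - goodness[prod]) / 2.0)
            # Fairness: average reliability of the ratings each reviewer gave.
            for rev in reviewers:
                given = [reliability[(r, p)] for r, p, _ in ratings if r == rev]
                fairness[rev] = sum(given) / len(given)
        return fairness, reliability, goodness
    ```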

    Applications

    Get PDF
    Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production, and milling for quality control during manufacturing processes; and in traffic and logistics for smart cities and for mobile communications.

    WiFi-Based Human Activity Recognition Using Attention-Based BiLSTM

    Get PDF
    Recently, significant efforts have been made to explore human activity recognition (HAR) techniques that use information gathered by existing indoor wireless infrastructures through WiFi signals, without requiring the monitored subject to carry a dedicated device. The key intuition is that different activities introduce different multi-paths in WiFi signals and generate different patterns in the time series of channel state information (CSI). In this paper, we propose and evaluate a full pipeline for a CSI-based human activity recognition framework covering 12 activities in three different spatial environments, using two deep learning models: ABiLSTM and CNN-ABiLSTM. Evaluation experiments demonstrate that the proposed models outperform state-of-the-art models. The experiments also show that the proposed models can be applied to other environments with different configurations, albeit with some caveats. The proposed ABiLSTM model achieves an overall accuracy of 94.03%, 91.96%, and 92.59% across the three target environments, while the proposed CNN-ABiLSTM model reaches accuracies of 98.54%, 94.25%, and 95.09% across those same environments.
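    Below is a minimal sketch of an attention-based BiLSTM classifier over CSI time series, in the spirit of the ABiLSTM model described above. The layer sizes, the simple attention pooling, and the input shape (batch, time, subcarriers) are assumptions for illustration rather than the paper's exact architecture.

    ```python
    import torch
    import torch.nn as nn

    class ABiLSTM(nn.Module):
        """BiLSTM over CSI time series with learned attention pooling, followed by a classifier."""
        def __init__(self, n_subcarriers=90, hidden=128, n_classes=12):
            super().__init__()
            self.lstm = nn.LSTM(n_subcarriers, hidden, batch_first=True, bidirectional=True)
            self.attn = nn.Linear(2 * hidden, 1)           # score each time step
            self.classifier = nn.Linear(2 * hidden, n_classes)

        def forward(self, x):                              # x: (batch, time, subcarriers)
            h, _ = self.lstm(x)                            # (batch, time, 2 * hidden)
            weights = torch.softmax(self.attn(h), dim=1)   # attention over time steps
            context = (weights * h).sum(dim=1)             # attention-weighted summary
            return self.classifier(context)                # (batch, n_classes) logits

    # Usage sketch: classify a batch of 8 CSI windows of 200 frames and 90 subcarriers each.
    model = ABiLSTM()
    logits = model(torch.randn(8, 200, 90))
    ```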

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to provide better discernment of the domain at hand, their representations become increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication, and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.

    Quantifying & characterizing information diets of social media users

    Get PDF
    An increasing number of people are relying on online social media platforms like Twitter and Facebook to consume news and information about the world around them. This change has led to a paradigm shift in the way news and information is exchanged in our society – from traditional mass media to online social media. With the changing environment, it’s essential to study the information consumption of social media users and to audit how automated algorithms (like search and recommendation systems) are modifying the information that social media users consume. In this thesis, we fulfill this high-level goal with a two-fold approach. First, we propose the concept of information diets as the composition of information produced or consumed. Next, we quantify the diversity and bias in the information diets that social media users consume via the three main consumption channels on social media platforms: (a) word of mouth channels that users curate for themselves by creating social links, (b) recommendations that platform providers give to the users, and (c) search systems that users use to find interesting information on these platforms. We measure the information diets of social media users along three different dimensions of topics, geographic sources, and political perspectives. Our work is aimed at making social media users aware of the potential biases in their consumed diets, and at encouraging the development of novel mechanisms for mitigating the effects of these biases.
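    As a rough illustration of the diet idea above, the sketch below represents a user's consumed diet as a normalized distribution over topics and measures its bias as the divergence from a reference distribution (e.g., the overall produced content). The topic labels and the use of Jensen-Shannon divergence are assumptions for illustration, not the thesis's exact methodology.

    ```python
    import math
    from collections import Counter

    def information_diet(consumed_topics):
        """Normalized topic composition of the items a user consumed."""
        counts = Counter(consumed_topics)
        total = sum(counts.values())
        return {topic: n / total for topic, n in counts.items()}

    def diet_bias(diet, reference):
        """Jensen-Shannon divergence (base 2) between a user's diet and a reference distribution."""
        topics = set(diet) | set(reference)
        p = [diet.get(t, 0.0) for t in topics]
        q = [reference.get(t, 0.0) for t in topics]
        m = [(a + b) / 2 for a, b in zip(p, q)]

        def kl(x, y):
            return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

        return 0.5 * kl(p, m) + 0.5 * kl(q, m)

    # Usage sketch with hypothetical topic labels.
    diet = information_diet(["politics", "politics", "sports", "science"])
    bias = diet_bias(diet, {"politics": 0.3, "sports": 0.3, "science": 0.2, "entertainment": 0.2})
    ```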