5 research outputs found

    A Debiasing Variational Autoencoder for Deforestation Mapping

    Deep Learning (DL) algorithms provide numerous benefits in different applications, and they usually yield successful results in scenarios with enough labeled training data and similar class proportions. However, the labeling procedure is a costly and time-consuming task. Furthermore, numerous real-world classification problems present a high level of class imbalance, as the number of samples from the classes of interest differs significantly. In various cases, such conditions tend to promote the creation of biased systems, which negatively impact their performance. Designing unbiased systems has been an active research topic, and recently some DL-based techniques have demonstrated encouraging results in that regard. In this work, we introduce an extension of the Debiasing Variational Autoencoder (DB-VAE) for semantic segmentation. The approach is based on an end-to-end DL scheme and employs the learned latent variables to adjust the individual sampling probabilities of data points during the training process. For that purpose, we adapted the original DB-VAE architecture for dense labeling in the context of deforestation mapping. Experiments were carried out on a region of the Brazilian Amazon, using Sentinel-2 data and the deforestation map from the PRODES project. The reported results show that the proposed DB-VAE approach is able to learn and identify under-represented samples and select them more frequently in the training batches, consequently delivering superior classification metrics.
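
    The resampling idea these results describe can be sketched roughly as follows. This is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the `encoder`, the `train_set`, and the histogram-based latent-density estimate are all assumptions, used only to illustrate how learned latent variables can adjust per-sample sampling probabilities.

        # Minimal sketch (assumed names, not the paper's code): estimate the density
        # of each sample's learned latent representation and sample inversely to it.
        import numpy as np
        import torch
        from torch.utils.data import DataLoader, WeightedRandomSampler

        def latent_sampling_weights(encoder, dataset, n_bins=10, alpha=0.01, device="cpu"):
            """Per-sample probabilities inversely proportional to latent density."""
            encoder.eval()
            loader = DataLoader(dataset, batch_size=64, shuffle=False)
            mus = []
            with torch.no_grad():
                for x, _ in loader:
                    mu, _logvar = encoder(x.to(device))       # latent mean per sample
                    mus.append(mu.cpu().numpy())
            mus = np.concatenate(mus)                          # shape (N, latent_dim)

            density = np.ones(len(mus))
            for d in range(mus.shape[1]):                      # histogram per latent dimension
                hist, edges = np.histogram(mus[:, d], bins=n_bins, density=True)
                idx = np.clip(np.digitize(mus[:, d], edges[:-1]) - 1, 0, n_bins - 1)
                density *= hist[idx] + alpha                   # alpha smooths empty bins

            weights = 1.0 / density                            # rare latents drawn more often
            return weights / weights.sum()

        # Training batches then over-sample under-represented regions of latent space:
        # weights = latent_sampling_weights(encoder, train_set)
        # sampler = WeightedRandomSampler(torch.as_tensor(weights), num_samples=len(train_set))
        # train_loader = DataLoader(train_set, batch_size=16, sampler=sampler)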

    Uncovering and Mitigating Algorithmic Bias through Learned Latent Structure

    Recent research has highlighted the vulnerabilities of modern machine-learning-based systems to bias, especially for segments of society that are under-represented in training data. In this work, we develop a novel, tunable algorithm for mitigating the hidden, and potentially unknown, biases within training data. Our algorithm fuses the original learning task with a variational autoencoder to learn the latent structure within the dataset and then adaptively uses the learned latent distributions to re-weight the importance of certain data points while training. While our method is generalizable across various data modalities and learning tasks, in this work we use our algorithm to address the issue of racial and gender bias in facial detection systems. We evaluate our algorithm on the Pilot Parliaments Benchmark (PPB), a dataset specifically designed to evaluate biases in computer vision systems, and demonstrate increased overall performance as well as decreased categorical bias with our debiasing approach.
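
    At a high level, the fused objective described above, a supervised task loss combined with a VAE loss on the learned latent structure, might look like the rough sketch below. The names (`logits`, `x_recon`, the `kl_weight` value) are placeholders and assumptions, not the paper's released code.

        # Rough sketch of a joint debiasing loss under assumed names and shapes.
        import torch
        import torch.nn.functional as F

        def db_vae_loss(logits, labels, x, x_recon, mu, logvar, kl_weight=0.005):
            """Supervised task loss plus a VAE term applied only to samples of the
            class whose latent structure is modeled (e.g. faces in face detection)."""
            # Supervised task: binary cross-entropy on the original labels.
            task_loss = F.binary_cross_entropy_with_logits(logits, labels.float(), reduction="none")

            # VAE term: per-sample reconstruction error plus KL to a standard normal prior.
            recon_loss = F.mse_loss(x_recon, x, reduction="none").flatten(1).mean(dim=1)
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1)
            vae_loss = recon_loss + kl_weight * kl

            # Apply the VAE term only where the label indicates the modeled class.
            total = task_loss + labels.float() * vae_loss
            return total.mean()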

    Fairness and Privacy in Machine Learning Algorithms

    Roughly 2.5 quintillion bytes of data are generated daily in this digital era. Manual processing of such huge amounts of data to extract useful information is nearly impossible, but machine learning algorithms, with their ability to process enormous amounts of data in a fast, cost-effective, and scalable way, have proven to be a preferred choice for gleaning useful insights and solving business problems in many domains. With this widespread use of machine learning algorithms, there have always been concerns about the ethical issues that may arise from the use of this modern technology. While achieving high accuracies, accomplishing trustworthy and fair machine learning has been challenging. Maintaining data fairness and privacy is one of the top challenges faced by industry as organizations employ various machine learning algorithms to automatically make decisions based on trends from previously collected data. A protected group or attribute refers to a group of individuals toward whom the system has some preconceived reservations and hence is discriminatory. Discrimination is the unjustified treatment of a particular category of people based on their race, age, gender, religion, sexual orientation, or disability. If we use data with preconceived reservations or inbuilt discrimination toward a certain group, then a model trained on such data will also be discriminatory toward those specific individuals.
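
    As one illustration of that last point (not taken from the work itself), the disparity a model inherits from biased data can be surfaced with a simple group-wise check such as the demographic parity difference; the attribute encoding and decision threshold below are assumptions.

        # Illustrative only: gap in positive-decision rates across a protected attribute.
        import numpy as np

        def demographic_parity_difference(y_pred, protected):
            """Absolute difference in positive-decision rates between two groups
            encoded as 0 and 1 in `protected`."""
            y_pred, protected = np.asarray(y_pred), np.asarray(protected)
            rate_a = y_pred[protected == 0].mean()
            rate_b = y_pred[protected == 1].mean()
            return abs(rate_a - rate_b)

        # e.g. demographic_parity_difference(model.predict(X) > 0.5, gender_labels)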

    A Study of Gender Bias in Face Presentation Attack and Its Mitigation

    In biometric systems, the process of identifying or verifying people using facial data must be highly accurate to ensure a high level of security and credibility. Many researchers have investigated the fairness of face recognition systems and reported demographic bias. However, there has not been much study of face presentation attack detection (PAD) technology in terms of bias. This research sheds light on bias in face spoofing detection by implementing two phases. First, two CNN (convolutional neural network)-based presentation attack detection models, ResNet50 and VGG16, were used to evaluate the fairness of detecting impostor attacks on the basis of gender. In addition, different sizes of Spoof in the Wild (SiW) testing and training data were used in the first phase to study the effect of gender distribution on the models’ performance. Second, the debiasing variational autoencoder (DB-VAE) (Amini, A., et al., Uncovering and Mitigating Algorithmic Bias through Learned Latent Structure) was applied in combination with VGG16 to assess its ability to mitigate bias in presentation attack detection. Our experiments exposed minor gender bias in CNN-based presentation attack detection methods. In addition, the results showed that imbalance in training and testing data does not necessarily lead to gender bias in the model’s performance. The results also showed that the DB-VAE approach (Amini, A., et al., Uncovering and Mitigating Algorithmic Bias through Learned Latent Structure) succeeded in mitigating bias in detecting spoofed faces.
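
    A minimal sketch of the kind of first-phase setup described above, a VGG16-based spoof/bona-fide classifier whose error is evaluated separately per gender group, is given below. The dataset fields and helper names are assumptions, not the study's released code.

        # Assumed setup: the data loader yields (image, spoof_label, gender) tuples.
        import torch
        import torch.nn as nn
        from torchvision import models

        # Binary presentation-attack detector on a VGG16 backbone.
        pad_model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
        pad_model.classifier[6] = nn.Linear(4096, 1)   # spoof (1) vs. bona fide (0)

        @torch.no_grad()
        def per_group_error_rate(model, loader, device="cpu"):
            """Classification error computed separately for each gender label,
            exposing any performance gap between groups."""
            model.eval().to(device)
            errors, counts = {}, {}
            for x, y, gender in loader:
                pred = (torch.sigmoid(model(x.to(device))).squeeze(1) > 0.5).cpu()
                for g in torch.unique(gender):
                    mask = gender == g
                    errors[int(g)] = errors.get(int(g), 0) + (pred[mask] != y[mask]).sum().item()
                    counts[int(g)] = counts.get(int(g), 0) + mask.sum().item()
            return {g: errors[g] / counts[g] for g in errors}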