
    Classifying humans: the indirect reverse operativity of machine vision

    Classifying is human. Classifying is also what machine vision technologies do. This article analyses the cybernetic loop between human and machine classification by examining artworks that depict instances of bias when machine vision is classifying humans and when humans classify visual datasets for machines. I propose the term ‘indirect reverse operativity’ – a concept built upon Ingrid Hoelzl’s and Remi Marie’s notion of ‘reverse operativity’ – to describe how classifying humans and machine classifiers operate in cybernetic information loops. Indirect reverse operativity is illustrated through two projects I have co-created: the Database of Machine Vision in Art, Games and Narrative and the artwork Suspicious Behavior. Through ‘artistic audits’ of selected artworks, a data analysis of how classification is represented in 500 creative works, and a reflection on my own artistic research in the Suspicious Behavior project, this article confronts and complicates assumptions of when and how bias is introduced into and propagates through machine vision classifiers. By examining cultural conceptions of machine vision bias which exemplify how humans operate machines and how machines operate humans through images, this article contributes fresh perspectives to the emerging field of critical dataset studies.

    Intersectional Identities and Machine Learning: Illuminating Language Biases in Twitter Algorithms

    Intersectional analysis of social media data is rare. Social media data is ripe for identity and intersectionality analysis, given its wide accessibility and easily parsed text, yet it poses its own methodological challenges regarding the identification of identities. We aggregate Twitter data that was annotated by crowdsourcing with tags for “abusive,” “hateful,” or “spam” language. Using natural language prediction models, we predict the tweeter’s race and gender and investigate whether these tags for abuse, hate, and spam have a meaningful relationship with the gendered and racialized language predictions. Are certain gender and race groups more likely to be predicted if a tweet is labeled as abusive, hateful, or spam? The findings suggest that certain racial and intersectional groups are more likely to be associated with non-normal language identification. Language consistent with white identity is most likely to be considered within the norm, while non-white racial groups are more often linked to hateful, abusive, or spam language. An illustrative sketch of this kind of association analysis follows below.
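    The snippet below is a minimal sketch (not the paper's code) of how one might test whether moderation labels are associated with predicted demographic groups. The file name, column names, and label values are assumptions; the statistical test is a standard chi-square test of independence on a label-by-group contingency table.

    ```python
    # Hedged sketch: association between crowdsourced moderation labels and
    # model-predicted demographic groups. Input format is hypothetical.
    import pandas as pd
    from scipy.stats import chi2_contingency

    # Assumed input: one row per tweet with columns label, pred_race, pred_gender.
    tweets = pd.read_csv("annotated_tweets.csv")

    # Intersectional group = combination of predicted race and gender.
    tweets["group"] = tweets["pred_race"] + "_" + tweets["pred_gender"]

    # Contingency table: how often each group receives each label.
    table = pd.crosstab(tweets["group"], tweets["label"])
    chi2, p_value, dof, expected = chi2_contingency(table)
    print(f"chi2={chi2:.1f}, dof={dof}, p={p_value:.4g}")

    # Share of non-normal labels (abusive/hateful/spam) per group, to see which
    # groups are disproportionately flagged.
    non_normal = tweets["label"] != "normal"
    print(non_normal.groupby(tweets["group"]).mean().sort_values(ascending=False))
    ```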

    IMPACT OF DATA COLLECTION ON ML MODELS: ANALYZING DIFFERENCES OF BIASES BETWEEN LOW- VS. HIGH-SKILLED ANNOTATORS

    Labeled data is crucial for the success of machine learning-based artificial intelligence. However, companies often face a choice between collecting annotations from high- or low-skilled annotators, who may exhibit different biases. This study investigates differences in biases between datasets labeled by these annotator groups and their impact on machine learning models. To this end, we created high- and low-skilled annotated datasets, measured the contained biases through entropy, and trained different machine learning models to examine bias inheritance effects. Our findings on text sentiment annotations show that both groups exhibit a considerable amount of bias in their annotations, although there is a significant difference in the error types commonly encountered. Models trained on biased annotations produce significantly different predictions, indicating bias propagation, and tend to make more extreme errors than humans. As partial mitigation, we propose and show the efficiency of a hybrid approach in which data is labeled by both low-skilled and high-skilled workers. A sketch of the entropy comparison follows below.
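    The sketch below illustrates one plausible way to compare the two annotator groups via label entropy, as described in the abstract. It is not the study's pipeline: file names, column names, and the per-item entropy measure are assumptions.

    ```python
    # Hedged sketch: compare annotation disagreement between low- and
    # high-skilled annotator groups using per-item label entropy.
    import pandas as pd
    from scipy.stats import entropy

    def mean_label_entropy(df: pd.DataFrame) -> float:
        """Average entropy (in bits) of the sentiment-label distribution per item.
        Lower values mean annotators within the group agree more often."""
        per_item = (
            df.groupby("item_id")["label"]
              .value_counts(normalize=True)          # label proportions per item
              .groupby("item_id")
              .apply(lambda p: entropy(p, base=2))   # entropy of that distribution
        )
        return per_item.mean()

    # Assumed inputs: one row per (item, annotator) with columns item_id, label.
    low = pd.read_csv("low_skilled_annotations.csv")
    high = pd.read_csv("high_skilled_annotations.csv")

    print("low-skilled mean entropy :", mean_label_entropy(low))
    print("high-skilled mean entropy:", mean_label_entropy(high))
    ```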