
    She? The Role of Perceived Agent Gender in Social Media Customer Service

    This paper investigated the role of perceived agent gender in customer behavior using a unique dataset from Southwest Airlines’ Twitter account. We inferred agent gender from the first names agents provided when responding to customers. We measured customer behavior using three outcomes: whether a customer decided to continue the service conversation after receiving an agent’s initial response, and the valence and arousal levels of the customer’s second tweet when they chose to continue the interaction. Our identification strategy relied on the Backdoor Criterion and hinged on the assumption that customer service requests are assigned to the next available agent, independent of agent gender. The findings revealed that customers were more likely to continue interactions with female agents than with male agents, and that their responses to female agents were more negative in valence but less intense in arousal than their responses to male agents.
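    A minimal sketch of the comparison described above, assuming a tidy table of service conversations. The CSV file, column names, and the name-to-gender map are illustrative assumptions, not the authors' data or code; under the next-available-agent assignment assumption, a simple difference in group means identifies the effect of interest.

    ```python
    import pandas as pd

    # Hypothetical map from agent first names to inferred gender (illustrative only).
    NAME_TO_GENDER = {"emily": "female", "sarah": "female", "mike": "male", "john": "male"}

    # Assumed columns: agent_name, continued (0/1), valence2, arousal2 (second tweet).
    df = pd.read_csv("service_tweets.csv")
    df["agent_gender"] = df["agent_name"].str.lower().map(NAME_TO_GENDER)
    df = df.dropna(subset=["agent_gender"])

    # If requests go to the next available agent independent of gender, comparing
    # group means gives an unconfounded estimate of the gender effect on each outcome.
    summary = df.groupby("agent_gender").agg(
        continue_rate=("continued", "mean"),
        valence_second_tweet=("valence2", "mean"),
        arousal_second_tweet=("arousal2", "mean"),
    )
    print(summary)
    ```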

    Assessing Bias Removal from Word Embeddings

    As machine learning becomes more influential in everyday life, we must begin addressing its potential shortcomings. A current problem area is word embeddings, frameworks that transform words into numbers and thereby allow the algorithmic analysis of language. Without a method for filtering implicit human bias from the documents used to create these embeddings, they contain and propagate stereotypes. Previous work has shown that one commonly used and distributed word embedding model, trained on articles from Google News, encoded gender bias with respect to occupations (Bolukbasi 2016). While unsurprising, the use of biased data in machine learning models only serves to amplify the problem. Although attempts have been made to remove or reduce these biases, a true solution has yet to be found. Hiring models, tools trained to identify well-fitting job candidates, show the impact of gender stereotypes on occupations. Companies like Amazon have abandoned such systems due to flawed decision-making, even after years of development. I investigated whether the word embedding adjustments from Bolukbasi 2016 made a difference in the results of an emulated hiring model. After collecting and cleaning resumes and job postings, I created a model that predicted whether candidates were a good fit for a job, trained on resumes of those already hired. To assess differences, I built the same model with different word vectors, including the original and the adjusted word2vec embeddings. Results were expected to show some form of bias in classification. I conclude with potential improvements and additional work being done.
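    The occupation-bias probe referenced above (Bolukbasi 2016) can be illustrated with a short sketch: project occupation word vectors onto the he-she direction of a word2vec embedding. The embedding file path and word list below are assumptions for illustration, not the study's actual resume and job-posting pipeline.

    ```python
    import numpy as np
    from gensim.models import KeyedVectors

    # Assumed path to the Google News word2vec file; any word2vec-format file works.
    vectors = KeyedVectors.load_word2vec_format("GoogleNews-vectors-negative300.bin", binary=True)

    def unit(v):
        return v / np.linalg.norm(v)

    # Direction from "she" to "he" in embedding space.
    gender_axis = unit(vectors["he"] - vectors["she"])

    occupations = ["nurse", "engineer", "homemaker", "programmer", "receptionist", "architect"]
    for word in occupations:
        score = float(np.dot(unit(vectors[word]), gender_axis))
        # Positive scores lean toward "he", negative toward "she"; debiasing aims for ~0.
        print(f"{word:15s} {score:+.3f}")
    ```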

    The Mediation Effect of Trusting Beliefs on the Relationship Between Expectation-Confirmation and Satisfaction with the Usage of Online Product Recommendation

    Online Product Recommendations (OPRs) are increasingly available to online customers as a value-added self-service in evaluating and choosing a product. Research has highlighted several advantages that customers can gain from using OPRs. However, the realization of these advantages depends on whether and to what extent customers embrace and fully utilise them. The relatively low OPR usage rate indicates that customers have not yet developed trust in OPRs’ performance. Past studies also have established that satisfaction is a valid measure of system performance and a consistent, significant determinant of users’ continuous system usage. Therefore, this study aimed to examine the mediation effect of trusting beliefs on the relationship between expectation-confirmation and satisfaction. The proposed research model is tested using data collected via an online survey from 626 existing users of OPRs. The empirical results revealed that social-psychological beliefs (perceived confirmation and trust) are significant contributors to customer satisfaction with OPRs. Additionally, trusting beliefs partially mediate the impact of perceived confirmation on customer satisfaction. Moreover, this study validates the extensions of the interpersonal trust construct to trust in OPRs and examines the nomological validity of trust in terms of competence, benevolence, and integrity. The findings provide a number of theoretical and practical implications.
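    As a rough illustration of the partial-mediation claim (confirmation → trust → satisfaction), the sketch below runs the classic three-regression mediation check. The study itself presumably uses a full structural model; the CSV file and column names here are assumptions.

    ```python
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("opr_survey.csv")  # assumed columns: confirmation, trust, satisfaction

    total  = smf.ols("satisfaction ~ confirmation", data=df).fit()           # total effect c
    a_path = smf.ols("trust ~ confirmation", data=df).fit()                  # path a
    direct = smf.ols("satisfaction ~ confirmation + trust", data=df).fit()   # paths c' and b

    indirect = a_path.params["confirmation"] * direct.params["trust"]
    print("total effect  c :", round(total.params["confirmation"], 3))
    print("direct effect c':", round(direct.params["confirmation"], 3))
    print("indirect  a * b :", round(indirect, 3))
    # Partial mediation: c' stays significant but shrinks relative to c, with a*b nonzero.
    ```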

    CausaLM: Causal Model Explanation Through Counterfactual Language Models

    Understanding predictions made by deep neural networks is notoriously difficult, but also crucial to their dissemination. Like all ML-based methods, they are only as good as their training data, and can also capture unwanted biases. While there are tools that can help understand whether such biases exist, they do not distinguish between correlation and causation, and might be ill-suited for text-based models and for reasoning about high-level language concepts. A key problem in estimating the causal effect of a concept of interest on a given model is that this estimation requires the generation of counterfactual examples, which is challenging with existing generation technology. To bridge that gap, we propose CausaLM, a framework for producing causal model explanations using counterfactual language representation models. Our approach is based on fine-tuning deep contextualized embedding models with auxiliary adversarial tasks derived from the causal graph of the problem. Concretely, we show that by carefully choosing auxiliary adversarial pre-training tasks, language representation models such as BERT can effectively learn a counterfactual representation for a given concept of interest, and be used to estimate its true causal effect on model performance. A byproduct of our method is a language representation model that is unaffected by the tested concept, which can be useful in mitigating unwanted bias ingrained in the data. Comment: Our code and data are available at https://amirfeder.github.io/CausaLM/. Under review for the Computational Linguistics journal.
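    The core mechanism, adversarial auxiliary heads over a contextual encoder, can be sketched generically as below. This is not the authors' released code (see the project URL above); the model name, head sizes, and gradient-reversal weight are assumptions.

    ```python
    import torch
    from torch import nn
    from transformers import AutoModel

    class GradReverse(torch.autograd.Function):
        """Identity on the forward pass; negates the gradient on the backward pass."""
        @staticmethod
        def forward(ctx, x, lamb):
            ctx.lamb = lamb
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            return -ctx.lamb * grad_output, None

    class CounterfactualEncoder(nn.Module):
        def __init__(self, model_name="bert-base-uncased", num_labels=2, num_concepts=2, lamb=1.0):
            super().__init__()
            self.encoder = AutoModel.from_pretrained(model_name)
            hidden = self.encoder.config.hidden_size
            self.task_head = nn.Linear(hidden, num_labels)       # main prediction task
            self.concept_head = nn.Linear(hidden, num_concepts)  # adversarial concept task
            self.lamb = lamb

        def forward(self, input_ids, attention_mask):
            pooled = self.encoder(input_ids=input_ids,
                                  attention_mask=attention_mask).last_hidden_state[:, 0]
            task_logits = self.task_head(pooled)
            # Gradient reversal pushes the encoder to discard the treated concept,
            # yielding a representation approximately unaffected by it.
            concept_logits = self.concept_head(GradReverse.apply(pooled, self.lamb))
            return task_logits, concept_logits
    ```

    Training would minimize cross-entropy on both heads; because the concept head's gradient is reversed before reaching the encoder, the shared representation is driven toward concept invariance, which is the property a counterfactual comparison relies on.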

    Automatic Detection of Online Jihadist Hate Speech

    We have developed a system that automatically detects online jihadist hate speech with over 80% accuracy, using techniques from Natural Language Processing and Machine Learning. The system is trained on a corpus of 45,000 subversive Twitter messages collected from October 2014 to December 2016. We present a qualitative and quantitative analysis of the jihadist rhetoric in the corpus, examine the network of Twitter users, outline the technical procedure used to train the system, and discuss examples of use. Comment: 31 pages.
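    A compact sketch of the kind of supervised pipeline the abstract describes: TF-IDF features plus a linear classifier over labeled tweets. The file name, label values, and choice of classifier are assumptions; the paper's actual features and model are not specified here.

    ```python
    import pandas as pd
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline

    df = pd.read_csv("tweets.csv")  # assumed columns: text, label ("jihadist" / "benign")
    X_train, X_test, y_train, y_test = train_test_split(
        df["text"], df["label"], test_size=0.2, random_state=42, stratify=df["label"]
    )

    clf = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # word unigrams and bigrams
        LogisticRegression(max_iter=1000),
    )
    clf.fit(X_train, y_train)
    print("held-out accuracy:", round(accuracy_score(y_test, clf.predict(X_test)), 3))
    ```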

    Bias in Language Models: Beyond Trick Tests and Toward RUTEd Evaluation

    Bias benchmarks are a popular method for studying the negative impacts of bias in LLMs, yet there has been little empirical investigation of whether these benchmarks are actually indicative of how harm may manifest in the real world. In this work, we study the correspondence between such decontextualized "trick tests" and evaluations that are more grounded in Realistic Use and Tangible Effects (i.e., RUTEd evaluations). We explore this correspondence in the context of gender-occupation bias, a popular genre of bias evaluation. We compare three de-contextualized evaluations adapted from the current literature to three analogous RUTEd evaluations applied to long-form content generation. We conduct each evaluation for seven instruction-tuned LLMs. For the RUTEd evaluations, we conduct repeated trials of three text generation tasks: children's bedtime stories, user personas, and English language learning exercises. We found no correspondence between trick tests and RUTEd evaluations. Specifically, selecting the least biased model based on the de-contextualized results coincides with selecting the model with the best performance on the RUTEd evaluations only as often as random chance. We conclude that evaluations not grounded in realistic use are likely insufficient to assess and mitigate bias and real-world harms.
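    The headline correspondence check can be illustrated with a toy sketch: for a set of models scored under both evaluation styles, ask whether the least-biased pick agrees and how the rankings correlate. The score arrays below are placeholders, not the paper's measurements; chance-level agreement for seven models is 1/7.

    ```python
    import numpy as np
    from scipy.stats import spearmanr

    # Placeholder bias scores for seven models under each evaluation style (lower = less biased).
    trick_scores = np.array([0.12, 0.30, 0.05, 0.22, 0.41, 0.18, 0.27])
    ruted_scores = np.array([0.33, 0.10, 0.29, 0.15, 0.08, 0.36, 0.21])

    same_pick = int(np.argmin(trick_scores)) == int(np.argmin(ruted_scores))
    rho, p = spearmanr(trick_scores, ruted_scores)

    print("least-biased model agrees:", same_pick)
    print(f"Spearman rho = {rho:+.2f} (p = {p:.2f})")
    # With no real correspondence, the picks agree about 1/7 of the time by chance.
    ```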