2,521 research outputs found

    Gender Bias in Artificial Intelligence: Exploring the impacts of stereotypes on English-Vietnamese Machine Translation

    Artificial Intelligence (AI) increasingly influences people's opinions and behavior in daily life. Gender bias in AI, especially in machine translation, has been a growing concern, since the over-representation of men in the design of these technologies may gradually undermine decades of progress toward gender equality. Google Translate, the most widely used translation tool, has shown a strong tendency toward male defaults when translating from gender-neutral languages into English, particularly for fields associated with unbalanced gender distributions or stereotypes, such as STEM (Science, Technology, Engineering and Mathematics) jobs, as well as for personality traits and looks. The main goal of my project is to investigate gender bias in state-of-the-art models for translating between a language with gender-neutral pronouns (Vietnamese) and English. I am developing a collection of test sentences to probe a translation model for gender bias. First, I will use this test set to evaluate Google Translate and a pre-trained machine translation model (Helsinki-NLP). Then, I will use a state-of-the-art neural network approach to train my own English-Vietnamese translation model (PhoBERT) on a standard translation dataset and evaluate it with my test sentences. I will assess the gender bias in the translation dataset and apply a bias mitigation technique that adds, for each sentence containing gendered words, a copy in which the gender has been swapped from male to female and vice versa. I will then retrain my model and evaluate it again. Possible comparisons are between Google Translate, Helsinki-NLP, and the two versions of the PhoBERT model before and after gender swapping, to see whether gender bias appears in the English-Vietnamese translation task. The results of this project aim to demonstrate the need to augment current statistical translation tools with debiasing techniques. There is also a need to look further into using a larger dataset with fewer stereotypes, which can be hard to achieve since a language dataset always reflects its country's social context.
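    The gender-swap augmentation described in the abstract can be sketched as follows. This is a minimal illustration, not the project's actual method: the word list and the pairing (e.g. mapping both possessive and object "her" back to a single male form) are simplifying assumptions, and real implementations need part-of-speech information to resolve such ambiguities.

    ```python
    # Illustrative, deliberately small dictionary of gendered word pairs.
    GENDER_PAIRS = {
        "he": "she", "she": "he",
        "him": "her", "her": "him",  # "her" is ambiguous (object vs. possessive)
        "his": "her",
        "man": "woman", "woman": "man",
        "men": "women", "women": "men",
    }

    def swap_gender(sentence: str) -> str:
        """Return the sentence with each gendered word swapped, preserving case."""
        swapped = []
        for token in sentence.split():
            # Separate trailing punctuation so "him." still matches "him".
            core = token.rstrip(".,!?;:")
            suffix = token[len(core):]
            replacement = GENDER_PAIRS.get(core.lower())
            if replacement is None:
                swapped.append(token)
            else:
                if core[0].isupper():
                    replacement = replacement.capitalize()
                swapped.append(replacement + suffix)
        return " ".join(swapped)

    def augment(corpus: list[str]) -> list[str]:
        """Add a gender-swapped copy of each sentence that contains a gendered word."""
        augmented = list(corpus)
        for sentence in corpus:
            flipped = swap_gender(sentence)
            if flipped != sentence:
                augmented.append(flipped)
        return augmented
    ```

    Sentences without gendered words are left alone, so the augmented corpus contains each original sentence plus swapped copies only where a swap actually changed something.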

    Contextual Parameter Generation for Universal Neural Machine Translation

    We propose a simple modification to existing neural machine translation (NMT) models that enables a single universal model to translate between multiple languages while allowing for language-specific parameterization, and that can also be used for domain adaptation. Our approach requires no changes to the model architecture of a standard NMT system; instead it introduces a new component, the contextual parameter generator (CPG), that generates the parameters of the system (e.g., the weights of a neural network). This parameter generator accepts source and target language embeddings as input and generates the parameters of the encoder and the decoder, respectively. The rest of the model remains unchanged and is shared across all languages. We show how this simple modification enables the system to use monolingual data for training and to perform zero-shot translation. We further show that it surpasses state-of-the-art performance on both the IWSLT-15 and IWSLT-17 datasets and that the learned language embeddings uncover interesting relationships between languages.
    Comment: Published in the proceedings of Empirical Methods in Natural Language Processing (EMNLP), 201
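    The core CPG idea — mapping a language embedding to the flat parameter vector of a model component — can be sketched as below. All sizes, names, and the single-matrix "encoder" are illustrative assumptions for exposition, not the paper's actual architecture; in the real system both the generator and the language embeddings are trained jointly with the translation objective.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    lang_embed_dim = 8
    encoder_param_count = 16 * 16  # pretend the encoder is one 16x16 weight matrix

    # Trainable pieces: one embedding per language, plus the generator's own weights.
    lang_embeddings = {
        "en": rng.normal(size=lang_embed_dim),
        "vi": rng.normal(size=lang_embed_dim),
    }
    generator_weights = rng.normal(size=(encoder_param_count, lang_embed_dim))

    def generate_encoder_params(lang: str) -> np.ndarray:
        """CPG: map a language embedding to the encoder's parameter tensor."""
        flat = generator_weights @ lang_embeddings[lang]
        return flat.reshape(16, 16)

    # The same generator serves every language; only the embedding input changes,
    # so adding a language adds one small embedding, not a full model copy.
    en_encoder = generate_encoder_params("en")
    vi_encoder = generate_encoder_params("vi")
    ```

    In the paper's setup the source-language embedding parameterizes the encoder and the target-language embedding parameterizes the decoder, while the rest of the network is shared.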