    On the differences between BERT and MT encoder spaces and how to address them in translation tasks

    Various studies show that pretrained language models such as BERT cannot straightforwardly replace encoders in neural machine translation, despite their enormous success in other tasks. This is all the more surprising given the similarities between the two architectures. This paper sheds light on the embedding spaces they create, comparing them with average cosine similarity, contextuality metrics, and measures of representational similarity, and revealing that BERT and NMT encoder representations look significantly different from one another. To address this issue, we propose a supervised transformation from one space into the other, using explicit alignment and fine-tuning. Our results demonstrate the need for such a transformation to improve the applicability of BERT in MT.
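    A minimal sketch of two ingredients named in the abstract: average cosine similarity between row-aligned representations, and a least-squares linear map as one simple form of explicit alignment between embedding spaces. The paper's actual alignment and fine-tuning procedure is more involved; the array names, shapes, and random data here are illustrative assumptions.

```python
import numpy as np

def avg_cosine_similarity(A, B):
    """Mean cosine similarity between row-aligned representations A and B."""
    A_n = A / np.linalg.norm(A, axis=1, keepdims=True)
    B_n = B / np.linalg.norm(B, axis=1, keepdims=True)
    return float(np.mean(np.sum(A_n * B_n, axis=1)))

def fit_linear_map(src, tgt):
    """Least-squares W minimizing ||src @ W - tgt||: a simple explicit
    alignment from one embedding space into another."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W

# Toy stand-ins for token representations from BERT and an NMT encoder.
rng = np.random.default_rng(0)
bert_vecs = rng.normal(size=(1000, 768))
nmt_vecs = rng.normal(size=(1000, 512))

W = fit_linear_map(bert_vecs, nmt_vecs)
print(avg_cosine_similarity(bert_vecs @ W, nmt_vecs))
```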

    Tracking the Traces of Passivization and Negation in Contextualized Representations

    Contextualized word representations encode rich information about syntax and semantics, alongside specificities of each context of use. While contextual variation does not always reflect actual meaning shifts, it can still reduce the similarity of embeddings for word instances that have the same meaning. We explore the imprint of two specific linguistic alternations, namely passivization and negation, on the representations generated by neural models trained with two different objectives: masked language modeling and translation. Our exploration methodology is inspired by an approach previously proposed for removing societal biases from word vectors. We show that passivization and negation leave traces on the representations, and that neutralizing this information leads to more similar embeddings for words that should preserve their meaning across the transformation. We also find clear differences in how the respective features generalize across datasets.
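    A minimal sketch of the kind of neutralization described: estimate a "passivization direction" as the difference of mean embeddings between active and passive instances, then project that direction out of each vector. This mirrors the hard-debiasing idea the abstract alludes to; the variable names and random data are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def feature_direction(active_vecs, passive_vecs):
    """Unit vector pointing from the active to the passive mean embedding."""
    d = passive_vecs.mean(axis=0) - active_vecs.mean(axis=0)
    return d / np.linalg.norm(d)

def neutralize(vecs, direction):
    """Remove the component of each embedding along `direction`."""
    return vecs - np.outer(vecs @ direction, direction)

rng = np.random.default_rng(0)
active = rng.normal(size=(200, 768))
passive = rng.normal(loc=0.1, size=(200, 768))

d = feature_direction(active, passive)
neutral = neutralize(passive, d)
print(abs(neutral @ d).max())  # ~0: the feature direction has been removed
```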

    Learning and Using Context on a Humanoid Robot Using Latent Dirichlet Allocation

    2014 Joint IEEE International Conferences on Development and Learning and Epigenetic Robotics (ICDL-EpiRob), Genoa, Italy, 13-16 October 2014.
    In this work, we model context in terms of a set of concepts grounded in a robot's sensorimotor interactions with the environment. To this end, we treat context as a latent variable in Latent Dirichlet Allocation, which is widely used in computational linguistics for modeling topics in texts. The flexibility of our approach allows many-to-many relationships between objects and contexts, as well as between scenes and contexts. We use a concept-web representation of the robot's perceptions as a basis for context analysis. The detected contexts of a scene can be used for several cognitive problems; our results demonstrate that the robot can use learned contexts to improve object recognition and planning.
    Funded by the Scientific and Technological Research Council of Turkey (TÜBİTAK).
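    A minimal sketch of the core idea: treat each scene as a "document" whose "words" are concept activations from the robot's concept web, and let LDA infer latent contexts. The concept names and counts below are invented for illustration; the paper grounds them in real sensorimotor data.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

concepts = ["cup", "graspable", "table", "rollable", "ball", "edge"]
# Rows: scenes; columns: how often each concept fired in that scene.
scene_concept_counts = np.array([
    [3, 2, 4, 0, 0, 1],   # kitchen-like scene
    [0, 1, 0, 3, 4, 0],   # play-like scene
    [2, 3, 3, 0, 1, 1],
    [0, 0, 1, 4, 3, 0],
])

lda = LatentDirichletAllocation(n_components=2, random_state=0)
scene_contexts = lda.fit_transform(scene_concept_counts)
print(scene_contexts)  # per-scene mixture over latent contexts
```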

    Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging

    This paper introduces Bayesian uncertainty modeling using Stochastic Weight Averaging-Gaussian (SWAG) in Natural Language Understanding (NLU) tasks. We apply the approach to standard tasks in natural language inference (NLI) and demonstrate the effectiveness of the method in terms of prediction accuracy and correlation with human annotation disagreements. We argue that the uncertainty representations in SWAG better reflect subjective interpretation and the natural variation that is also present in human language understanding. The results reveal the importance of uncertainty modeling, an often neglected aspect of neural language modeling, in NLU tasks.
    Comment: NoDaLiDa 2023 camera-ready.
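    A minimal sketch of SWAG-diagonal: collect weight snapshots along an SGD trajectory, maintain running first and second moments, then sample weight vectors from the fitted Gaussian at test time and average the resulting predictions. The snapshot source and dimensions are placeholders (assumptions); the full method also keeps a low-rank covariance term.

```python
import numpy as np

class SwagDiagonal:
    def __init__(self, dim):
        self.mean = np.zeros(dim)
        self.sq_mean = np.zeros(dim)
        self.n = 0

    def collect(self, weights):
        """Update running first and second moments with one snapshot."""
        self.n += 1
        self.mean += (weights - self.mean) / self.n
        self.sq_mean += (weights**2 - self.sq_mean) / self.n

    def sample(self, rng):
        """Draw one weight vector from the fitted diagonal Gaussian."""
        var = np.clip(self.sq_mean - self.mean**2, 1e-12, None)
        return self.mean + np.sqrt(var) * rng.normal(size=self.mean.shape)

rng = np.random.default_rng(0)
swag = SwagDiagonal(dim=10)
for step in range(50):                      # stand-in for SGD snapshots
    swag.collect(rng.normal(loc=1.0, scale=0.1, size=10))

samples = [swag.sample(rng) for _ in range(30)]
# In practice: load each sampled weight vector into the model, predict,
# and average the softmax outputs for an uncertainty-aware prediction.
print(np.mean(samples, axis=0))
```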

    Learning Context on a Humanoid Robot using Incremental Latent Dirichlet Allocation

    In this article, we formalize and model context in terms of a set of concepts grounded in the sensorimotor interactions of a robot. The concepts are modeled as a web using a Markov Random Field, inspired by the concept web hypothesis for representing concepts in humans. On this concept web, we treat context as a latent variable of Latent Dirichlet Allocation (LDA), a widely used method in computational linguistics for modeling topics in texts. We extend the standard LDA method to make it incremental, so that (i) it does not re-learn everything from scratch given new interactions (i.e., it is online) and (ii) it can discover and add a new context to its model when necessary. We demonstrate on the iCub platform that, partly owing to modeling context on top of the concept web, our approach is adaptive, online and robust: it is adaptive and online since it can learn and discover a new context from new interactions, and it is robust since it is not affected by irrelevant stimuli and can discover contexts after only a few interactions. Moreover, we show how to use the context learned in such a model for two important tasks: object recognition and planning.
    Funded by the Scientific and Technological Research Council of Turkey and a Marie Curie International Outgoing Fellowship titled "Towards Better Robot Manipulation: Improvement through Interaction".
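    A minimal sketch of the incremental aspect using scikit-learn's online variational LDA (partial_fit), which updates topics from mini-batches without retraining from scratch. This is a generic stand-in: the paper's extension can also grow the number of contexts on the fly, which stock LDA implementations do not do. The data below are invented counts.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

lda = LatentDirichletAllocation(n_components=3, learning_method="online",
                                random_state=0)

rng = np.random.default_rng(0)
for batch in range(5):
    # Each batch: new scenes (rows) x concept activation counts (columns).
    new_scenes = rng.poisson(lam=2.0, size=(8, 6))
    lda.partial_fit(new_scenes)            # incremental update, no restart

print(lda.transform(rng.poisson(lam=2.0, size=(1, 6))))  # context mixture
```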

    Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

    In this paper we introduce a new natural language processing dataset and benchmark for predicting prosodic prominence from written text. To our knowledge, this is the largest publicly available dataset with prosodic labels. We describe the dataset construction and the resulting benchmark in detail, and train a number of different models, ranging from feature-based classifiers to neural network systems, for the prediction of discretized prosodic prominence. We show that pre-trained contextualized word representations from BERT outperform the other models even with less than 10% of the training data. Finally, we discuss the dataset in light of the results and point to future research and plans for further improving both the dataset and the methods of predicting prosodic prominence from text. The dataset and the code for the models are publicly available.
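    A minimal sketch of the strongest model family in the abstract: BERT used as a token classifier over discretized prominence labels. The label count, model name, and toy inputs are assumptions for illustration; the actual benchmark would fine-tune on the released dataset.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)   # e.g. non-prominent / weak / strong

words = ["tomorrow", "we", "leave", "for", "Helsinki"]
enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**enc).logits          # shape: (1, seq_len, num_labels)
pred = logits.argmax(dim=-1)
# Map subword predictions back to words via enc.word_ids() before scoring.
print(pred)
```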

    Decoding Emotional Valence from Electroencephalographic Rhythmic Activity

    We attempt to decode emotional valence from electroencephalographic (EEG) rhythmic activity in a naturalistic setting. We employ a data-driven method developed in a previous study, Spectral Linear Discriminant Analysis, to discover the relationships between the classification task and independent neuronal sources, optimally utilizing multiple frequency bands. A detailed investigation of the classifier provides insight into the neuronal sources related to emotional valence and into individual differences in how subjects process emotions. Our findings show that: (1) sources whose locations are similar across subjects are consistently involved in emotional responses, with the involvement of parietal sources being especially significant, and (2) even though the locations of the involved neuronal sources are consistent, subjects can display highly varying degrees of valence-related EEG activity in those sources.
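    A minimal generic stand-in for the kind of pipeline described: band-power features per EEG channel and frequency band, classified with linear discriminant analysis. The paper's Spectral Linear Discriminant Analysis couples band weighting with the discriminant itself; this sketch, with invented data and an assumed sampling rate, shows only the conventional two-stage version.

```python
import numpy as np
from scipy.signal import welch
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

FS = 250                                  # sampling rate in Hz (assumed)
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_powers(epoch):
    """Mean PSD per band per channel for one (channels x samples) epoch."""
    freqs, psd = welch(epoch, fs=FS, axis=-1)
    feats = [psd[:, (freqs >= lo) & (freqs < hi)].mean(axis=-1)
             for lo, hi in BANDS.values()]
    return np.concatenate(feats)

rng = np.random.default_rng(0)
epochs = rng.normal(size=(60, 32, FS * 2))     # 60 trials, 32 channels, 2 s
labels = rng.integers(0, 2, size=60)           # 0 = negative, 1 = positive

X = np.stack([band_powers(e) for e in epochs])
clf = LinearDiscriminantAnalysis().fit(X, labels)
print(clf.score(X, labels))                    # training accuracy only
```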