2 research outputs found
Hitachi at SemEval-2020 Task 12: Offensive Language Identification with Noisy Labels using Statistical Sampling and Post-Processing
In this paper, we present our participation in SemEval-2020 Task-12 Subtask-A
(English Language) which focuses on offensive language identification from
noisy labels. To this end, we developed a hybrid system: a BERT classifier
trained on tweets selected using a Statistical Sampling Algorithm (SA), with
Post-Processing (PP) using an offensive wordlist. Our system ranked 34th
with a Macro-averaged F1-score (Macro-F1) of 0.90913 over
both offensive and non-offensive classes. We further show comprehensive results
and error analysis to assist future research in offensive language
identification with noisy labels.
Comment: preprint v1, Under submission for SemEval 2020 Workshop
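For readers unfamiliar with the metric, Macro-F1 over the two classes is the unweighted mean of the per-class F1 scores. The following is a minimal sketch of that computation, not the authors' evaluation code; the label strings "OFF" and "NOT" and the toy predictions are assumptions made for illustration.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=("OFF", "NOT")):
    """Unweighted mean of per-class F1 (label names are illustrative)."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy data, invented for this example only.
y_true = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
y_pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "OFF"]
score = macro_f1(y_true, y_pred)
print(score)
```

Because each class contributes equally regardless of its frequency, this average penalizes a system that does well only on the majority class.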
Investigating the Effect of Intraclass Variability in Temporal Ensembling
Temporal Ensembling is a semi-supervised approach that allows training deep
neural network models with a small number of labeled images. In this paper, we
present our preliminary study on the effect of intraclass variability on
temporal ensembling, with a focus on seed size and seed type.
Through our experiments we find that (a) there is a significant drop in
accuracy with datasets that offer high intraclass variability, (b) more seed
images offer consistently higher accuracy across the datasets, and (c) seed
type does affect the overall efficiency, producing a spectrum of
accuracies, both lower and higher. Additionally, based on our
experiments, we also find KMNIST to be a competitive baseline for temporal
ensembling.
Comment: Preliminary Results; More Experiments to be added
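As background, the core of Temporal Ensembling in its original formulation (Laine and Aila) is an exponential moving average of each sample's predictions across epochs, with a startup bias correction; the corrected average serves as the consistency target for unlabeled samples. The sketch below shows only that update under stated assumptions: the function name, the plain-list data layout, and the momentum value 0.6 are illustrative and not taken from the abstract above.

```python
def update_ensemble(Z, z, alpha, epoch):
    """One epoch of the ensemble-prediction update.

    Z     : accumulated predictions, one row of class scores per sample
    z     : the network's predictions for the same samples this epoch
    alpha : momentum of the moving average (illustrative value below)
    epoch : 1-indexed epoch number, used for bias correction
    """
    # Exponential moving average: Z <- alpha * Z + (1 - alpha) * z
    Z_new = [
        [alpha * old + (1 - alpha) * new for old, new in zip(Z_row, z_row)]
        for Z_row, z_row in zip(Z, z)
    ]
    # Z starts at zero, so divide by (1 - alpha^t) to remove the startup bias.
    correction = 1 - alpha ** epoch
    targets = [[v / correction for v in row] for row in Z_new]
    return Z_new, targets


# One sample, two classes; the predictions are invented for this example.
Z = [[0.0, 0.0]]
Z, targets_1 = update_ensemble(Z, [[0.8, 0.2]], alpha=0.6, epoch=1)
Z, targets_2 = update_ensemble(Z, [[0.6, 0.4]], alpha=0.6, epoch=2)
print(targets_1, targets_2)
```

After the first epoch the corrected target equals that epoch's prediction; later epochs blend in the history, which smooths out noisy per-epoch outputs.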