2 research outputs found

    Hitachi at SemEval-2020 Task 12: Offensive Language Identification with Noisy Labels using Statistical Sampling and Post-Processing

    In this paper, we present our participation in SemEval-2020 Task-12 Subtask-A (English Language), which focuses on offensive language identification from noisy labels. To this end, we developed a hybrid system with a BERT classifier trained on tweets selected using a Statistical Sampling Algorithm (SA) and Post-Processed (PP) using an offensive wordlist. Our system achieved 34th position with a Macro-averaged F1-score (Macro-F1) of 0.90913 over both offensive and non-offensive classes. We further show comprehensive results and error analysis to assist future research in offensive language identification with noisy labels. Comment: preprint v1, under submission for SemEval 2020 Workshop

    Investigating the Effect of Intraclass Variability in Temporal Ensembling

    Temporal Ensembling is a semi-supervised approach that allows training deep neural network models with a small number of labeled images. In this paper, we present our preliminary study on the effect of intraclass variability on temporal ensembling, with a focus on seed size and seed type. Through our experiments we find that (a) there is a significant drop in accuracy on datasets with high intraclass variability, (b) more seed images give consistently higher accuracy across the datasets, and (c) seed type does affect overall performance, producing a spectrum of accuracies both lower and higher. Additionally, based on our experiments, we find KMNIST to be a competitive baseline for temporal ensembling. Comment: Preliminary results; more experiments to be added