SkinDistilViT: Lightweight Vision Transformer for Skin Lesion Classification
Skin cancer is a treatable disease if discovered early. We provide a
production-specific solution to the skin cancer classification problem that
matches human performance in melanoma identification by training a vision
transformer on melanoma medical images annotated by experts. Since inference
cost, both time- and memory-wise, is important in practice, we employ knowledge
distillation to obtain a model that retains 98.33% of the teacher's balanced
multi-class accuracy, at a fraction of the cost. Memory-wise, our model is
49.60% smaller than the teacher. Time-wise, our solution is 69.25% faster on
GPU and 97.96% faster on CPU. By adding classification heads at each level of
the transformer and employing a cascading distillation process, we improve the
balanced multi-class accuracy of the base model by 2.1%, while creating a range
of models of various sizes but comparable performance. We provide the code at
https://github.com/Longman-Stan/SkinDistilVit.
Comment: Accepted at ICANN 202
UPB @ DANKMEMES: Italian Memes Analysis - Employing Visual Models and Graph Convolutional Networks for Meme Identification and Hate Speech Detection
Certain events or political situations prompt online users to express themselves through different modalities. One of these is the Internet meme, which combines text with a representative image to convey a wide range of emotions, from humor to sarcasm and even hate. In this paper, we describe our approach for the DANKMEMES competition at EVALITA 2020, which consists of a multimodal multi-task learning architecture built on two main components. The first is a Graph Convolutional Network combined with an Italian BERT for text encoding, while the second varies among different image-based architectures (i.e., ResNet50, ResNet152, and VGG-16) for image representation. Our solution achieves good performance on the first two tasks of the competition, ranking 3rd for both Task 1 (0.8437 macro-F1 score) and Task 2 (0.8169 macro-F1 score), while exceeding the official baselines by wide margins.
EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020
Welcome to EVALITA 2020! EVALITA is the evaluation campaign of Natural Language Processing and Speech Tools for Italian. EVALITA is an initiative of the Italian Association for Computational Linguistics (AILC, http://www.ai-lc.it) and is endorsed by the Italian Association for Artificial Intelligence (AIxIA, http://www.aixia.it) and the Italian Association for Speech Sciences (AISV, http://www.aisv.it).
TA-DA: Topic-Aware Domain Adaptation for Scientific Keyphrase Identification and Classification (Student Abstract)
Keyphrase identification and classification is a Natural Language Processing and Information Retrieval task that involves extracting relevant groups of words related to the main topic of a given text. In this work, we focus on extracting keyphrases from scientific documents. We introduce TA-DA, a Topic-Aware Domain Adaptation framework for keyphrase extraction that integrates Multi-Task Learning with Adversarial Training and Domain Adaptation. Our approach improves performance over baseline models by up to 5% in exact-match F1-score.