Multilingual Text Classification from Twitter during Emergencies

Abstract

Social media such as Twitter are a valuable source of information due to their diffusion among citizens and to their speed in sharing data worldwide. However, it is challenging to automatically extract information from such data, given the huge amount of useless content. We propose a multilingual tool that automatically categorizes tweets according to their information content. To achieve real-time classification while supporting any language, we apply a deep learning classifier, using multilingual word embeddings. This allows our solution to be trained on one language and to apply it to any other language via zero-shot inference achieving acceptable performance loss

    Similar works