
Cross-language tweet classification using Bing translator

Abstract

Master of Science
Department of Computing and Information Sciences
Doina Caragea

Social media affects our daily lives and is one of the first places people look for breaking news. Twitter in particular is one of the most popular social media platforms, with around 330 million monthly users. From local events such as Fake Patty's Day to events across the world, Twitter often gets there first. During a disaster, tweets can be used to post warnings, updates, and the status of available medical supplies, food, and emergency personnel. Users were tweeting about Hurricane Sandy even though network access was limited during the storm. Analysis of these tweets can help monitor the disaster, plan and manage the crisis response, and aid research. In this research, we use publicly available tweets posted during several disasters and identify the relevant ones. Because the datasets contain tweets in different languages, the Bing translation API is used to detect the language of each tweet and translate it. The translations are then used as training datasets for supervised machine learning algorithms. Supervised learning is the process of learning a classifier from a labeled training dataset. The learned classifier can then be used to predict the output for any valid input. When trained on more observations, the algorithm generally improves its predictive performance.
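The supervised learning step described above can be illustrated with a minimal sketch. It assumes the tweets have already been translated into English, and it uses a small bag-of-words Naive Bayes classifier written from scratch; the abstract does not say which algorithms were used, so the choice of Naive Bayes, the class name `NaiveBayesTweetClassifier`, the labels `relevant` and `irrelevant`, and the example tweets are all illustrative assumptions rather than the thesis's method or data.

```python
import math
from collections import Counter, defaultdict

PUNCTUATION = ".,!?:;#@\"'()"


def tokenize(text):
    """Lowercase the text, split on whitespace, and strip edge punctuation."""
    tokens = (raw.strip(PUNCTUATION).lower() for raw in text.split())
    return [t for t in tokens if t]


class NaiveBayesTweetClassifier:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of tweets per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_tweets = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_tweets in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_tweets / total_tweets)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up, already-translated example tweets (not taken from the thesis datasets).
train_texts = [
    "Flood warning issued, roads closed near the river",
    "Shelter open downtown with food and medical supplies",
    "Power outage across the city after the storm",
    "Emergency crews rescuing people from flooded homes",
    "Watching a movie tonight with friends",
    "Best pizza in town, love this place",
    "Happy birthday to my sister",
    "New phone arrived today, so excited",
]
train_labels = ["relevant"] * 4 + ["irrelevant"] * 4

model = NaiveBayesTweetClassifier().fit(train_texts, train_labels)
print(model.predict("Storm flooded the roads, emergency shelter open"))
```

In a cross-language setting, the training tweets and the tweets to classify would first pass through the translation step, so that both share a single vocabulary. Adding more labeled tweets to `train_texts` gives the per-class word statistics more evidence, which is the sense in which more observations tend to improve predictions.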
