Characterizing Geo-located Tweets in Brazilian Megacities

Authors: Christina Gagnon, Elizabeth Ottoni, Luc DesGroseillers, Rémy Beaujois, Sami HSine, Stéphanie Mollet, Wildriss Viranaicken, Xin Zhang
Publication date: 01/01/2017

This work presents a framework for collecting, processing and mining geo-located tweets in order to extract meaningful and actionable knowledge in the context of smart cities. We collected and characterized more than 9M tweets from the two biggest cities in Brazil, Rio de Janeiro and São Paulo. We performed topic modeling using the Latent Dirichlet Allocation model to produce an unsupervised distribution of semantic topics over the stream of geo-located tweets as well as a distribution of words over those topics. We manually labeled and aggregated similar topics obtaining a total of 29 different topics across both cities. Results showed similarities in the majority of topics for both cities, reflecting similar interests and concerns among the population of Rio de Janeiro and São Paulo. Nevertheless, some specific topics are more predominant in one of the cities.
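As a rough illustration of the topic-modeling step described in this abstract, the sketch below fits Latent Dirichlet Allocation to a handful of tokenized texts and returns a topic distribution per document and a word distribution per topic. It is not the authors' pipeline: the collapsed Gibbs sampler, the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions made for this example.

```python
import random
from collections import Counter

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents (illustrative)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start from a random topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed estimates: topics per document (theta), words per topic (phi).
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        {w: (topic_word[k][w] + beta) / (topic_total[k] + vocab_size * beta)
         for w in vocab}
        for k in range(num_topics)
    ]
    return theta, phi

# Hypothetical toy "tweets", already tokenized; not data from the paper.
toy_docs = [
    ["traffic", "bus", "delay", "traffic"],
    ["beach", "sun", "weekend", "beach"],
    ["bus", "delay", "strike", "traffic"],
    ["sun", "beach", "music", "weekend"],
]
theta, phi = lda_gibbs(toy_docs, num_topics=2, iterations=50)
```

Each row of `theta` sums to 1 and gives one document's topic mixture; each entry of `phi` maps every vocabulary word to its probability under one topic. Labeling and merging similar topics, as the abstract describes, would remain a manual step on top of such output.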

arXiv.org e-Print Archive

Crossref

FigShare

Characterizing Geo-located Tweets in Brazilian Megacities

Authors: Nélio Cacho, Arian Pasquali, João Pereira, Rosaldo Rossetti, Pedro Saleiro
Publication date: 06/09/2017

arXiv.org e-Print Archive

Crossref

POISED: Spotting Twitter Spam Off the Beaten Paths

Authors: Jose Fernandez, Christopher Kruegel, Francois Labreche, Shirin Nilizadeh, Alireza Sedighian, Gianluca Stringhini, Giovanni Vigna, Ali Zand
Publication date: 01/01/2017

Cybercriminals have found in online social networks a propitious medium to spread spam and malicious content. Existing techniques for detecting spam include predicting the trustworthiness of accounts and analyzing the content of these messages. However, advanced attackers can still successfully evade these defenses. Online social networks bring people who have personal connections or share common interests to form communities. In this paper, we first show that users within a networked community share some topics of interest. Moreover, content shared on these social networks tends to propagate according to the interests of people. Dissemination paths may emerge where some communities post similar messages, based on the interests of those communities. Spam and other malicious content, on the other hand, follow different spreading patterns. In this paper, we follow this insight and present POISED, a system that leverages the differences in propagation between benign and malicious messages on social networks to identify spam and other unwanted content. We test our system on a dataset of 1.3M tweets collected from 64K users, and we show that our approach is effective in detecting malicious messages, reaching 91% precision and 93% recall. We also show that POISED's detection is more comprehensive than previous systems, by comparing it to three state-of-the-art spam detection systems that have been proposed by the research community in the past. POISED significantly outperforms each of these systems. Moreover, through simulations, we show how POISED is effective in the early detection of spam messages and how it is resilient against two well-known adversarial machine learning attacks.
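For readers unfamiliar with the two metrics quoted in this abstract, the sketch below shows how precision and recall are computed from detection counts. The function name `precision_recall` and the example counts are hypothetical and are not taken from the POISED evaluation.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) for a binary detector such as a spam filter."""
    flagged = true_pos + false_pos       # messages the detector marked as spam
    actual_spam = true_pos + false_neg   # messages that really are spam
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual_spam if actual_spam else 0.0
    return precision, recall

# Hypothetical counts for illustration only.
precision, recall = precision_recall(true_pos=8, false_pos=2, false_neg=2)
```

Precision answers how many flagged messages were truly spam, while recall answers how much of the spam was caught, so a detector reporting both figures above 90% is rarely wrong when it flags a message and misses little.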

arXiv.org e-Print Archive