625 research outputs found

    Deobfuscating Leetspeak With Deep Learning to Improve Spam Filtering

    The evolution of anti-spam filters has forced spammers to make greater efforts to bypass filters in order to distribute content over networks. Distributing content encoded in images and using Leetspeak are clear examples of techniques currently used to bypass filters. Despite the importance of these problems, few studies address them, and the reported performance is very limited. This study reviews the work done so far on Leetspeak deobfuscation, which remains very rudimentary, and proposes a new technique that uses neural networks for decoding. In addition, we distribute an image database specifically created for training Leetspeak decoding models. We have also created and made available four different corpora to analyse the performance of Leetspeak decoding schemes. Using these corpora, we have experimentally evaluated our neural network approach for decoding Leetspeak. The results show the usefulness of the proposed model for deobfuscating Leetspeak character sequences.

    Predictive Analysis on Twitter: Techniques and Applications

    Predictive analysis of social media data has attracted considerable attention from the research community as well as the business world because of the essential and actionable information it can provide. Over the years, extensive experimentation and analysis for insights have been carried out using Twitter data in various domains such as healthcare, public health, politics, social sciences, and demographics. In this chapter, we discuss techniques, approaches and state-of-the-art applications of predictive analysis of Twitter data. Specifically, we present fine-grained analysis involving aspects such as sentiment, emotion, and the use of domain knowledge in the coarse-grained analysis of Twitter data for making decisions and taking actions, and relate a few success stories.

    Smart: SPAM detection in comments posted on the EkitApp social network

    This Bachelor's thesis (Trabajo de Fin de Grado) develops the Smart system, which provides a real-time filter for spam comments. Smart integrates successfully with EkitApp, a social network for university events that grew out of an entrepreneurship idea in collaboration with Julen Miner. The solution combines some of the most effective techniques found to date and offers a practical, end-to-end view of what it means to build a product in the Data Science sector, from data collection to deployment, making it usable from anywhere in the world.

    Social software for music

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia. Universidade do Porto. 200

    Detecting and Monitoring Hate Speech in Twitter

    Social Media are sensors in the real world that can be used to measure the pulse of societies. However, the massive and unfiltered feed of messages posted in social media is a phenomenon that nowadays raises social alarms, especially when these messages contain hate speech targeted at a specific individual or group. In this context, governments and non-governmental organizations (NGOs) are concerned about the possible negative impact that these messages can have on individuals or on society. In this paper, we present HaterNet, an intelligent system currently being used by the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security that identifies and monitors the evolution of hate speech in Twitter. The contributions of this research are manifold: (1) It introduces the first intelligent system that monitors and visualizes, using social network analysis techniques, hate speech in Social Media. (2) It introduces a novel public dataset on hate speech in Spanish consisting of 6000 expert-labeled tweets. (3) It compares several classification approaches based on different document representation strategies and text classification models. (4) The best approach consists of a combination of an LSTM+MLP neural network that takes as input the tweet’s word, emoji, and expression tokens’ embeddings enriched by the tf-idf, and obtains an area under the curve (AUC) of 0.828 on our dataset, outperforming previous methods presented in the literature. The work by Quijano-Sanchez was supported by the Spanish Ministry of Science and Innovation grant FJCI-2016-28855. The research of Liberatore was supported by the Government of Spain, grant MTM2015-65803-R, and by the European Union’s Horizon 2020 Research and Innovation Programme, under the Marie Sklodowska-Curie grant agreement No. 691161 (GEOSAFE). All the financial support is gratefully acknowledged.