
    Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features

    Satirical news is often treated as entertainment, but it can be deceptive and harmful. Despite the genre cues embedded in an article, not every reader recognizes them, and some take satirical news to be true news. We observe that satirical cues are often concentrated in certain paragraphs rather than spread across the whole document. Existing work considers only document-level features to detect satire, which can be limiting. We instead incorporate paragraph-level linguistic features into a neural network with an attention mechanism to unveil the satire. We investigate the difference between paragraph-level and document-level features and analyze them on a large satirical news dataset. The evaluation shows that the proposed model detects satirical news effectively and reveals which features are important at which level. (Comment: EMNLP 2017, 11 pages)
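    The attention-over-paragraphs idea the abstract describes can be sketched in a few lines. This is an illustrative toy only: a softmax-weighted pooling of hand-made paragraph feature vectors against a query vector, not the paper's actual neural encoder, and all names and numbers here are assumptions.

```python
import numpy as np

def attention_pool(paragraph_vecs, query):
    """Score each paragraph against a query vector and return the
    attention-weighted document representation plus the weights."""
    scores = paragraph_vecs @ query              # one scalar score per paragraph
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                     # softmax over paragraphs
    doc_vec = weights @ paragraph_vecs           # weighted sum of paragraph vectors
    return doc_vec, weights

# Toy example: 3 paragraphs, 4-dimensional (hypothetical) linguistic features.
paras = np.array([[0.1, 0.0, 0.2, 0.1],
                  [0.9, 0.8, 0.7, 0.9],   # the strongly "satirical" paragraph
                  [0.2, 0.1, 0.0, 0.2]])
query = np.ones(4)                         # stands in for a learned query
doc, w = attention_pool(paras, query)
```

    The attention weights make the model interpretable: inspecting `w` shows which paragraph the classifier leaned on, which is how paragraph-level cues can be surfaced.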

    An Emotional Analysis of False Information in Social Media and News Articles

    [EN] Fake news is risky because it is created to manipulate readers' opinions and beliefs. In this work, we compared the language of false news to that of real news from an emotional perspective, considering a set of false-information types (propaganda, hoax, clickbait, and satire) from social media and online news article sources. Our experiments showed that each type of false information has its own emotional pattern and that emotions play a key role in deceiving the reader. Based on this, we proposed an emotionally infused LSTM neural network model to detect false news.

    The work of the second author was partially funded by the Spanish MICINN under the research project MISMISFAKEnHATE on Misinformation and Miscommunication in social media: FAKEnews and HATE speech (PGC2018-096212B-C31).

    Ghanem, B. H. H., Rosso, P., & Rangel, F. (2020). An Emotional Analysis of False Information in Social Media and News Articles. ACM Transactions on Internet Technology, 20(2), 1-18. https://doi.org/10.1145/3381750
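    The "emotionally infused" part of such a model starts from lexicon-based emotion features extracted from the text, which are then fed alongside the learned representation. A minimal sketch of that feature-extraction step follows; the lexicon entries and emotion categories here are toy placeholders (a real system would use a resource such as NRC EmoLex or SenticNet), and the LSTM itself is omitted.

```python
from collections import Counter

EMOTIONS = ["anger", "fear", "joy", "sadness"]

# Toy word-to-emotion lexicon, for illustration only.
LEXICON = {
    "outrage": "anger", "furious": "anger",
    "terrifying": "fear", "panic": "fear",
    "wonderful": "joy", "celebrate": "joy",
    "tragic": "sadness", "grief": "sadness",
}

def emotion_vector(text):
    """Normalized count of emotion-bearing words per emotion category."""
    counts = Counter(LEXICON[w] for w in text.lower().split() if w in LEXICON)
    total = sum(counts.values()) or 1          # avoid division by zero
    return [counts[e] / total for e in EMOTIONS]

vec = emotion_vector("Furious readers panic over this tragic outrage")
```

    The resulting vector summarizes the emotional fingerprint of a text; differences in such fingerprints across propaganda, hoax, clickbait, and satire are what the paper's analysis rests on.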

    On the Detection of False Information: From Rumors to Fake News

    Thesis by compendium. [EN] In recent years, the development of social media and online news agencies has brought several challenges and threats to the Web. These threats have drawn the attention of the Natural Language Processing (NLP) research community, as they are polluting online social media platforms. One example of these threats is false information, in which false, inaccurate, or deceptive content is spread and shared by online users. False information is not limited to verifiable information; it also includes information used for harmful purposes. Moreover, one of the challenges researchers face is the massive number of users on social media platforms, where detecting false information spreaders is not an easy job. Previous work proposed to limit or study the detection of false information has focused on understanding the language of false information from a linguistic perspective. In the case of verifiable information, those approaches have been proposed in a monolingual setting, and detecting the sources or spreaders of false information in social media has hardly been investigated. In this thesis we study false information from several perspectives. First, since previous work studied false information in a monolingual setting, we study it in a cross-lingual one: we propose different cross-lingual approaches, compare them to a set of monolingual baselines, and provide systematic studies of the evaluation results for better understanding. Second, we noticed that the role of affective information had not been investigated in depth. The second part of our research therefore studies the role of affective information in false information and shows how the authors of false content use it to manipulate the reader; here we investigate several types of false information (propaganda, hoax, clickbait, rumor, and satire) to understand the correlation between affective information and each type. Last but not least, in an attempt to limit its spread, we address the problem of detecting false information spreaders in social media. In this research direction, we focus on exploiting several text-based features extracted from the online profile messages of those spreaders, and we study different feature sets with the potential to help discriminate false information spreaders from fact checkers.

    Ghanem, B. H. H. (2020). On the Detection of False Information: From Rumors to Fake News [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/158570
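    Text-based features aggregated over a user's timeline, as the last research direction describes, can be as simple as stylistic ratios. The features and thresholds below are illustrative assumptions, not the thesis's actual feature set.

```python
import re

def profile_features(tweets):
    """Aggregate simple stylistic features over a user's timeline:
    average tweet length, URL ratio, exclamation ratio, and the
    ratio of tweets containing an all-caps word."""
    n = len(tweets)
    urls = sum(bool(re.search(r"https?://", t)) for t in tweets)
    excl = sum("!" in t for t in tweets)
    caps = sum(any(w.isupper() and len(w) > 1 for w in t.split()) for t in tweets)
    avg_len = sum(len(t.split()) for t in tweets) / n
    return {"avg_len": avg_len, "url_ratio": urls / n,
            "excl_ratio": excl / n, "caps_ratio": caps / n}

feats = profile_features([
    "BREAKING: you won't believe this! http://example.com",
    "Share before they delete it!!!",
])
```

    A feature vector like this, computed per profile, is what a downstream classifier would use to separate likely spreaders from fact checkers.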

    A Multimodal Approach to Sarcasm Detection on Social Media

    In recent times, a major share of human communication takes place online, mainly because of the ease of communication on social networking sites (SNSs). Due to the variety and large number of users, SNSs have drawn the attention of the computer science (CS) community, particularly the affective computing (also known as emotional AI), information retrieval, natural language processing, and data mining groups. Researchers are trying to make computers understand the nuances of human communication, including sentiment and sarcasm. Detecting emotion or sentiment requires more insight into the communication than factual information retrieval does, and sarcasm detection is harder still than categorizing sentiment, because in sarcasm the meaning the user intends is the opposite of the literal meaning. Owing to this complex nature, it is often difficult even for humans to detect sarcasm without proper context. Nevertheless, people on social media succeed in detecting sarcasm despite interacting with strangers across the world. That motivates us to investigate how humans detect sarcasm on social media, where abundant context information is often unavailable and the users communicating with each other are rarely well acquainted. We conducted a qualitative study to examine the patterns of users conveying sarcasm on social media. Whereas most sarcasm detection systems work on a word-by-word basis, we focused on the holistic sentiment conveyed by the post. We argue that relying on word-level information limits a system's performance to the domain of its training dataset and might not transfer well to non-English languages. To make our system less dependent on text data, we proposed a multimodal approach to sarcasm detection, showing that images and reaction emoticons can serve as additional sources of hints about the sentiment of a post. Our research showed superior results for the multimodal approach compared to a unimodal one. Multimodal sarcasm detection systems such as the one presented in this research, with the inclusion of more modes or sources of data, might lead to still better sarcasm detection models.
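    One common way to combine modalities, late fusion, averages per-modality scores and can reward the sentiment clash between modes that often signals sarcasm. The weights and the contradiction bonus below are invented for illustration and are not the thesis's actual model.

```python
def fuse_modalities(text_score, image_score, emoji_score,
                    weights=(0.5, 0.3, 0.2)):
    """Late fusion of per-modality sarcasm scores (each in [0, 1]).
    A weighted average is augmented with a 'contradiction' bonus:
    sarcasm is often signalled by text and image disagreeing, so a
    large gap between the two raises the fused score. Weights and
    the 0.3 bonus factor are hypothetical."""
    base = sum(w * s for w, s in
               zip(weights, (text_score, image_score, emoji_score)))
    contradiction = abs(text_score - image_score)   # clash between modes
    return min(1.0, base + 0.3 * contradiction)

# Positive text over a negative image: a classic sarcastic clash.
clash = fuse_modalities(0.9, 0.1, 0.5)
# All modalities agree and are mildly negative: no clash, low score.
agree = fuse_modalities(0.2, 0.2, 0.2)
```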

    Figurative Language Detection using Deep Learning and Contextual Features

    The amount of data shared over the Internet today is gigantic. A big share of it comes from postings on social networking sites such as Twitter and Facebook; some also comes from online news sites such as CNN and The Onion. This type of data is well suited to analysis because it is personalized and specific, and for years researchers in academia and various industries have been analyzing it for purposes including product marketing, event monitoring, and trend analysis. The most common use of such analysis is to find out the public's sentiment about a certain topic or product, a field called sentiment analysis. The writers of such posts are under no obligation to stick to literal language; they are also free to use figurative language. Hence, online posts can be categorized into two kinds: literal and figurative. Literal posts contain words or sentences that are direct or straight to the point. Figurative posts, on the contrary, contain words, phrases, or sentences that carry meanings different from the usual ones, which can flip the polarity of a whole post. This can jeopardize sentiment analysis systems that focus primarily on polarity, making figurative language one of the biggest problems in sentiment analysis; detecting it is therefore crucial and significant. The study of figurative language detection, however, is non-trivial. Many existing works have tried to detect figurative language correctly, using different methodologies; the results are impressive but can still be improved. This thesis offers a new way to address the problem. There are essentially seven commonly used figurative language categories: sarcasm, metaphor, satire, irony, simile, humor, and hyperbole; this thesis focuses on three of them. It aims to capture the contextual meaning behind these three categories using a deep learning architecture combined with manually extracted features, to explore well-known machine learning classifiers for the detection tasks, and, in the process, to produce a ranking of the features in descending order of importance. The deep learning architecture used is a Convolutional Neural Network, combined with manually extracted features chosen carefully based on the literature and on an understanding of each figurative language category. The findings show clear improvement in the evaluation metrics compared to existing works in the same domain, across all of the figurative language categories, attesting to the quality of the framework.
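    The manually extracted features that get concatenated with a CNN's learned representation are typically surface cues of figurative use. A small, hypothetical cue extractor follows; the cue list and feature names are illustrative assumptions, not the thesis's actual feature set.

```python
# Hypothetical intensifier list; hyperbole and sarcasm often lean on these.
INTENSIFIERS = {"absolutely", "literally", "totally", "completely"}

def figurative_cues(text):
    """Hand-crafted surface cues that would be concatenated with the
    CNN output before the final classifier."""
    words = text.split()
    return {
        "intensifier_count": sum(w.lower() in INTENSIFIERS for w in words),
        "exclamations": text.count("!"),
        "all_caps_words": sum(w.isupper() and len(w) > 1 for w in words),
    }

cues = figurative_cues("This is LITERALLY the BEST day ever!!!")
```

    Ranking such features by the weight a classifier assigns them is one straightforward way to obtain the descending importance list the thesis describes.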

    Mapping (Dis-)Information Flow about the MH17 Plane Crash

    Digital media enables fast sharing not only of information but also of disinformation. One prominent case of an event leading to the circulation of disinformation on social media is the MH17 plane crash. Studies analysing the spread of information about this event on Twitter have focused on small, manually annotated datasets or used proxies for data annotation. In this work, we examine to what extent text classifiers can be used to label data for subsequent content analysis; in particular, we focus on predicting pro-Russian and pro-Ukrainian Twitter content related to the MH17 plane crash. Even though we find that a neural classifier improves over a hashtag-based baseline, labeling pro-Russian and pro-Ukrainian content with high precision remains a challenging problem. We provide an error analysis underlining the difficulty of the task, identify factors that might help improve classification in future work, and show how the classifier can facilitate the annotation task for human annotators.
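    A hashtag-based baseline of the kind the abstract compares against can be sketched as a simple lookup; the hashtag lists below are illustrative placeholders, not the study's actual seed sets.

```python
# Placeholder seed hashtags for each stance (assumptions, not the paper's lists).
PRO_RU = {"#mh17truth", "#kievshotdownmh17"}
PRO_UA = {"#russiaisguilty", "#putinskrieg"}

def hashtag_label(tweet):
    """Label a tweet by which side's seed hashtags it uses; returns
    None when no listed hashtag appears or both sides appear - the
    common case that makes this baseline weak."""
    tags = {w.lower() for w in tweet.split() if w.startswith("#")}
    ru, ua = bool(tags & PRO_RU), bool(tags & PRO_UA)
    if ru and not ua:
        return "pro-Russian"
    if ua and not ru:
        return "pro-Ukrainian"
    return None
```

    The large unlabeled remainder this baseline leaves behind is exactly what motivates training a neural classifier on the annotated portion.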

    Sarcasm Detection in English and Arabic Tweets Using Transformer Models

    This thesis describes our approach to detecting sarcasm and its various types in English and Arabic tweets using deep learning methods. We attempted five problems: (1) detecting sarcasm in English tweets, (2) detecting sarcasm in Arabic tweets, (3) determining the sarcastic-speech subcategory of English tweets, (4) determining which of two semantically equivalent English tweets is sarcastic, and (5) determining which of two semantically equivalent Arabic tweets is sarcastic. All tasks were framed as classification problems, and our contributions are threefold: (a) an English binary classifier built on RoBERTa, (b) an Arabic binary classifier built on XLM-RoBERTa, and (c) an English multilabel classifier built on BERT. Pre-processing steps are applied to the labeled input data prior to tokenization, such as extracting verbs/adjectives or representative/significant keywords and appending them to the end of an input tweet to help the models better understand and generalize sarcasm detection. We also discuss the results of simple data augmentation techniques for improving the quality of the given training dataset, as well as an alternative approach to multilabel sequence classification. Ultimately, our systems placed among the top 14 participants for each of the five tasks in a sarcasm detection competition.
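    The keyword-appending pre-processing step can be sketched as below. The real system extracts verbs/adjectives with a tagger; here a small hypothetical cue-word list and the `" | "` separator stand in, so everything in this sketch is an assumption rather than the thesis's exact pipeline.

```python
# Hypothetical cue words standing in for tagger-extracted verbs/adjectives.
CUE_WORDS = {"sure", "obviously", "great", "love", "totally", "wow"}

def preprocess(tweet):
    """Append salient cue words to the tweet before tokenization so
    the transformer sees them twice, once in context and once
    isolated at the end."""
    cues = [w.strip(".,!?") for w in tweet.lower().split()
            if w.strip(".,!?") in CUE_WORDS]
    return tweet + " | " + " ".join(cues) if cues else tweet

out = preprocess("Oh great, another Monday!")
```

    The augmented string, not the raw tweet, is what would then be passed to the model's tokenizer.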