2,995 research outputs found

    Veracity and velocity of social media content during breaking news: analysis of November 2015 Paris shootings

    No full text
    Social media sources are becoming increasingly important in journalism. Under breaking-news deadlines, semi-automated support for the identification and verification of content is critical. We describe a large-scale content-level analysis of over 6 million Twitter, YouTube and Instagram records covering the first 6 hours of the November 2015 Paris shootings. We ground our analysis by tracing how 5 ground-truth images used in actual news reports went viral. We look at the velocity of newsworthy content and its veracity with regard to trusted-source attribution. We also examine temporal segmentation combined with statistical frequency counters to identify likely eyewitness content for input to real-time breaking-content feeds. Our results suggest that attribution to trusted sources might be a good indicator of content veracity, and that temporal segmentation coupled with statistical frequency metrics could be used to highlight eyewitness content in real time if applied with some additional text filters.
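    The temporal segmentation and frequency statistics described above can be sketched roughly as follows: bucket posts into fixed time windows and flag terms whose frequency spikes relative to earlier windows, a crude proxy for surfacing likely eyewitness content. The window size, thresholds, and whitespace tokenizer below are illustrative assumptions, not the configuration used in the paper.

```python
# Hedged sketch: temporal segmentation plus simple frequency-spike detection.
from collections import Counter
from datetime import timedelta


def spiking_terms(posts, window=timedelta(minutes=10), min_count=5, ratio=3.0):
    """posts: iterable of (timestamp, text) pairs, assumed to be time-ordered."""
    segments = []                      # one term-frequency Counter per time window
    current, start = Counter(), None
    for ts, text in posts:
        if start is None:
            start = ts
        if ts - start >= window:       # close the current segment, open a new one
            segments.append(current)
            current, start = Counter(), ts
        current.update(text.lower().split())
    segments.append(current)

    flagged = []
    for i in range(1, len(segments)):
        background = sum((segments[j] for j in range(i)), Counter())
        for term, count in segments[i].items():
            baseline = background[term] / i          # mean count in earlier windows
            if count >= min_count and count > ratio * max(baseline, 1.0):
                flagged.append((i, term, count))     # (window index, term, count)
    return flagged
```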

    An Emotional Analysis of False Information in Social Media and News Articles

    Full text link
    Fake news is dangerous because it is created to manipulate readers' opinions and beliefs. In this work, we compared the language of false news to that of real news from an emotional perspective, considering a set of false information types (propaganda, hoax, clickbait, and satire) from social media and online news article sources. Our experiments showed that false information has different emotional patterns in each of its types, and that emotions play a key role in deceiving the reader. Based on that, we proposed an emotionally-infused LSTM neural network model to detect false news. The work of the second author was partially funded by the Spanish MICINN under the research project MISMISFAKEnHATE on Misinformation and Miscommunication in social media: FAKEnews and HATE speech (PGC2018-096212B-C31).
    Ghanem, BHH.; Rosso, P.; Rangel, F. (2020). An Emotional Analysis of False Information in Social Media and News Articles. ACM Transactions on Internet Technology, 20(2), 1-18. https://doi.org/10.1145/3381750
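    A minimal sketch of what an emotionally-infused LSTM classifier might look like is given below, assuming a token-id input alongside a per-document vector of lexicon-based emotion scores (e.g. joy, fear, anger). The layer sizes and late-fusion strategy are illustrative assumptions, not the architecture reported in the article.

```python
# Hedged sketch of an "emotionally infused" LSTM classifier (Keras).
from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, Concatenate
from tensorflow.keras.models import Model

VOCAB_SIZE, MAX_LEN, EMB_DIM, N_EMOTIONS = 20000, 200, 100, 8

tokens = Input(shape=(MAX_LEN,), name="token_ids")            # word-id sequence
emotions = Input(shape=(N_EMOTIONS,), name="emotion_scores")  # lexicon-based features

x = Embedding(VOCAB_SIZE, EMB_DIM, mask_zero=True)(tokens)
x = LSTM(64)(x)                                   # sequence (text) branch
e = Dense(16, activation="relu")(emotions)        # emotion-feature branch

merged = Concatenate()([x, e])                    # fuse text and emotion signals
output = Dense(1, activation="sigmoid")(merged)   # probability of false news

model = Model(inputs=[tokens, emotions], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```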

    Web knowledge bases

    Get PDF
    Knowledge is key to natural language understanding. References to specific people, places and things in text are crucial to resolving ambiguity and extracting meaning. Knowledge Bases (KBs) codify this information for automated systems, enabling applications such as entity-based search and question answering. This thesis explores the idea that sites on the web may act as a KB, even if that is not their primary intent. Dedicated KBs like Wikipedia are a rich source of entity information, but are built and maintained at an ongoing cost in human effort. As a result, they are generally limited in terms of the breadth and depth of knowledge they index about entities. Web knowledge bases offer a distributed solution to the problem of aggregating entity knowledge. Social networks aggregate content about people, news sites describe events with tags for organizations and locations, and a diverse assortment of web directories aggregate statistics and summaries for long-tail entities notable within niche movie, musical and sporting domains. We aim to develop the potential of these resources for both web-centric entity Information Extraction (IE) and structured KB population. We first investigate the problem of Named Entity Linking (NEL), where systems must resolve ambiguous mentions of entities in text to their corresponding node in a structured KB. We demonstrate that entity disambiguation models derived from inbound web links to Wikipedia are able to complement, and in some cases completely replace, the role of resources typically derived from the KB. Building on this work, we observe that any page on the web which reliably disambiguates inbound web links may act as an aggregation point for entity knowledge. To uncover these resources, we formalize the task of Web Knowledge Base Discovery (KBD) and develop a system to automatically infer the existence of KB-like endpoints on the web. While extending our framework to multiple KBs increases the breadth of available entity knowledge, we must still consolidate references to the same entity across different web KBs. We investigate this task of Cross-KB Coreference Resolution (KB-Coref) and develop models for efficiently clustering coreferent endpoints across web-scale document collections. Finally, assessing the gap between unstructured web knowledge resources and those of a typical KB, we develop a neural machine translation approach which transforms entity knowledge between unstructured textual mentions and traditional KB structures. The web has great potential as a source of entity knowledge. In this thesis we aim to first discover, distill and finally transform this knowledge into forms which will ultimately be useful in downstream language understanding tasks.
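    One common way to derive a disambiguation model from inbound links, in the spirit of the approach described above, is a "commonness" prior: estimate P(entity | mention) from how links with a given anchor text distribute over target pages, then link each mention to its most frequent target. The sketch below assumes a simple stream of (anchor text, target entity) pairs and is illustrative only, not the thesis's actual system.

```python
# Hedged sketch of a link-derived commonness prior for entity linking.
from collections import Counter, defaultdict


def build_prior(anchor_target_pairs):
    """anchor_target_pairs: iterable of (anchor_text, target_entity_id) link records."""
    prior = defaultdict(Counter)
    for anchor, target in anchor_target_pairs:
        prior[anchor.lower()][target] += 1
    return prior


def link_mention(mention, prior):
    """Return (entity_id, prior_probability) for the most frequent link target."""
    candidates = prior.get(mention.lower())
    if not candidates:
        return None                               # unseen mention / out-of-KB
    entity, count = candidates.most_common(1)[0]
    return entity, count / sum(candidates.values())
```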