148 research outputs found

    Similarity-aware deep attentive model for clickbait detection

    © Springer Nature Switzerland AG 2019. Clickbait is a type of web content advertisement designed to entice readers into clicking accompanying links. Usually, such links lead to articles that are either misleading or non-informative, which makes clickbait detection essential in daily life. Automated clickbait detection is a relatively new research topic. Most recent work handles the problem with deep learning approaches that extract features from the meta-data of content. However, little attention has been paid to the relationship between the misleading titles and the target content, which we found to be an important clue for enhancing clickbait detection. In this work, we propose a deep similarity-aware attentive model to capture and represent such similarities with better expressiveness. In particular, we present ways of either using similarity alone or integrating it with other available quality features for clickbait detection. We evaluate our model on two benchmark datasets, and the experimental results demonstrate the effectiveness of our approach, which outperforms a series of competitive state-of-the-art and baseline methods.

    Flagging clickbait in Indonesian online news websites using fine-tuned transformers

    Click counts are related to the amount of money that online advertisers pay to news sites. Such business models have pushed some news sites to employ the dirty trick of click-baiting, i.e., using hyperbolic and interesting words, and sometimes unfinished sentences, in a headline to purposefully tease readers. Some Indonesian online news sites have also joined the clickbait party, which indirectly degrades the credibility of other established news sites. A neural network was trained to detect clickbait headlines: a pre-trained multilingual bidirectional encoder representations from transformers (BERT) language model acts as an embedding layer, is combined with a 100-node hidden layer, and is topped with a sigmoid classifier. With a total of 6,632 headlines as a training dataset, the classifier performed remarkably well. Evaluated with 5-fold cross-validation, it achieved an accuracy score of 0.914, an F1-score of 0.914, a precision score of 0.916, and a receiver operating characteristic-area under curve (ROC-AUC) of 0.92. The use of multilingual BERT in the Indonesian text classification task was tested and can be enhanced further. Future possibilities, societal impact, and limitations of clickbait detection are discussed.

    Deep Neural Attention for Misinformation and Deception Detection

    PhD thesis in Information technology. At present, the influence of social media on society is so great that, for many, life seems to have no meaning without it. This kind of over-reliance on social media gives anarchic elements an opportunity to take undue advantage. Online misinformation and deception are vivid examples of this phenomenon. Misinformation, or fake news, spreads faster and more widely than true news [32]. The need of the hour is to identify and curb the spread of misinformation and misleading content automatically and as early as possible. Researchers have proposed several machine learning models to detect and prevent misinformation and deceptive content. However, these prior works suffer from some limitations. First, they use either feature-engineering-heavy methods or intricate deep neural architectures, which are not very transparent in their internal workings and decision making. Second, they do not incorporate and learn the available auxiliary and latent cues and patterns, which can be very useful in forming adequate context for the misinformation. Third, most of the former methods perform poorly on early detection accuracy measures because they rely on features that are usually absent at the initial stage of news or social media posts on social networks. In this dissertation, we propose suitable deep neural attention based solutions to overcome these limitations. For instance, we propose a claim verification model, which learns embeddings for latent aspects such as the author and subject of the claim and the domain of the external evidence document. This enables the model to learn important additional context beyond the textual content. In addition, we propose an algorithm to extract evidential snippets from external evidence documents, which serve as an explanation of the model’s decisions.
    Next, we improve this model by using an improved claim-driven attention mechanism, and we also generate a topically diverse and non-redundant multi-document fact-checking summary for the claims, which helps to further interpret the model’s decision making. Subsequently, we introduce a novel method to learn influence and affinity relationships among the social media users present on the propagation paths of news items. By modeling the complex influence relationships among the users, in addition to textual content, we learn the significant patterns pertaining to the diffusion of a news item on a social network. The evaluation shows that the proposed model outperforms the other related methods in early detection performance with significant gains. Next, we propose a headline incongruence detection model based on synthetic headline generation, which uses word-to-word mutual attention based deep semantic matching between the original and synthetic news headlines to detect incongruence. Further, we investigate and define a new task of incongruence detection in the presence of important cardinal values in the headline. For this new task, we propose a part-of-speech pattern driven attention based method, which learns the requisite context for cardinal values.