283 research outputs found

    Multimodal news article analysis

    The intersection of Computer Vision and Natural Language Processing has been a hot topic of research in recent years, with results that were unthinkable only a few years ago. In view of this progress, we want to highlight online news articles as a potential next step for this area of research. The rich interrelations of text, tags, images or videos, as well as a vast corpus of general knowledge, are an exciting benchmark for high-capacity models such as deep neural networks. In this paper we present a series of tasks and baseline approaches to leverage corpora such as the BreakingNews dataset. Peer Reviewed. Postprint (author's final draft).

    Shallow reading with Deep Learning: Predicting popularity of online content using only its title

    With the ever-decreasing attention span of contemporary Internet users, the title of online content (such as a news article or video) can be a major factor in determining its popularity. To take advantage of this phenomenon, we propose a new method based on a bidirectional Long Short-Term Memory (LSTM) neural network designed to predict the popularity of online content using only its title. We evaluate the proposed architecture on two distinct datasets of news articles and news videos distributed in social media that contain over 40,000 samples in total. On those datasets, our approach improves the performance over traditional shallow approaches by a margin of 15%. Additionally, we show that using pre-trained word vectors in the embedding layer improves the results of LSTM models, especially when the training set is small. To our knowledge, this is the first attempt at popularity prediction using only textual information from the title.

    Breakingnews: article annotation by image and text processing

    © 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Building upon recent Deep Neural Network architectures, current approaches at the intersection of Computer Vision and Natural Language Processing have achieved unprecedented breakthroughs in tasks like automatic captioning or image retrieval. Most of these learning methods, though, rely on large training sets of images associated with human annotations that specifically describe the visual content. In this paper we propose to go a step further and explore the more complex cases where textual descriptions are loosely related to the images. We focus on the particular domain of news articles, in which the textual content often expresses connotative and ambiguous relations that are only suggested but not directly inferred from images. We introduce an adaptive CNN architecture that shares most of the structure for multiple tasks including source detection, article illustration and geolocation of articles. Deep Canonical Correlation Analysis is deployed for article illustration, and a new loss function based on Great Circle Distance is proposed for geolocation. Furthermore, we present BreakingNews, a novel dataset with approximately 100K news articles including images, text and captions, and enriched with heterogeneous meta-data (such as GPS coordinates and user comments). We show this dataset to be appropriate to explore all aforementioned problems, for which we provide a baseline performance using various Deep Learning architectures, and different representations of the textual and visual features. We report very promising results and bring to light several limitations of the current state of the art in this kind of domain, which we hope will help spur progress in the field. Peer Reviewed. Postprint (author's final draft).
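
The abstract above mentions a geolocation loss based on Great Circle Distance but does not give its formulation. As a rough illustration only, the sketch below computes the great-circle distance with the standard haversine formula and averages it over paired predictions and targets. The function names, the mean Earth radius of 6371 km, and the plain averaging are assumptions made here for illustration; this is not the authors' loss function.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not taken from the paper


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_great_circle_loss(predicted, target):
    """Average great-circle distance over paired (lat, lon) tuples (hypothetical helper)."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("predicted and target must be non-empty and equal in length")
    total = sum(great_circle_km(p[0], p[1], t[0], t[1])
                for p, t in zip(predicted, target))
    return total / len(predicted)
```

Unlike a squared error on raw latitude and longitude, a distance of this kind treats longitudes near +180 and -180 degrees as close together, which is one plausible motivation for using it in geolocation.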

    Multimodal Geolocation Estimation of News Photos

    The widespread growth of multimodal news requires sophisticated approaches to interpret content and relations of different modalities. Images are of utmost importance since they represent a visual gist of the whole news article. For example, it is essential to identify the locations of natural disasters for crisis management or to analyze political or social events across the world. In some cases, verifying the location(s) claimed in a news article might help human assessors or fact-checking efforts to detect misinformation, i.e., fake news. Existing methods for geolocation estimation typically consider only a single modality, e.g., images or text. However, news images can lack sufficient geographical cues to estimate their locations, and the text can refer to various possible locations. In this paper, we propose a novel multimodal approach to predict the geolocation of news photos. To enable this approach, we introduce a novel dataset called Multimodal Geolocation Estimation of News Photos (MMG-NewsPhoto). MMG-NewsPhoto is, so far, the largest dataset for the given task and contains more than half a million news texts with the corresponding image, out of which 3000 photos were manually labeled for the photo geolocation based on information from the image-text pairs. For a fair comparison, we optimize and assess state-of-the-art methods using the new benchmark dataset. Experimental results show the superiority of the multimodal models compared to the unimodal approaches.

    Recurrent Neural Networks for Online Video Popularity Prediction

    In this paper, we address the problem of popularity prediction of online videos shared in social media. We prove that this challenging task can be approached using recently proposed deep neural network architectures. We cast the popularity prediction problem as a classification task and we aim to solve it using only visual cues extracted from videos. To that end, we propose a new method based on a Long-term Recurrent Convolutional Network (LRCN) that incorporates the sequentiality of the information in the model. Results obtained on a dataset of over 37,000 videos published on Facebook show that using our method leads to an improvement of over 30% in prediction performance over the traditional shallow approaches and can provide valuable insights for content creators.

    What Trends in Chinese Social Media

    Online social networks have grown tremendously all over the world in recent times. While some networks like Twitter and Facebook have been well documented, the popular Chinese microblogging social network Sina Weibo has not been studied. In this work, we examine the key topics that trend on Sina Weibo and contrast them with our observations on Twitter. We find that there is a vast difference in the content shared in China when compared to a global social network such as Twitter. In China, the trends are created almost entirely due to retweets of media content such as jokes, images and videos, whereas on Twitter, the trends tend to have more to do with current global events and news stories.

    The applications of social media in sports marketing

    In the era of big data, sports consumers' activities in social media become valuable assets to sports marketers. In this paper, the authors review extant literature regarding how to effectively use social media to promote sports as well as how to effectively analyze social media data to support business decisions. Methods: The literature review method. Results: Our findings suggest that sports marketers can use social media to achieve goals such as facilitating marketing communication campaigns, adding value to sports products and services, creating two-way communication between sports brands and consumers, supporting sports sponsorship programs, and forging brand communities. As to how to effectively analyze social media data to support business decisions, the extant literature suggests that sports marketers undertake traffic and engagement analysis on their social media sites and conduct sentiment analysis to probe customers' opinions. These insights can support various aspects of business decisions, such as marketing communication management, consumer voice probing, and sales predictions. Conclusion: Social media are ubiquitous in sports marketing and consumption practices. In the era of big data, these "footprints" can now be effectively analyzed to generate insights to support business decisions. Recommendations for both sports marketing practice and research are also addressed.

    Facts and Fabrications about Ebola: A Twitter Based Study

    Microblogging websites like Twitter have been shown to be immensely useful for spreading information on a global scale within seconds. The detrimental effect of such platforms, however, is that misinformation and rumors are as likely to spread on the network as credible, verified information. From a public health standpoint, the spread of misinformation creates unnecessary panic for the public. We recently witnessed several such scenarios during the outbreak of Ebola in 2014 [14, 1]. In order to effectively counter the medical misinformation in a timely manner, our goal here is to study the nature of such misinformation and rumors in the United States during fall 2014, when a handful of Ebola cases were confirmed in North America. It is a well-known convention on Twitter to use hashtags to give context to a Twitter message (a tweet). In this study, we collected approximately 47M tweets related to Ebola from the Twitter streaming API. Based on hashtags, we propose a method to classify the tweets into two sets: credible and speculative. We analyze these two sets and study how they differ in terms of a number of features extracted from the Twitter API. In conclusion, we infer several interesting differences between the two sets. We outline further potential directions for using this material to monitor and separate speculative tweets from credible ones, to enable improved public health information. Comment: Appears in SIGKDD BigCHat Workshop 201
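
The abstract above describes splitting tweets into credible and speculative sets based on hashtags, without listing the hashtags or the exact rule. The sketch below shows one way such a split could look, assuming a tweet is labeled only when its hashtags match exactly one of two seed sets. The seed hashtags, function names, and labeling rule are placeholders invented here for illustration; they are not the study's method.

```python
import re

HASHTAG_RE = re.compile(r"#(\w+)")

# Placeholder seed sets: the abstract does not say which hashtags were used.
CREDIBLE_TAGS = {"examplecredibletag"}
SPECULATIVE_TAGS = {"examplespeculativetag"}


def extract_hashtags(text):
    """Return the set of lowercased hashtags (without '#') found in a tweet."""
    return {tag.lower() for tag in HASHTAG_RE.findall(text)}


def label_tweet(text):
    """Label a tweet 'credible' or 'speculative', or None if unmatched or ambiguous."""
    tags = extract_hashtags(text)
    is_credible = bool(tags & CREDIBLE_TAGS)
    is_speculative = bool(tags & SPECULATIVE_TAGS)
    if is_credible and not is_speculative:
        return "credible"
    if is_speculative and not is_credible:
        return "speculative"
    return None  # no seed hashtag, or hashtags from both sets
```

Returning None for tweets that match both sets keeps the two groups disjoint, which makes later feature comparisons between them easier to interpret.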

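    Implementasi Naive Bayes pada Analisis Sentimen Opini Masyarakat di Twitter Terhadap Kondisi New Normal di Indonesia (Implementation of Naive Bayes for Sentiment Analysis of Public Opinion on Twitter Regarding the New Normal in Indonesia)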

    The pandemic currently affecting the world requires people to adapt in carrying out their daily activities, a condition known as the “New Normal”. Sentiment analysis is needed so that the government can gauge the public's response to the policies issued to curb the transmission of Covid-19. This study implements the Naive Bayes algorithm combined with the Sastrawi stemming method. The process consists of several stages, starting with data crawling to collect the dataset. Once the dataset has been collected, it goes through data preprocessing, feature extraction, and classification using Naive Bayes. The algorithm is evaluated using a confusion matrix, considering accuracy, precision, and recall. Testing across ratios of training to testing data found the best ratio to be 70% to 30%, with accuracy, precision, and recall of 94.55%, 93.55%, and 93.55%, respectively.
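
The evaluation described above reports accuracy, precision, and recall derived from a confusion matrix. As a small illustration of how those three values relate to the matrix entries, the sketch below computes them for a binary case. The function name is hypothetical, the binary setting is an assumption, and the counts in the usage lines are made-up numbers, not the paper's data.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Illustrative counts only (not from the study).
example = confusion_metrics(tp=8, fp=2, fn=2, tn=8)
print(example)
```

Precision and recall coincide whenever the number of false positives equals the number of false negatives, as in the illustrative counts used here.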