283 research outputs found
Multimodal news article analysis
The intersection of Computer Vision and Natural Language Processing has been a hot topic of research in recent years, with results that were unthinkable only a few years ago. In view of this progress, we want to highlight online news articles as a potential next step for this area of research. The rich interrelations of text, tags, images or videos, as well as a vast corpus of general knowledge, are an exciting benchmark for high-capacity models such as deep neural networks. In this paper we present a series of tasks and baseline approaches to leverage corpora such as the BreakingNews dataset. Peer Reviewed. Postprint (author's final draft)
Shallow reading with Deep Learning: Predicting popularity of online content using only its title
With the ever decreasing attention span of contemporary Internet users, the
title of online content (such as a news article or video) can be a major factor
in determining its popularity. To take advantage of this phenomenon, we propose
a new method based on a bidirectional Long Short-Term Memory (LSTM) neural
network designed to predict the popularity of online content using only its
title. We evaluate the proposed architecture on two distinct datasets of news
articles and news videos distributed in social media that contain over 40,000
samples in total. On those datasets, our approach improves the performance over
traditional shallow approaches by a margin of 15%. Additionally, we show that
using pre-trained word vectors in the embedding layer improves the results of
LSTM models, especially when the training set is small. To our knowledge, this
is the first attempt at popularity prediction using only textual
information from the title.
Breakingnews: article annotation by image and text processing
© 20xx IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Building upon recent Deep Neural Network architectures, current approaches lying in the intersection of Computer Vision and Natural Language Processing have achieved unprecedented breakthroughs in tasks like automatic captioning or image retrieval. Most of these learning methods, though, rely on large training sets of images associated with human annotations that specifically describe the visual content. In this paper we propose to go a step further and explore the more complex cases where textual descriptions are loosely related to the images. We focus on the particular domain of news articles in which the textual content often expresses connotative and ambiguous relations that are only suggested but not directly inferred from images. We introduce an adaptive CNN architecture that shares most of the structure for multiple tasks including source detection, article illustration and geolocation of articles. Deep Canonical Correlation Analysis is deployed for article illustration, and a new loss function based on Great Circle Distance is proposed for geolocation. Furthermore, we present BreakingNews, a novel dataset with approximately 100K news articles including images, text and captions, and enriched with heterogeneous meta-data (such as GPS coordinates and user comments). We show this dataset to be appropriate to explore all aforementioned problems, for which we provide a baseline performance using various Deep Learning architectures, and different representations of the textual and visual features.
We report very promising results and bring to light several limitations of the current state of the art in this kind of domain, which we hope will help spur progress in the field. Peer Reviewed. Postprint (author's final draft)
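The abstract above names a loss based on Great Circle Distance without giving its form. As a rough illustration only, the great-circle distance between a predicted and a true location can be computed with the haversine formula. The function names, the Earth radius constant, and the choice of a mean distance as the loss are assumptions made for this sketch, not the authors' formulation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_gcd_loss(predicted, target):
    """Hypothetical loss: mean great-circle distance over paired (lat, lon) tuples."""
    dists = [great_circle_km(p[0], p[1], t[0], t[1])
             for p, t in zip(predicted, target)]
    return sum(dists) / len(dists)
```

Unlike a squared error on raw latitude and longitude, a distance of this kind treats longitudes of 179 and -179 degrees as close together, which is one plausible motivation for using it in geolocation.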
Multimodal Geolocation Estimation of News Photos
The widespread growth of multimodal news requires sophisticated approaches to interpret content and relations of different modalities. Images are of utmost importance since they represent a visual gist of the whole news article. For example, it is essential to identify the locations of natural disasters for crisis management or to analyze political or social events across the world. In some cases, verifying the location(s) claimed in a news article might help human assessors or fact-checking efforts to detect misinformation, i.e., fake news. Existing methods for geolocation estimation typically consider only a single modality, e.g., images or text. However, news images can lack sufficient geographical cues to estimate their locations, and the text can refer to various possible locations. In this paper, we propose a novel multimodal approach to predict the geolocation of news photos. To enable this approach, we introduce a novel dataset called Multimodal Geolocation Estimation of News Photos (MMG-NewsPhoto). MMG-NewsPhoto is, so far, the largest dataset for the given task and contains more than half a million news texts with the corresponding image, out of which 3000 photos were manually labeled for photo geolocation based on information from the image-text pairs. For a fair comparison, we optimize and assess state-of-the-art methods using the new benchmark dataset. Experimental results show the superiority of the multimodal models compared to the unimodal approaches.
Recurrent Neural Networks for Online Video Popularity Prediction
In this paper, we address the problem of popularity prediction of online
videos shared in social media. We prove that this challenging task can be
approached using recently proposed deep neural network architectures. We cast
the popularity prediction problem as a classification task and we aim to solve
it using only visual cues extracted from videos. To that end, we propose a new
method based on a Long-term Recurrent Convolutional Network (LRCN) that
incorporates the sequentiality of the information in the model. Results
obtained on a dataset of over 37,000 videos published on Facebook show that
using our method leads to an improvement of over 30% in prediction performance
over traditional shallow approaches and can provide valuable insights for
content creators.
What Trends in Chinese Social Media
There has been a tremendous rise in the growth of online social networks all
over the world in recent times. While some networks like Twitter and Facebook
have been well documented, the popular Chinese microblogging social network
Sina Weibo has not been studied. In this work, we examine the key topics that
trend on Sina Weibo and contrast them with our observations on Twitter. We find
that there is a vast difference in the content shared in China, when compared
to a global social network such as Twitter. In China, the trends are created
almost entirely due to retweets of media content such as jokes, images and
videos, whereas on Twitter, the trends tend to have more to do with current
global events and news stories.
The applications of social media in sports marketing
In the era of big data, sports consumers' activities in social media become valuable assets to sports marketers. In this paper, the authors review extant literature regarding how to effectively use social media to promote sports as well as how to effectively analyze social media data to support business decisions. Methods: The literature review method. Results: Our findings suggest that sports marketers can use social media to achieve goals such as facilitating marketing communication campaigns, adding value to sports products and services, creating two-way communication between sports brands and consumers, supporting sports sponsorship programs, and forging brand communities. As to how to effectively analyze social media data to support business decisions, extant literature suggests that sports marketers undertake traffic and engagement analysis on their social media sites as well as conduct sentiment analysis to probe customers' opinions. These insights can support various aspects of business decisions, such as marketing communication management, consumer voice probing, and sales predictions. Conclusion: Social media are ubiquitous in sports marketing and consumption practices. In the era of big data, these "footprints" can now be effectively analyzed to generate insights to support business decisions. Recommendations for both sports marketing practice and research are also addressed.
Facts and Fabrications about Ebola: A Twitter Based Study
Microblogging websites like Twitter have been shown to be immensely useful
for spreading information on a global scale within seconds. The detrimental
effect, however, of such platforms is that misinformation and rumors are also
as likely to spread on the network as credible, verified information. From a
public health standpoint, the spread of misinformation creates unnecessary
panic for the public. We recently witnessed several such scenarios during the
outbreak of Ebola in 2014 [14, 1]. In order to effectively counter the medical
misinformation in a timely manner, our goal here is to study the nature of such
misinformation and rumors in the United States during fall 2014 when a handful
of Ebola cases were confirmed in North America. It is a well known convention
on Twitter to use hashtags to give context to a Twitter message (a tweet). In
this study, we collected approximately 47M tweets from the Twitter streaming
API related to Ebola. Based on hashtags, we propose a method to classify the
tweets into two sets: credible and speculative. We analyze these two sets and
study how they differ in terms of a number of features extracted from the
Twitter API. In conclusion, we infer several interesting differences between
the two sets. We outline further potential directions for using this material
for monitoring and separating speculative tweets from credible ones, to enable
improved public health information. Comment: Appears in SIGKDD BigCHat Workshop 201
Implementation of Naive Bayes for Sentiment Analysis of Public Opinion on Twitter Regarding the New Normal in Indonesia
The pandemic currently affecting the world requires people to adapt the way they carry out daily activities, a condition known as the "New Normal". Sentiment analysis is needed so that the government can understand the public's response to the policies issued to contain the spread of Covid-19. This study implements the Naive Bayes algorithm combined with the Sastrawi stemming method. The process begins with data crawling to collect the dataset. Once collected, the dataset goes through data preprocessing, feature extraction, and classification using Naive Bayes. The algorithm is evaluated using a confusion matrix, considering accuracy, precision, and recall. Testing different ratios of training to testing data found the best ratio to be 70% to 30%, with accuracy, precision, and recall of 94.55%, 93.55%, and 93.55%, respectively.
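The evaluation in this abstract rests on a confusion matrix and the accuracy, precision, and recall derived from it. A minimal sketch of that computation for a two-class sentiment task is given below. The function names, the label strings, and the toy data are assumptions for illustration and are not taken from the study.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN for one class treated as the positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall); empty denominators yield 0.0."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels, invented for this sketch only.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
accuracy, precision, recall = metrics(*confusion_counts(y_true, y_pred, "pos"))
```

Precision asks how many tweets predicted as positive really were positive, while recall asks how many truly positive tweets were found, so the two can differ even when accuracy is high.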