Analyzing the Language of Food on Social Media
We investigate the predictive power behind the language of food on social
media. We collect a corpus of over three million food-related posts from
Twitter and demonstrate that many latent population characteristics can be
directly predicted from this data: overweight rate, diabetes rate, political
leaning, and home geographical location of authors. For all tasks, our
language-based models significantly outperform the majority-class baselines.
Performance is further improved with more complex natural language processing,
such as topic modeling. We analyze which textual features have most predictive
power for these datasets, providing insight into the connections between the
language of food, geographic locale, and community characteristics. Lastly, we
design and implement an online system for real-time query and visualization of
the dataset. Visualization tools, such as geo-referenced heatmaps,
semantics-preserving wordclouds and temporal histograms, allow us to discover
more complex, global patterns mirrored in the language of food.
Comment: An extended abstract of this paper will appear in IEEE Big Data 201
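The prediction setup this abstract describes, language features measured against a majority-class baseline, can be illustrated with a minimal sketch. The posts, labels, and the tiny Naive Bayes classifier here are hypothetical stand-ins, not the authors' models or data:

```python
from collections import Counter, defaultdict
import math

# Hypothetical stand-in for food-related posts labeled with a latent
# population characteristic (the real corpus has millions of tweets).
posts = [
    ("grits biscuits sweet tea", "south"),
    ("bbq brisket sweet tea", "south"),
    ("fried okra sweet tea", "south"),
    ("bagel lox espresso", "northeast"),
    ("pizza bagel espresso", "northeast"),
]

def majority_baseline(labels):
    """Predict the most frequent class for every example."""
    return Counter(labels).most_common(1)[0][0]

def train_nb(data):
    """Train a tiny multinomial Naive Bayes over unigram food terms."""
    word_counts = defaultdict(Counter)
    class_counts = Counter()
    vocab = set()
    for text, label in data:
        class_counts[label] += 1
        for w in text.split():
            word_counts[label][w] += 1
            vocab.add(w)
    return word_counts, class_counts, vocab

def predict_nb(model, text):
    """Score each class by log prior + smoothed log likelihoods."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best, best_lp = None, float("-inf")
    for label in class_counts:
        lp = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for w in text.split():
            # Laplace smoothing handles unseen words.
            lp += math.log((word_counts[label][w] + 1) / denom)
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train_nb(posts)
print(majority_baseline([y for _, y in posts]))    # -> south
print(predict_nb(model, "sweet tea and brisket"))  # -> south
```

A language-based model "outperforming the majority baseline", in the abstract's sense, means the word-feature classifier beats always predicting the most common class on held-out data.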
A Social Citizen Dashboard for Participatory Urban Planning in Berlin: Prototype and Evaluation
Participatory urban planning enables citizens to make their voices heard in the urban planning process. The resulting measures are more likely to be accepted by the community. However, the participation process becomes more effortful and time-consuming. New approaches have been developed using digital technologies to facilitate citizen participation, such as topic modeling based on social media. Using Twitter data for the city of Berlin, we explore how social media and topic modeling can be used to classify and analyze citizen opinions. We develop a Social Citizen Dashboard allowing for a better understanding of changes in citizens' priorities and incorporating constant cycles of feedback throughout planning phases. Evaluation interviews indicate the dashboard's potential usefulness and implications, as well as point to limitations in data quality and directions for further research.
Topic Modeling in the News Document on Sustainable Development Goals
Indonesia is a developing country and supports the program of the Sustainable Development Goals (SDGs), which consists of 17 goals. The SDGs are not only the government's duty, but a shared duty of all elements of society. Online media has a crucial role in implementing the goals of Indonesia's SDGs. Information published in online news related to the SDGs is an important consideration for the government, society, and all other elements. Categorizing news manually to find out news topics is very time-consuming and depends on the ability of news editors. News presented on online media sites can instead be used for topic modeling, in which hidden topics are discovered in the news. Topic modeling classifies data based on particular topics and determines the relationships between texts. Latent Dirichlet allocation (LDA) is one of the methods in topic modeling for finding the trend of topics in SDGs news. Based on the results of this research, the implementation of LDA is a suitable choice for finding topics in a document. Topic modeling with k = 17 obtained the highest coherence score of 0.5405 on topic 8. Topic 8 discussed news related to the eighth SDGs goal, namely decent work and economic growth. This categorization was based on words formed after the LDA process. Then, topic 5 discussed news on the 17th SDGs goal, namely partnerships for the goals. Topic 6 discussed news on the first SDGs goal, namely no poverty.
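The abstract selects among topic models by coherence score. As an illustration, here is a minimal sketch of the UMass coherence measure, one common coherence metric (the paper does not specify which measure it uses), computed over hypothetical toy documents:

```python
import math

# Hypothetical documents standing in for SDG news articles,
# each reduced to its set of terms.
docs = [
    {"jobs", "wages", "economy", "growth"},
    {"economy", "growth", "trade"},
    {"poverty", "aid", "welfare"},
    {"jobs", "economy", "wages"},
]

def umass_coherence(top_words, docs):
    """UMass topic coherence: sum of log((D(wi, wj) + 1) / D(wj))
    over ordered pairs of a topic's top words, where D(...) counts
    documents containing all the given words."""
    def doc_freq(*words):
        return sum(1 for d in docs if all(w in d for w in words))
    score = 0.0
    for i in range(1, len(top_words)):
        for j in range(i):
            wi, wj = top_words[i], top_words[j]
            score += math.log((doc_freq(wi, wj) + 1) / doc_freq(wj))
    return score

# A topic whose top words co-occur often scores higher (closer to 0)
# than one mixing words from unrelated documents.
coherent = umass_coherence(["economy", "growth", "jobs"], docs)
mixed = umass_coherence(["economy", "poverty", "aid"], docs)
print(coherent > mixed)  # True
```

In the paper's setting, the same idea is applied per topic of the fitted LDA model, and the topic count k and the reported best topic are chosen by the highest coherence.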
How did the discussion go: Discourse act classification in social media conversations
We propose a novel attention-based hierarchical LSTM model to classify
discourse act sequences in social media conversations, aimed at mining data
from online discussion using textual meanings beyond sentence level. The very
uniqueness of the task is the complete categorization of possible pragmatic
roles in informal textual discussions, contrary to extraction of
question-answers, stance detection, or sarcasm identification, which are very
much role-specific tasks. An early attempt was made on a Reddit discussion
dataset. We train our model on the same data, and present test results on two
different datasets, one from Reddit and one from Facebook. Our proposed model
outperformed the previous one in terms of domain independence; without using
platform-dependent structural features, our hierarchical LSTM with word
relevance attention mechanism achieved F1-scores of 71\% and 66\%,
respectively, in predicting discourse roles of comments in Reddit and Facebook discussions.
The efficiency of recurrent and convolutional architectures in learning
discursive representations on the same task is presented and analyzed,
with different word and comment embedding schemes. Our attention mechanism
enables us to inquire into relevance ordering of text segments according to
their roles in discourse. We present a human annotator experiment to unveil
important observations about modeling and data annotation. Equipped with our
text-based discourse identification model, we inquire into how heterogeneous
non-textual features such as location, time, and leaning of information play
their roles in characterizing online discussions on Facebook.
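The word-relevance attention the abstract mentions can be sketched as softmax-weighted pooling over word vectors. The vectors and scores below are hypothetical (in the real model the relevance scores are learned), but they show how attention weights yield the relevance ordering of text segments:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attend(word_vecs, scores):
    """Attention pooling: weight each word vector by its softmaxed
    relevance score and sum the weighted vectors."""
    weights = softmax(scores)
    dim = len(word_vecs[0])
    pooled = [sum(w * v[k] for w, v in zip(weights, word_vecs))
              for k in range(dim)]
    return pooled, weights

# Hypothetical 2-d word vectors and relevance scores for a 3-word comment.
vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [0.1, 2.0, 0.5]
pooled, weights = attend(vecs, scores)

# The weights give the relevance ordering the abstract describes:
# here the second word dominates the pooled representation.
print(max(range(3), key=weights.__getitem__))  # 1
```

The same weights that build the comment representation can be read off directly to rank words by their contribution to the predicted discourse role.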
Landslide Detection in Real-Time Social Media Image Streams
Lack of global data inventories obstructs scientific modeling of and response
to landslide hazards, which are often deadly and costly. To remedy this
limitation, new approaches suggest solutions based on citizen science, which
requires active participation. However, as a non-traditional data source,
social media has been increasingly used in many disaster response and
management studies in recent years. Inspired by this trend, we propose to
capitalize on social media data to mine landslide-related information
automatically with the help of artificial intelligence (AI) techniques.
Specifically, we develop a state-of-the-art computer vision model to detect
landslides in social media image streams in real time. To that end, we create a
large landslide image dataset labeled by experts and conduct extensive model
training experiments. The experimental results indicate that the proposed model
can be deployed in an online fashion to support global landslide susceptibility
maps and emergency response.
Semantics-Space-Time Cube. A Conceptual Framework for Systematic Analysis of Texts in Space and Time
We propose an approach to analyzing data in which texts are associated with spatial and temporal references, with the aim to understand how the text semantics vary over space and time. To represent the semantics, we apply probabilistic topic modeling. After extracting a set of topics and representing the texts by vectors of topic weights, we aggregate the data into a data cube with the dimensions corresponding to the set of topics, the set of spatial locations (e.g., regions), and the time divided into suitable intervals according to the scale of the planned analysis. Each cube cell corresponds to a combination (topic, location, time interval) and contains aggregate measures characterizing the subset of the texts concerning this topic and having their spatial and temporal references within this location and interval. Based on this structure, we systematically describe the space of analysis tasks for exploring the interrelationships among the three heterogeneous information facets: semantics, space, and time. We introduce the operations of projecting and slicing the cube, which are used to decompose complex tasks into simpler subtasks. We then present a design of a visual analytics system intended to support these subtasks. To reduce the complexity of the user interface, we apply the principles of structural, visual, and operational uniformity while respecting the specific properties of each facet. The aggregated data are represented in three parallel views corresponding to the three facets and providing different complementary perspectives on the data. The views have a similar look-and-feel to the extent allowed by the facet specifics. Uniform interactive operations applicable to any view support establishing links between the facets. The uniformity principle is also applied in supporting the projecting and slicing operations on the data cube.
We evaluate the feasibility and utility of the approach by applying it in two analysis scenarios using geolocated social media data for studying people's reactions to social and natural events of different spatial and temporal scales.
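The cube structure and the projecting and slicing operations described above can be sketched with plain dictionaries; the topics, locations, intervals, and counts below are hypothetical:

```python
from collections import defaultdict

# Minimal sketch of the semantics-space-time cube: cells keyed by
# (topic, location, time interval), each holding an aggregate measure
# (here simply a count of texts).
records = [
    ("traffic", "district_A", "2021-Q1"),
    ("traffic", "district_A", "2021-Q2"),
    ("traffic", "district_B", "2021-Q1"),
    ("weather", "district_A", "2021-Q1"),
]
cube = defaultdict(int)
for topic, loc, interval in records:
    cube[(topic, loc, interval)] += 1

def slice_cube(cube, axis, value):
    """Slicing: fix one facet to a value, keeping the cells of the
    remaining two dimensions (axis 0 = topic, 1 = location, 2 = time)."""
    return {k: v for k, v in cube.items() if k[axis] == value}

def project(cube, axis):
    """Projecting: aggregate over one facet, summing the measure
    across the removed dimension."""
    out = defaultdict(int)
    for key, v in cube.items():
        out[tuple(x for i, x in enumerate(key) if i != axis)] += v
    return dict(out)

topic_slice = slice_cube(cube, 0, "traffic")  # one topic: space x time
space_time = project(cube, 0)                 # drop the topic facet
print(len(topic_slice))                       # 3 cells
print(space_time[("district_A", "2021-Q1")])  # 2 texts
```

Each of the three parallel views in the described system can be seen as such a projection or slice, which is what lets complex semantics-space-time tasks decompose into simpler two-facet subtasks.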