    TOPICAL EXPRESSIVITY IN SHORT TEXTS

    With each passing minute, online data is growing exponentially. A large share of such data is generated on short-text social media platforms such as Twitter. Such platforms are fundamental to knowledge-based social media applications like recommender systems. Twitter, for example, provides rich real-time streaming information. Extracting knowledge from such short texts without automated support is not feasible because of the streaming nature of Twitter's platform. Therefore, many knowledge systems need an automated method for comprehending patterns in such text. This paper provides solutions for generating topics from Twitter data. We present several techniques related to topic modelling to identify topics of interest in short texts. Topic modelling is inherently problematic in shorter texts, which have a very sparse vocabulary and are written in informal language. Such findings are informative for knowledge extraction in social media-based recommender systems as well as for understanding tweeters over time.

    Iterative Matrix Factorization Method for Social Media Data Location Prediction

    The locations from which users posted their tweets, as collected by social media companies, vary in accuracy, and some are missing. We want to use the tweets with the highest location accuracy to help fill in the data for tweets with incomplete information. To test our algorithm, we used sets of social media data from a city and separated them into training sets, where we know all the information, and testing sets, where we intentionally pretend not to know the location. One prediction method, used in (Dukler, Han and Wang, 2016), requires appending a one-hot encoding of the location to the bag-of-words matrix to perform Location Oriented Nonnegative Matrix Factorization (LONMF). We improve on this algorithm by introducing iterative LONMF. We found that when the threshold and number of iterations are chosen correctly, we can predict tweet locations with higher accuracy than with LONMF.
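
The input construction mentioned in this abstract, a bag-of-words matrix with a one-hot location encoding appended, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `build_augmented_matrix`, the whitespace tokenizer, the one-row-per-tweet layout, and the all-zero location block for tweets with unknown location are all assumptions, and the sample tweets are invented. The factorization step itself and the iterative variant are not shown, since the abstract does not specify them.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple


def build_augmented_matrix(
    tweets: Sequence[str],
    locations: Sequence[Optional[str]],
) -> Tuple[List[List[int]], List[str], List[str]]:
    """Return (matrix, vocabulary, location_labels).

    Each row describes one tweet: its word counts followed by a one-hot
    location block. A tweet whose location is None (unknown) gets an
    all-zero location block (an assumption made for this sketch).
    """
    if len(tweets) != len(locations):
        raise ValueError("tweets and locations must have the same length")

    # Naive tokenizer: lowercase and split on whitespace.
    tokenized = [tweet.lower().split() for tweet in tweets]
    vocabulary = sorted({word for words in tokenized for word in words})
    location_labels = sorted({loc for loc in locations if loc is not None})

    matrix = []
    for words, loc in zip(tokenized, locations):
        counts = Counter(words)
        bag = [counts[word] for word in vocabulary]
        one_hot = [1 if loc == label else 0 for label in location_labels]
        matrix.append(bag + one_hot)
    return matrix, vocabulary, location_labels


# Hypothetical data: two tweets with known locations, one without.
tweets = ["coffee downtown", "beach sunset beach", "coffee sunset"]
locations = ["downtown", "beach", None]
matrix, vocabulary, location_labels = build_augmented_matrix(tweets, locations)
```

In a setup like this, a nonnegative factorization of the augmented matrix would yield reconstructed values in the location columns of the rows with unknown location, which could then be read as location scores.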

    On the relation between message sentiment and its virality on social media

    We investigate the relation between the sentiment of a message on social media and its virality, defined as the volume and speed of message diffusion. We analyze 4.1 million messages (tweets) obtained from Twitter. Although factors affecting message diffusion on social media have been studied previously, we focus on message sentiment and reveal how the polarity of message sentiment affects its virality. The virality of a message is characterized by the number of message repostings (retweets) and the time elapsed from the original posting of a message to its Nth reposting (N-retweet time). Through extensive analysis using the 4.1 million tweets and their retweets in 1 week, we discover that negative messages are likely to be reposted more rapidly and frequently than positive and neutral messages. Specifically, the reposting volume of negative messages is 20–60% higher than that of positive and neutral messages, and negative messages spread 25% faster than positive and neutral messages when the diffusion volume is quite high. We also perform longitudinal analysis of message diffusion observed over 1 year and find that recurrent diffusion of negative messages is less frequent than that of positive and neutral messages. Moreover, we present a simple message diffusion model that can reproduce the characteristics of message diffusion observed in this paper.
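
The two virality measures defined in this abstract, the retweet count and the N-retweet time, can be computed along these lines. This is a small sketch under stated assumptions: the function name `n_retweet_time` and the timestamps are hypothetical, and returning `None` when a message has fewer than N retweets is a choice made here, not something the abstract specifies.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def n_retweet_time(
    posted_at: datetime,
    retweet_times: Sequence[datetime],
    n: int,
) -> Optional[timedelta]:
    """Time elapsed from the original posting to the n-th retweet.

    Returns None when the message has fewer than n retweets.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    ordered = sorted(retweet_times)  # retweets may arrive unordered
    if len(ordered) < n:
        return None
    return ordered[n - 1] - posted_at


# Hypothetical message retweeted 30, 5, and 12 minutes after posting.
posted = datetime(2024, 1, 1, 12, 0, 0)
retweets = [posted + timedelta(minutes=m) for m in (30, 5, 12)]

retweet_count = len(retweets)                        # diffusion volume
second_retweet = n_retweet_time(posted, retweets, 2)  # diffusion speed
```

A smaller N-retweet time for the same N indicates faster diffusion, which is the sense in which the abstract compares negative messages with positive and neutral ones.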