373 research outputs found
A Human Word Association based model for topic detection in social networks
With the widespread use of social networks, detecting the topics discussed in
these networks has become a significant challenge. The current works are mainly
based on frequent pattern mining or semantic relations, and the language
structure is not considered. Language-structural methods aim to discover the
relationships between words and how humans understand them.
Therefore, this paper uses the Concept of the Imitation of the Mental Ability
of Word Association to propose a topic detection framework in social networks.
This framework is based on the Human Word Association method. A special
extraction algorithm has also been designed for this purpose. The performance
of this method is evaluated on the FA-CUP dataset. It is a benchmark dataset in
the field of topic detection. The results show that the proposed method
improves on other methods in terms of the Topic-recall and keyword F1
measures. Also, most of the previous works in the field of topic
detection are limited to the English language, and the Persian language,
especially microblogs written in this language, is considered a low-resource
language. Therefore, a data set of Telegram posts in the Farsi language has
been collected. Applying the proposed method to this dataset also shows that
this method works better than other topic detection methods
VIRAL TOPIC PREDICTION AND DESCRIPTION IN MICROBLOG SOCIAL NETWORKS
Ph.D., Doctor of Philosophy
Event detection from social network streams using frequent pattern mining with dynamic support values
Detecting events from streams of data is challenging due to the characteristics of such streams: data elements arrive in real time and at high velocity, and the size of the streams is typically unbounded, while it is not possible to backtrack over past data elements or maintain and review the entire history. Social networks are a good source for event identification as they generate a huge amount of timely information representing what users are posting and discussing. In this research, we are developing methods for event detection from streams of data. More specifically, we present a framework for detecting the daily events or topics that occur in social network streams related to major events. Our approach uses the Frequent Pattern Mining method to detect the frequent patterns occurring each day, which we treat as the detected events. In addition, we propose a dynamic support definition method to replace a fixed, given value. An experiment was run on two streams relating to two different major events to examine the detected events and to test our support definition method. The UK General Elections 2015 stream holds more than one million tweets, and the Greece Crisis 2015 stream contains more than 150k tweets. The detected events were evaluated against news headlines published the same day the event was found. The results revealed that the higher the streaming level (bigger window size), the more accurate the detected events. We also show that for windows that are too small, a stricter support definition method is needed to avoid detecting false or insignificant events
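The abstract above does not state how the dynamic support value is computed. As a minimal sketch only, assuming the threshold is a fixed fraction of the number of posts in the window (a hypothetical rule, not the authors' definition), frequent term sets in one window could be found as follows. The names `dynamic_min_support`, `frequent_patterns`, `ratio`, and `floor` are illustrative, and the brute-force counting stands in for a real frequent pattern mining algorithm such as Apriori or FP-Growth.

```python
from collections import Counter
from itertools import combinations


def dynamic_min_support(window_size, ratio=0.3, floor=2):
    """Hypothetical rule: the threshold scales with the window size.

    The source does not give the actual formula; this is an assumption.
    """
    return max(floor, int(ratio * window_size))


def frequent_patterns(posts, max_len=2, ratio=0.3):
    """Return term sets (as sorted tuples) whose count meets the threshold."""
    min_support = dynamic_min_support(len(posts), ratio)
    counts = Counter()
    for post in posts:
        # Each distinct term counts at most once per post.
        terms = sorted(set(post.lower().split()))
        for k in range(1, max_len + 1):
            counts.update(combinations(terms, k))
    return {p: c for p, c in counts.items() if c >= min_support}


# One small window of posts (made-up example text).
window = [
    "election debate tonight",
    "watching the election debate",
    "debate was heated",
    "election results tomorrow",
]
patterns = frequent_patterns(window, max_len=2, ratio=0.5)
```

With a ratio-based threshold, a larger window automatically demands more occurrences before a pattern is reported, which is one simple way to avoid reusing a single fixed support value across windows of different sizes.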
PREDICTION IN SOCIAL MEDIA FOR MONITORING AND RECOMMENDATION
Social media including blogs and microblogs provide a rich window into user online activity. Monitoring social media datasets can be expensive due to the scale and inherent noise in such data streams. Monitoring and prediction can provide significant benefit for many applications including brand monitoring and making recommendations. Consider a focal topic and posts on multiple blog channels on this topic. Being able to target a few potentially influential blog channels which will contain relevant posts is valuable. Once these channels have been identified, a user can proactively join the conversation themselves to encourage positive word-of-mouth and to mitigate negative word-of-mouth.
Links between different blog channels, and retweets and mentions between different microblog users, are a proxy of information flow and influence. When trying to monitor where information will flow and who will be influenced by a focal user, it is valuable to predict future links, retweets and mentions. Predictions of users who will post on a focal topic or who will be influenced by a focal user can yield valuable recommendations.
In this thesis we address the problem of prediction in social media to select social media channels for monitoring and recommendation. Our analysis focuses on individual authors and linkers. We address a series of prediction problems, including the future author prediction problem and the future link prediction problem in the blogosphere, as well as prediction in microblogs such as Twitter.
For future author prediction in the blogosphere, where there are network properties and content properties, we develop prediction methods inspired by information retrieval approaches that use historical posts in the blog channel for prediction. We also train a ranking support vector machine (SVM) to solve the problem, considering both network properties and content properties. We identify a number of features that affect prediction accuracy. For future link prediction in the blogosphere, we compare multiple link prediction methods and show that our proposed solution, which combines the network properties of the blog with content properties, does better than methods that examine network properties or content properties in isolation. Most of the previous work has only looked at one or the other. For prediction in microblogs, where there are follower, retweet, and mention networks, we propose a prediction model that uses the hybrid network for prediction. In this model, we define a potential function that reflects the likelihood of a candidate user having a specific type of link to a focal user in the future, and we formulate an optimization problem based on the principle of maximum likelihood to determine the parameters of the model. We propose different approximate approaches based on the prediction model. Our approaches are demonstrated to outperform the baseline methods, which only consider one network or use hybrid networks in a naive way. The prediction model can be applied to other similar problems where hybrid networks exist
Persian topic detection based on Human Word association and graph embedding
In this paper, we propose a framework to detect topics in social media based
on Human Word Association. Identifying topics discussed in these media has
become a critical and significant challenge. Most of the work done in this area
is in English, but not much has been done in the Persian language, especially
for microblogs written in Persian. Also, the existing works focused more on
exploring frequent patterns or semantic relationships and ignored the
structural methods of language. In this paper, a topic detection framework
using HWA, a method for Human Word Association, is proposed. This method uses
the concept of imitation of mental ability for word association. This method
also calculates the Associative Gravity Force that shows how words are related.
Using this parameter, a graph can be generated. The topics can be extracted by
embedding this graph and using clustering methods. This approach has been
applied to a Persian language dataset collected from Telegram. Several
experimental studies have been performed to evaluate the proposed framework's
performance. Experimental results show that this approach works better than
other topic detection methods
Adaptive Method for Following Dynamic Topics on Twitter
Many social research studies of public response on social media require following (i.e., tracking) topics on Twitter for long periods of time. The current approaches rely on streaming tweets based on some hashtags or keywords, or on following some Twitter accounts. Such approaches lead to limited coverage of on-topic tweets. In this paper, we introduce a novel technique for following such topics in a more effective way. A topic is defined as a set of well-prepared queries that cover the static side of the topic. We propose an automatic approach that adapts to emerging aspects of a tracked broad topic over time. We tested our tracking approach on three broad dynamic topics that are hot in different categories: Egyptian politics, Syrian conflict, and international sports. We measured the effectiveness of our approach over four full days spanning a period of four months to ensure consistency in effectiveness. Experimental results showed that, on average, our approach achieved an increase of over 100% in recall relative to the baseline Boolean approach while maintaining an acceptable precision of 83%