640 research outputs found

    Time-aware topic recommendation based on micro-blogs

    Topic recommendation can help users deal with information overload in micro-blogging communities. This paper proposes using the implicit information network formed by the multiple relationships among users, topics and micro-blogs, together with the temporal information of micro-blogs, to find semantically and temporally relevant topics for each topic and to profile users' time-drifting topic interests. Content-based, nearest-neighborhood-based and matrix factorization models are used to make personalized recommendations. The effectiveness of the proposed approaches is demonstrated in experiments conducted on a real-world dataset collected from Twitter.com.

    Real-time detection and sorting of news on microblogging platforms

    Due to the increasing popularity of microblogging platforms (e.g., Twitter), detecting real-time news from microblogs (e.g., tweets) has recently drawn a lot of attention. Most of the previous work on this subject detects news by analyzing propagation patterns of microblogs. This approach has two limitations: (i) many non-news microblogs (e.g., marketing activities) have propagation patterns similar to news microblogs and can therefore be falsely reported as news; (ii) using propagation patterns to identify news involves a time delay until the pattern is formed, so news is not detected in real time. We propose an alternative approach which, motivated by the necessity of real-time detection of news, does not rely on propagation of posts. Moreover, we propose a real-time sorting strategy that orders the detected news microblogs using a translational approach. An experimental evaluation on a large-scale microblogging dataset demonstrates the effectiveness of our approach.

    A novel data analytic model for mining user insurance demands from microblogs

    This paper proposes a method based on the LDA model and Word2Vec for analyzing microblog users' insurance demands. First, we use the LDA model to analyze the text data of microblog users and obtain their candidate topics. Second, we use the CBOW model to vectorize the topic words and expand them through word similarity calculation. We then use the K-means model to cluster the expanded words and redefine the topic categories. Next, we use the LDA model to extract the keywords of various insurance information on the “Pingan Insurance” website and, with the help of word vector similarity, analyze how likely users with different demands are to purchase each type of insurance. Finally, the validity of the method is verified against microblog user information. The experimental results show that the accuracy, recall rate and F1 value of the LDA-CBOW extending method are each improved compared with those of the traditional LDA model, which supports the feasibility of this method. The results of this paper will help insurance companies accurately grasp the preferences of microblog users, understand users' potential insurance needs in a timely manner, and lay a foundation for personalized recommendation of insurance products.
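
    The word-expansion step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the similarity threshold and the toy two-dimensional vectors are all assumptions made for illustration, and a real pipeline would take its vectors from a trained CBOW model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_topic_words(seeds, vectors, threshold=0.8):
    """Add every vocabulary word whose vector is close to some seed word.

    `seeds` are topic words (e.g. from LDA); `vectors` maps word -> vector.
    The threshold is a hypothetical tuning parameter, not taken from the paper.
    """
    expanded = set(seeds)
    for word, vec in vectors.items():
        for seed in seeds:
            if seed in vectors and cosine(vec, vectors[seed]) >= threshold:
                expanded.add(word)
                break
    return expanded


# Toy vectors standing in for CBOW embeddings (made up for illustration).
toy_vectors = {
    "car": [1.0, 0.1],
    "vehicle": [0.9, 0.2],
    "health": [0.1, 1.0],
}
expanded = expand_topic_words(["car"], toy_vectors)
```

    With these toy vectors, "vehicle" joins the "car" topic while "health" does not; the expanded word set could then be clustered, for example with K-means.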

    Hashtag biased ranking for keyword extraction from microblog posts

    © Springer International Publishing Switzerland 2015. Nowadays, a huge amount of text is being generated for social networking purposes on the Web. Keyword extraction from such text benefits many applications such as advertising, search, and content filtering. Recent studies show that graph-based ranking is more effective than traditional term or document frequency based approaches. However, most work in the literature constructs a word-to-word graph within a document or a collection of documents before applying some form of random walk. Such a graph does not consider the influence of document importance on keyword extraction. Moreover, social text such as a microblog post usually has special social features, such as hashtags, which can help us understand its topic. In this paper, we propose hashtag biased ranking for keyword extraction from a collection of microblog posts. We first build a word-post weighted graph that takes the posts themselves into account. Then, a hashtag biased random walk is applied on this graph, which guides our approach to extract keywords according to the hashtag topic. Last, the final ranking of a word is determined by its stationary probability after a number of iterations. We evaluate our proposed method on a real collection of Chinese microblog posts. Experiments show that our method is more effective than traditional word-to-word graph based ranking in terms of precision.
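
    The idea of a hashtag biased random walk on a word-post graph can be sketched as a random walk with restart. This is a rough illustration under stated assumptions, not the paper's algorithm: edge weights are plain term counts, the walk restarts uniformly at posts containing the hashtag, and the function name, damping factor and iteration count are hypothetical choices.

```python
from collections import defaultdict


def hashtag_biased_rank(posts, hashtag, damping=0.85, iterations=50):
    """Rank words by a random walk with restart on a word-post graph.

    `posts` is a list of token lists. Returns a dict word -> score.
    """
    # Bipartite weighted graph: ("w", word) <-> ("p", post index).
    edges = defaultdict(dict)
    for i, tokens in enumerate(posts):
        for tok in tokens:
            w, p = ("w", tok), ("p", i)
            edges[w][p] = edges[w].get(p, 0.0) + 1.0
            edges[p][w] = edges[p].get(w, 0.0) + 1.0
    nodes = list(edges)

    # Restart distribution: uniform over posts that contain the hashtag
    # (assumption); fall back to all nodes if no post contains it.
    biased = [("p", i) for i, toks in enumerate(posts) if hashtag in toks]
    if not biased:
        biased = nodes
    restart = {n: 0.0 for n in nodes}
    for n in biased:
        restart[n] = 1.0 / len(biased)

    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            total = sum(edges[n].values())
            for m, weight in edges[n].items():
                nxt[m] += damping * score[n] * weight / total
        score = nxt

    # Keep only word nodes in the final ranking.
    return {name: s for (kind, name), s in score.items() if kind == "w"}


toy_posts = [
    ["#ai", "model", "training"],
    ["#ai", "model"],
    ["#food", "pizza"],
]
ranks = hashtag_biased_rank(toy_posts, "#ai")
```

    In this toy run, words that co-occur with "#ai" outrank "pizza", whose post is never a restart target. The hashtag itself is also a word node here and would normally be filtered out of the keyword list.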

    PREDICTION IN SOCIAL MEDIA FOR MONITORING AND RECOMMENDATION

    Social media, including blogs and microblogs, provide a rich window into user online activity. Monitoring social media datasets can be expensive due to the scale of such data streams and their inherent noise. Monitoring and prediction can provide significant benefit for many applications, including brand monitoring and making recommendations. Consider a focal topic and posts on multiple blog channels on this topic. Being able to target a few potentially influential blog channels which will contain relevant posts is valuable. Once these channels have been identified, a user can proactively join the conversation themselves to encourage positive word-of-mouth and to mitigate negative word-of-mouth. Links between different blog channels, and retweets and mentions between different microblog users, are a proxy for information flow and influence. When trying to monitor where information will flow and who will be influenced by a focal user, it is valuable to predict future links, retweets and mentions. Predictions of users who will post on a focal topic or who will be influenced by a focal user can yield valuable recommendations. In this thesis we address the problem of prediction in social media to select social media channels for monitoring and recommendation. Our analysis focuses on individual authors and linkers. We address a series of prediction problems, including the future author prediction problem and the future link prediction problem in the blogosphere, as well as prediction in microblogs such as Twitter. For future author prediction in the blogosphere, where there are network properties and content properties, we develop prediction methods inspired by information retrieval approaches that use historical posts in the blog channel for prediction. We also train a ranking support vector machine (SVM) to solve the problem, considering both network properties and content properties. We identify a number of features which have an impact on prediction accuracy.
    For future link prediction in the blogosphere, we compare multiple link prediction methods and show that our proposed solution, which combines the network properties of the blog with content properties, does better than methods which examine network properties or content properties in isolation. Most of the previous work has looked at only one or the other. For prediction in microblogs, where there are a follower network, a retweet network, and a mention network, we propose a prediction model that utilizes the hybrid network. In this model, we define a potential function that reflects the likelihood of a candidate user having a specific type of link to a focal user in the future, and we formulate an optimization problem by the principle of maximum likelihood to determine the parameters of the model. We propose different approximate approaches based on the prediction model. Our approaches are demonstrated to outperform the baseline methods, which only consider one network or utilize hybrid networks in a naive way. The prediction model can be applied to other similar problems where hybrid networks exist.