69,646 research outputs found

    PREDICTION IN SOCIAL MEDIA FOR MONITORING AND RECOMMENDATION

    Get PDF
    Social media including blogs and microblogs provide a rich window into user online activity. Monitoring social media datasets can be expensive due to the scale and inherent noise in such data streams. Monitoring and prediction can provide significant benefit for many applications including brand monitoring and making recommendations. Consider a focal topic and posts on multiple blog channels on this topic. Being able to target a few potentially influential blog channels which will contain relevant posts is valuable. Once these channels have been identified, a user can proactively join the conversation themselves to encourage positive word-of-mouth and to mitigate negative word-of-mouth. Links between different blog channels, and retweets and mentions between different microblog users, are a proxy of information flow and influence. When trying to monitor where information will flow and who will be influenced by a focal user, it is valuable to predict future links, retweets and mentions. Predictions of users who will post on a focal topic or who will be influenced by a focal user can yield valuable recommendations. In this thesis we address the problem of prediction in social media to select social media channels for monitoring and recommendation. Our analysis focuses on individual authors and linkers. We address a series of prediction problems including future author prediction problem and future link prediction problem in the blogosphere, as well as prediction in microblogs such as twitter. For the future author prediction in the blogosphere, where there are network properties and content properties, we develop prediction methods inspired by information retrieval approaches that use historical posts in the blog channel for prediction. We also train a ranking support vector machine (SVM) to solve the problem, considering both network properties and content properties. We identify a number of features which have impact on prediction accuracy. For the future link prediction in the blogosphere, we compare multiple link prediction methods, and show that our proposed solution which combines the network properties of the blog with content properties does better than methods which examine network properties or content properties in isolation. Most of the previous work has only looked at either one or the other. For the prediction in microblogs, where there are follower network, retweet network, and mention network, we propose a prediction model to utilize the hybrid network for prediction. In this model, we define a potential function that reflects the likelihood of a candidate user having a specific type of link to a focal user in the future and identify an optimization problem by the principle of maximum likelihood to determine the parameters in the model. We propose different approximate approaches based on the prediction model. Our approaches are demonstrated to outperform the baseline methods which only consider one network or utilize hybrid networks in a naive way. The prediction model can be applied to other similar problems where hybrid networks exist

    Coauthor prediction for junior researchers

    Get PDF
    Research collaboration can bring in different perspectives and generate more productive results. However, finding an appropriate collaborator can be difficult due to the lacking of sufficient information. Link prediction is a related technique for collaborator discovery; but its focus has been mostly on the core authors who have relatively more publications. We argue that junior researchers actually need more help in finding collaborators. Thus, in this paper, we focus on coauthor prediction for junior researchers. Most of the previous works on coauthor prediction considered global network feature and local network feature separately, or tried to combine local network feature and content feature. But we found a significant improvement by simply combing local network feature and global network feature. We further developed a regularization based approach to incorporate multiple features simultaneously. Experimental results demonstrated that this approach outperformed the simple linear combination of multiple features. We further showed that content features, which were proved to be useful in link prediction, can be easily integrated into our regularization approach. © 2013 Springer-Verlag

    Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora

    Full text link
    Much of scientific progress stems from previously published findings, but searching through the vast sea of scientific publications is difficult. We often rely on metrics of scholarly authority to find the prominent authors but these authority indices do not differentiate authority based on research topics. We present Latent Topical-Authority Indexing (LTAI) for jointly modeling the topics, citations, and topical authority in a corpus of academic papers. Compared to previous models, LTAI differs in two main aspects. First, it explicitly models the generative process of the citations, rather than treating the citations as given. Second, it models each author's influence on citations of a paper based on the topics of the cited papers, as well as the citing papers. We fit LTAI to four academic corpora: CORA, Arxiv Physics, PNAS, and Citeseer. We compare the performance of LTAI against various baselines, starting with the latent Dirichlet allocation, to the more advanced models including author-link topic model and dynamic author citation topic model. The results show that LTAI achieves improved accuracy over other similar models when predicting words, citations and authors of publications.Comment: Accepted by Transactions of the Association for Computational Linguistics (TACL); to appea

    On the Predictability of Talk Attendance at Academic Conferences

    Full text link
    This paper focuses on the prediction of real-world talk attendances at academic conferences with respect to different influence factors. We study the predictability of talk attendances using real-world tracked face-to-face contacts. Furthermore, we investigate and discuss the predictive power of user interests extracted from the users' previous publications. We apply Hybrid Rooted PageRank, a state-of-the-art unsupervised machine learning method that combines information from different sources. Using this method, we analyze and discuss the predictive power of contact and interest networks separately and in combination. We find that contact and similarity networks achieve comparable results, and that combinations of different networks can only to a limited extend help to improve the prediction quality. For our experiments, we analyze the predictability of talk attendance at the ACM Conference on Hypertext and Hypermedia 2011 collected using the conference management system Conferator
    corecore