92 research outputs found

    Unsupervised keyword extraction from microblog posts via hashtags

    © River Publishers. Huge amounts of text are now generated for social networking purposes on the Web. Keyword extraction from such texts, for example microblog posts, benefits many applications such as advertising, search, and content filtering. Unlike traditional web pages, a microblog post usually carries special social features such as hashtags, which are topical in nature and generated by users. Extracting keywords related to hashtags can reflect the intents of users and thus gives a better understanding of post content. In this paper, we propose a novel unsupervised keyword extraction approach for microblog posts that treats hashtags as topical indicators. Our approach consists of two hashtag-enhanced algorithms. The first is a topic model algorithm that infers topic distributions biased toward hashtags over a collection of microblog posts; words are ranked by their average topic probabilities. This algorithm not only finds the topics of a collection but also extracts hashtag-related keywords. The second is a random-walk-based algorithm. It first builds a weighted word-post graph that takes the posts themselves into account. A hashtag-biased random walk is then applied to this graph, which guides the algorithm to extract keywords according to hashtag topics. The final ranking score of a word is its stationary probability after a number of iterations. We evaluate the proposed approach on a collection of real Chinese microblog posts. Experiments show that our approach achieves higher precision than traditional approaches that ignore hashtags, and that combining the two algorithms performs better than either algorithm alone.
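The random-walk stage described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the edge weights, the restart scheme, or the iteration count, so the sketch assumes term counts as edge weights, a restart distribution that is uniform over hashtag nodes, and a damping factor of 0.85. The function name `hashtag_biased_walk` is hypothetical.

```python
from collections import defaultdict


def hashtag_biased_walk(posts, hashtags, damping=0.85, iterations=50):
    """Rank non-hashtag words by a random walk with restart on a word-post graph.

    posts:    list of token lists, one per post
    hashtags: set of tokens treated as topical seeds (assumed restart nodes)
    """
    # Undirected bipartite graph: word nodes <-> post nodes.
    # Assumption: the edge weight is the count of the word in the post.
    weights = defaultdict(lambda: defaultdict(float))
    for i, tokens in enumerate(posts):
        for tok in tokens:
            weights[("w", tok)][("p", i)] += 1.0
            weights[("p", i)][("w", tok)] += 1.0

    nodes = list(weights)
    seeds = {n for n in nodes if n[0] == "w" and n[1] in hashtags}
    if not seeds:
        raise ValueError("none of the hashtags occurs in the posts")

    # Restart distribution: uniform over the hashtag word nodes.
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}

    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            total = sum(weights[n].values())
            for m, w in weights[n].items():
                # Pass a share of n's score to each neighbour, by edge weight.
                nxt[m] += damping * score[n] * w / total
        score = nxt

    words = {n[1]: s for n, s in score.items()
             if n[0] == "w" and n[1] not in hashtags}
    return sorted(words.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `hashtag_biased_walk(posts, {"#ai"})` returns `(word, score)` pairs in descending order. Words that never share a post with a seed hashtag receive scores that shrink toward zero as the number of iterations grows, which is the intended effect of biasing the restart toward hashtags.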

    Graph Convolutional Networks and Attention Mechanism for Rumor Detection in Social Media

    Master's thesis, Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, August 2020. Kim Jong-kwon (김종권).
    Social media is a powerful disseminator of new information and ideas. Because sharing information on it is so easy, however, social media has also become an ideal platform for propagating rumors, fake news, and misinformation. Rumors on social media not only mislead online users but can also strongly affect the real world, so detecting rumors and preventing their spread has become an essential task. Previous learning-based rumor detection methods use content, user, or propagation features of rumors. However, these methods represent rumor propagation only as static graphs, which are not optimal for capturing the dynamic information of rumors. In this study, we propose Dynamic GCN, a novel graph convolutional network model with an attention mechanism, for rumor detection. We first represent rumor posts and their responsive posts as dynamic graphs, using temporal information to generate a sequence of graph snapshots. Representation learning on the graph snapshots with an attention mechanism captures both the structural and the temporal information of rumor spread. Experiments on two real-world Twitter datasets show that Dynamic GCN achieves superior results over state-of-the-art models on the rumor detection task.
    Contents: Chapter I Introduction. Chapter II Related Work: 2.1 Rumor Detection; 2.2 Graph Convolutional Networks; 2.3 Learning Sequences & Attention Mechanism. Chapter III Problem Definition. Chapter IV Dynamic GCN with Attention Mechanism: 4.1 Snapshot Generation; 4.2 Graph Convolutional Networks; 4.3 Readout Layer; 4.4 Attention Mechanism; 4.5 Training & Prediction. Chapter V Experiments: 5.1 Datasets; 5.2 Baselines; 5.3 Experimental Setup & Implementation Details; 5.4 Performance Evaluations; 5.5 Ablation Study. Chapter VI Conclusion. Bibliography. Abstract (Korean).

    NEEDMINING: IDENTIFYING MICRO BLOG DATA CONTAINING CUSTOMER NEEDS

    The design of new products and services starts with identifying the needs of potential customers or users. Many existing methods, such as observations, surveys, and experiments, require dedicated effort to elicit unsatisfied needs from individuals. At the same time, a huge amount of user-generated content in micro blogs is freely accessible at no cost. While this information is already analyzed to monitor sentiment towards existing offerings, it has not yet been tapped for the elicitation of needs. In this paper, we lay an important foundation for this endeavor: we propose a Machine Learning approach to identify those posts that do express needs. Our evaluation of tweets in the e-mobility domain demonstrates that the small share of relevant tweets can be identified with remarkable precision or recall results. Applied to huge data sets, the developed method should enable scalable need elicitation support for innovation managers across thousands of users, and thus augment the service design tool set available to them.

    Stock market prediction using machine learning classifiers and social media, news

    Accurate stock market prediction is of great interest to investors; however, stock markets are driven by volatile factors such as microblogs and news, which makes it hard to predict a stock market index from historical data alone. The enormous volatility of stock markets emphasizes the need to assess the role of external factors in stock prediction. Stock markets can be predicted by applying machine learning algorithms to information contained in social media and financial news, as this data can change investors' behavior. In this paper, we apply such algorithms to social media and financial news data to measure the impact of this data on stock market prediction accuracy for ten subsequent days. To improve the performance and quality of predictions, feature selection and spam tweet reduction are performed on the data sets. Moreover, we perform experiments to find which stock markets are difficult to predict and which are more influenced by social media and financial news. We compare the results of different algorithms to find a consistent classifier. Finally, to achieve maximum prediction accuracy, deep learning is used and some classifiers are ensembled. Our experimental results show that the highest prediction accuracies of 80.53% and 75.16% are achieved using social media and financial news, respectively. We also show that New York and Red Hat stock markets are hard to predict, that New York and IBM stocks are more influenced by social media, and that London and Microsoft stocks are more influenced by financial news. The random forest classifier is found to be consistent, and the highest accuracy of 83.22% is achieved by its ensemble.

    ACQR: A Novel Framework to Identify and Predict Influential Users in Micro-Blogging

    As key actors in online social networks, influential users in micro-blogging can shape the attitudes or behaviour of others. In marketing, a user's influence should be tied to a specific topic or field, since people differ in their preference for and expertise in it. To identify and predict influential users on a specific topic more effectively, this research combines users' actual influence on that topic with their potential, topic-independent influence in a novel comprehensive framework named "ACQR". The ACQR framework describes influentials along four dimensions: activeness (A), centrality (C), quality of post (Q), and reputation (R). Based on this framework, a data mining method is developed for discovering and forecasting the top influentials. Empirical results show that the ACQR framework and the data mining method, which uses TOPSIS and SVMs (with polynomial and RBF kernels), perform very well in identifying and predicting influential users on a given topic (such as iPhone 5). Furthermore, the changes in users' influence over time are analysed from a longitudinal perspective, and suggestions for sales managers are provided.
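The abstract names TOPSIS as the ranking step but does not say how the four ACQR criteria are scaled or weighted. The sketch below shows the standard TOPSIS procedure under two assumptions that are not from the source: all four criteria are treated as benefit criteria (higher is better), and the weights are supplied by the caller. The function name `topsis` and any example scores are hypothetical.

```python
import math


def topsis(matrix, weights):
    """Return a closeness score in [0, 1] for each row (user).

    matrix:  rows are users, columns are criteria such as A, C, Q, R.
    weights: one weight per criterion (assumed given; not from the paper).
    Assumption: every criterion is a benefit criterion.
    """
    # Step 1: vector-normalise each column, then apply the weights.
    columns = list(zip(*matrix))
    norms = [math.sqrt(sum(x * x for x in col)) or 1.0 for col in columns]
    weighted = [[w * x / n for x, n, w in zip(row, norms, weights)]
                for row in matrix]

    # Step 2: ideal best and ideal worst value per criterion.
    best = [max(col) for col in zip(*weighted)]
    worst = [min(col) for col in zip(*weighted)]

    # Step 3: relative closeness to the ideal solution.
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores
```

Sorting users by the returned score in descending order gives a candidate list of top influentials. A user who is best on every criterion scores 1.0, and one who is worst on every criterion scores 0.0.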

    Predicting Influencer Virality on Twitter

    The ability to successfully predict virality on Twitter holds great potential as a resource for Twitter influencers, enabling the development of more sophisticated strategies for audience engagement, audience monetization, and information sharing. To our knowledge, focusing exclusively on tweets posted by influencers is a novel context for studying Twitter virality. We find, among feature categories traditionally considered in the literature, that models combining categories covering a range of information perform better than models incorporating only individual feature categories. Moreover, our general predictive model, encompassing a range of feature categories, achieves a prediction accuracy of 68% for influencer virality. We also investigate the role of influencer audiences in predicting virality, a topic we believe to be understudied in the literature. We suspect that incorporating audience information will allow us to better discriminate between virality classes, thus leading to better predictions. We pursue two different approaches, resulting in 10 different predictive models that leverage influencer audience information in addition to traditional feature categories. Both of our attempts to incorporate audience information plateau at an accuracy of approximately 61%, roughly 7 percentage points below our general predictive model. We conclude that we are unable to find experimental evidence to support our hypothesis that incorporating influencer audience information will improve virality predictions. Nonetheless, the performance of our general model holds promise for the deployment of a tool that allows influencers to reap the benefits of virality prediction. As stronger performance from the underlying model would make this tool more useful to influencers in practice, improving the predictive performance of our general model is a cornerstone of future work.

    Advances in Emotion Recognition: Link to Depressive Disorder

    Emotion recognition enables real-time analysis, tagging, and inference of cognitive-affective states from human facial expressions, speech and tone, body posture, and physiological signals, as well as from text on social network platforms. Emotion patterns can be decoded through computational modeling of explicit and implicit features extracted via wearable and other devices. Emotion recognition and computation are also critical to the detection and diagnosis of potential patients with mood disorders. This chapter summarizes the main findings in affective recognition and its applications in major depressive disorder (MDD), an area that has made rapid progress in the last decade.