
    Social event detection with retweeting behavior correlation

    Event detection over microblogs has attracted great research interest due to its wide application in crisis management and decision making. In natural disasters, complex events are reported in real time on social media sites, but these reports are invisible to crisis coordinators. Detecting these crisis events helps watchers make the right decisions rapidly, reducing injuries, deaths and economic loss. In sporting activities, detecting events helps audiences make better and more timely game viewing plans. However, existing event detection techniques are not effective at handling complex social events that evolve over time. In this paper, we propose an event detection method that takes advantage of retweeting behavior to handle event evolution. Specifically, we first propose a topic model called RL-LDA to capture social media information over hashtags, location, text and retweeting behavior. Using RL-LDA, a complex event can be handled well by exploring the correlation between retweeting behavior and the event. Then, to maintain RL-LDA in a dynamic environment, we propose a dynamic update algorithm, which incrementally updates events over real-time streams. Experiments over real-world datasets show that RL-LDA detects the temporal evolution of complex events effectively and efficiently.

    Active Keyword Selection to Track Evolving Topics on Twitter

    How can we study social interactions on evolving topics at a mass scale? Over the past decade, researchers from diverse fields such as economics, political science, and public health have often done this by querying Twitter's public API endpoints with hand-picked topical keywords to search or stream discussions. However, despite the API's accessibility, it remains difficult to select and update keywords to collect high-quality data relevant to topics of interest. In this paper, we propose an active learning method for rapidly refining query keywords to increase both the yielded topic relevance and dataset size. We leverage a large open-source COVID-19 Twitter dataset to illustrate the applicability of our method in tracking Tweets around the key sub-topics of Vaccine, Mask, and Lockdown. Our experiments show that our method achieves an average topic-related keyword recall 2x higher than baselines. We open-source our code along with a web interface for keyword selection to make data collection from Twitter more systematic for researchers. Comment: 10 pages, 3 figures

    Hashtag Analysis of Indonesian COVID-19 Tweets Using Social Network Analysis

    Social media has become more critical for people communicating about the COVID-19 pandemic. On social media, hashtags are social annotations that are often used to denote message content. They serve as an intuitive and flexible tool for making huge collections of posts searchable on Twitter. Through the practice of hashtagging, user representations of a given post also become connected. This study aimed to analyze the hashtags of Indonesian COVID-19 tweets using Social Network Analysis (SNA). We used SNA techniques to visualize network models and measure several centrality metrics to find the most influential hashtag in the network. We collected and analyzed 500,000 public tweets from Twitter based on COVID-19 keywords. Based on the centrality measurements, #corona is the hashtag with the most connections to other hashtags. The hashtag #COVID19 is the one most closely related to all other hashtags. The hashtag #corona is also the one that most often acts as a bridge that can control the flow of information related to COVID-19. The hashtag #coronavirus is the most important hashtag based on its links. Our study also found, based on network visualization, that the hashtags #covid19 and #wabah have a substantial relationship with religion-related hashtags.
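
    The centrality idea in this abstract can be illustrated with a minimal sketch. The abstract does not name the measures used, so degree centrality ("most connections") and closeness centrality ("most closely related") are assumed here as two common choices. The function names and the toy tweets are hypothetical and are not taken from the study.

```python
from collections import defaultdict, deque
from itertools import combinations


def build_hashtag_graph(tweets):
    """Link two hashtags whenever they co-occur in the same tweet."""
    graph = defaultdict(set)
    for tags in tweets:
        unique = {t.lower() for t in tags}
        for tag in unique:
            graph[tag]  # touch the key so isolated hashtags are kept
        for a, b in combinations(sorted(unique), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Fraction of the other hashtags that each hashtag is linked to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def closeness_centrality(graph):
    """Reachable nodes divided by the sum of shortest-path distances.

    Equals the textbook (n - 1) / sum(distances) on a connected graph;
    no extra correction is applied for disconnected components.
    """
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source hashtag
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical toy data: each inner list is the hashtags of one tweet.
toy_tweets = [
    ["#corona", "#covid19"],
    ["#corona", "#wabah"],
    ["#corona", "#coronavirus"],
    ["#covid19", "#coronavirus"],
]
toy_graph = build_hashtag_graph(toy_tweets)
toy_degree = degree_centrality(toy_graph)
toy_closeness = closeness_centrality(toy_graph)
print(max(toy_degree, key=toy_degree.get))
```

    On a real collection, the same graph could be passed to a dedicated network-analysis library to obtain bridge-style measures such as betweenness centrality as well.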

    Sub-story detection in Twitter with hierarchical Dirichlet processes

    Social media has now become the de facto information source on real world events. The challenge, however, due to the high volume and velocity nature of social media streams, is in how to follow all posts pertaining to a given event over time – a task referred to as story detection. Moreover, there are often several different stories pertaining to a given event, which we refer to as sub-stories and the corresponding task of their automatic detection – as sub-story detection. This paper proposes hierarchical Dirichlet processes (HDP), a probabilistic topic model, as an effective method for automatic sub-story detection. HDP can learn sub-topics associated with sub-stories which enables it to handle subtle variations in sub-stories. It is compared with state-of-the-art story detection approaches based on locality sensitive hashing and spectral clustering. We demonstrate the superior performance of HDP for sub-story detection on real world Twitter data sets using various evaluation measures. The ability of HDP to learn sub-topics helps it to recall the sub-stories with high precision. This has resulted in an improvement of up to 60% in the F-score performance of HDP based sub-story detection approach compared to standard story detection approaches. A similar performance improvement is also seen using an information theoretic evaluation measure proposed for the sub-story detection task. Another contribution of this paper is in demonstrating that considering the conversational structures within the Twitter stream can bring up to 200% improvement in sub-story detection performance

    Detecting Political Framing Shifts and the Adversarial Phrases within Rival Factions and Ranking Temporal Snapshot Contents in Social Media

    Social Computing is an area of computer science concerned with the dynamics of communities and cultures created through computer-mediated social interaction. Various social media platforms, such as social network services and microblogging, enable users to come together and create social movements expressing their opinions on diverse sets of issues, events, complaints, grievances, and goals. Methods for monitoring and summarizing these types of sociopolitical trends, their leaders and followers, messages, and dynamics are needed. In this dissertation, a framework comprising community and content-based computational methods is presented to provide insights into multilingual and noisy political social media content. First, a model is developed to predict the emergence of viral hashtag breakouts, using network features. Next, another model is developed to detect and compare individual and organizational accounts, using a set of domain- and language-independent features. The third model exposes contentious issues driving reactionary dynamics between opposing camps. The fourth model develops community detection and visualization methods to reveal underlying dynamics and the key messages that drive them. The final model presents a use case methodology for detecting and monitoring foreign influence, wherein a state actor and news media under its control attempt to shift public opinion by framing information to support multiple adversarial narratives that facilitate their goals. In each case, a discussion of novel aspects and contributions of the models is presented, as well as quantitative and qualitative evaluations. An analysis of multiple conflict situations will be conducted, covering areas in the UK, Bangladesh, Libya and the Ukraine where adversarial framing led to polarization, declines in social cohesion, social unrest, and even civil wars (e.g., Libya and the Ukraine). Dissertation/Thesis. Doctoral Dissertation, Computer Science 201

    People opinion topic model: opinion based user clustering in social networks

    Mining hotly discussed topics and the corresponding opinions of different groups of people in social media (e.g., Twitter) is very useful. For example, a decision maker in a company wants to know what different groups of people (customers, staff, competitors, etc.) think about the company's services, facilities, and things happening around them. In this paper, we focus on the problem of finding opinion variations across different groups of people and introduce the concept of opinion-based community detection. Further, we introduce a generative graphical model, namely the People Opinion Topic (POT) model, which detects social communities and their associated hotly discussed topics and performs sentiment analysis simultaneously by modelling users' social connections, common interests, and opinions in a unified way. This paper is the first attempt to study community and opinion mining together. Compared with traditional social community detection, the communities detected by the POT model are more interpretable and meaningful. In addition, we analyse how diverse opinions are distributed and propagated among various social communities. Experiments on a real Twitter dataset indicate that our model is effective.

    Hybrid intelligence for data mining

    Today, enormous amounts of data are being recorded in all kinds of activities. This sheer size provides an excellent opportunity for data scientists to retrieve valuable information using data mining techniques. Due to the complexity of data in many neoteric problems, one-size-fits-all solutions are seldom able to provide satisfactory answers. Although the study of data mining has been active, hybrid techniques are rarely scrutinized in detail. Currently, not many techniques can handle time-varying properties while performing their core functions, nor do they retrieve and combine information from heterogeneous dimensions, e.g., textual and numerical horizons. This thesis summarizes our investigations on hybrid methods to provide data mining solutions to problems involving non-trivial datasets, such as trajectories, microblogs, and financial data. First, time-varying dynamic Bayesian networks are extended to consider both causal and dynamic regularization requirements. Combined with density-based clustering, the enhancements overcome the difficulties in modeling spatial-temporal data where heterogeneous patterns, data sparseness and distribution skewness are common. Secondly, topic-based methods are proposed for emerging outbreak and virality predictions on microblogs. Complicated models that consider structural details are popular, while others make overly simplified assumptions, sacrificing accuracy for efficiency. Our proposed virality prediction solution delivers the best of both worlds: it considers the important characteristics of a structure without the burden of fine details, which reduces complexity. Thirdly, the proposed topic-based approach for microblog mining is extended to sentiment prediction problems in finance. Sentiment-of-topic models are learned from both commentaries and prices for better risk management. Moreover, the previously proposed supervised topic model provides an avenue to associate market volatility with financial news, yet it displays poor resolution in extreme regions. To overcome this problem, an extreme topic model is proposed to predict volatility in financial markets using supervised learning. By mapping extreme events into Poisson point processes, volatile regions are magnified to reveal their hidden volatility-topic relationships. Lastly, some of the proposed hybrid methods are applied to service computing to verify that they are sufficiently generic for wider applications.

    What’s Happening Around the World? A Survey and Framework on Event Detection Techniques on Twitter

    © 2019, Springer Nature B.V. In the last few years, Twitter has become a popular platform for sharing opinions, experiences, news, and views in real-time. Twitter presents an interesting opportunity for detecting events happening around the world. The content (tweets) published on Twitter are short and pose diverse challenges for detecting and interpreting event-related information. This article provides insights into ongoing research and helps in understanding recent research trends and techniques used for event detection using Twitter data. We classify techniques and methodologies according to event types, orientation of content, event detection tasks, their evaluation, and common practices. We highlight the limitations of existing techniques and accordingly propose solutions to address the shortcomings. We propose a framework called EDoT based on the research trends, common practices, and techniques used for detecting events on Twitter. EDoT can serve as a guideline for developing event detection methods, especially for researchers who are new in this area. We also describe and compare data collection techniques, the effectiveness and shortcomings of various Twitter and non-Twitter-based features, and discuss various evaluation measures and benchmarking methodologies. Finally, we discuss the trends, limitations, and future directions for detecting events on Twitter