1,646 research outputs found

    Winter is here: summarizing Twitter streams related to pre-scheduled events

    Pre-scheduled events, such as TV shows and sports games, usually garner considerable attention from the public. Twitter captures large volumes of discussions and messages related to these events in real time. Twitter streams related to pre-scheduled events are characterized by the following: (1) spikes in the volume of published tweets reflect the highlights of the event and (2) some of the published tweets make reference to the characters involved in the event, in the context in which they are currently portrayed in a subevent. In this paper, we take advantage of these characteristics to identify the highlights of pre-scheduled events from tweet streams and we demonstrate a method to summarize these highlights. We evaluate our algorithm on tweets collected around two episodes of a popular TV show, Game of Thrones, Season 7.

    Semantics-driven event clustering in Twitter feeds

    Detecting events using social media such as Twitter has many useful applications in real-life situations. Many algorithms, each using different information sources (textual, temporal, geographic or community features), have been developed to achieve this task. Semantic information is often added at the end of the event detection to classify events into semantic topics. But semantic information can also be used to drive the actual event detection, which is less covered by academic research. We therefore supplemented an existing baseline event clustering algorithm with semantic information about the tweets in order to improve its performance. This paper lays out the details of the semantics-driven event clustering algorithms developed, discusses a novel method to aid in the creation of a ground truth for event detection purposes, and analyses how well the algorithms improve over baseline. We find that assigning semantic information to every individual tweet results in worse performance on the F1 measure than the baseline. If, however, semantics are assigned on a coarser, hashtag level, the improvement over baseline is substantial and significant in both precision and recall.
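
For readers unfamiliar with the measures named in this abstract, the following is a minimal sketch of how precision, recall and F1 are conventionally computed from counts of true positives, false positives and false negatives. The function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard definitions; returns 0.0 for any measure whose denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
```

Because F1 is a harmonic mean, a gain in only one of precision or recall moves it less than a gain in both.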

    Storia: Summarizing Social Media Content based on Narrative Theory using Crowdsourcing

    People from all over the world use social media to share thoughts and opinions about events, and understanding what people say through these channels has been of increasing interest to researchers, journalists, and marketers alike. However, while automatically generated summaries enable people to consume large amounts of data efficiently, they do not provide the context needed for a viewer to fully understand an event. Narrative structure can provide templates for the order and manner in which this data is presented to create stories that are oriented around narrative elements rather than summaries made up of facts. In this paper, we use narrative theory as a framework for identifying the links between social media content. To do this, we designed crowdsourcing tasks to generate summaries of events based on commonly used narrative templates. In a controlled study, for certain types of events, people were more emotionally engaged with stories created with narrative structure and were also more likely to recommend them to others compared to summaries created without narrative structure.

    Estimating the spatial distribution of crime events around a football stadium from georeferenced tweets

    Crowd-based events, such as football matches, are considered generators of crime. Criminological research on the influence of football matches has consistently uncovered differences in spatial crime patterns, particularly in the areas around stadia. At the same time, social media data mining research on football matches shows a high volume of data created during football events. This study seeks to build on these two research streams by exploring the spatial relationship between crime events and nearby Twitter activity around a football stadium, and estimating the possible influence of tweets for explaining the presence or absence of crime in the area around a football stadium on match days. Aggregated hourly crime data and geotagged tweets for the same area around the stadium are analysed using exploratory and inferential methods. Spatial clustering, spatial statistics, text mining as well as a hurdle negative binomial logistic regression for spatiotemporal explanations are utilized in our analysis. Findings indicate a statistically significant spatial relationship between three crime types (criminal damage, theft and handling, and violence against the person) and tweet patterns, and that such a relationship can be used to explain future incidents of crime.

    Uncovering nodes that spread information between communities in social networks

    Well-defined community structures have been observed in many datasets gathered from online social networks. A large number of users participate in these networks and the size of the resulting graphs poses computational challenges. There is a particular demand for identifying the nodes responsible for information flow between communities; for example, in temporal Twitter networks edges between communities play a key role in propagating spikes of activity when the connectivity between communities is sparse and few edges exist between different clusters of nodes. The new algorithm proposed here is aimed at revealing these key connections by measuring a node's vicinity to nodes of another community. We look at the nodes which have edges in more than one community and the locality of nodes around them which influence the information received by and broadcast from them. The method relies on independent random walks of a chosen fixed number of steps, originating from nodes with edges in more than one community. For the large networks that we have in mind, existing measures such as betweenness centrality are difficult to compute, even with recent methods that approximate the large number of operations required. We therefore design an algorithm that scales up to the demand of current big data requirements and has the ability to harness parallel processing capabilities. The new algorithm is illustrated on synthetic data, where results can be judged carefully, and also on a real, large-scale Twitter activity dataset, where new insights can be gained.
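
The abstract describes fixed-length random walks started from nodes that have edges into more than one community. The sketch below illustrates that general idea only; it is not the paper's algorithm. The scoring rule (the fraction of walk steps that land in a community other than the start node's), the function names, and the toy graph are all assumptions made for illustration.

```python
import random
from typing import Dict, Hashable, List

Graph = Dict[Hashable, List[Hashable]]


def boundary_nodes(graph: Graph, community: Dict[Hashable, int]) -> List[Hashable]:
    """Nodes with at least one neighbour in a different community."""
    return [
        u for u, nbrs in graph.items()
        if any(community[v] != community[u] for v in nbrs)
    ]


def cross_community_scores(
    graph: Graph,
    community: Dict[Hashable, int],
    steps: int = 3,
    walks: int = 200,
    seed: int = 0,
) -> Dict[Hashable, float]:
    """Assumed score: share of walk steps spent outside the start node's community."""
    rng = random.Random(seed)
    scores: Dict[Hashable, float] = {}
    for start in boundary_nodes(graph, community):
        foreign = 0
        for _ in range(walks):  # walks are independent of one another
            node = start
            for _ in range(steps):
                nbrs = graph[node]
                if not nbrs:
                    break
                node = rng.choice(nbrs)
                if community[node] != community[start]:
                    foreign += 1
        scores[start] = foreign / (walks * steps)
    return scores


# Toy graph (hypothetical): two triangles joined by the single edge c-d.
graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
scores = cross_community_scores(graph, community)
```

Because each walk is independent, the outer loops could be distributed across workers, which is in the spirit of the parallel processing the abstract mentions.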

    TwitInfo: Aggregating and Visualizing Microblogs for Event Exploration

    Microblogs are a tremendous repository of user-generated content about world events. However, for people trying to understand events by querying services like Twitter, a chronological log of posts makes it very difficult to get a detailed understanding of an event. In this paper, we present TwitInfo, a system for visualizing and summarizing events on Twitter. TwitInfo allows users to browse a large collection of tweets using a timeline-based display that highlights peaks of high tweet activity. A novel streaming algorithm automatically discovers these peaks and labels them meaningfully using text from the tweets. Users can drill down to subevents, and explore further via geolocation, sentiment, and popular URLs. We contribute a recall-normalized aggregate sentiment visualization to produce more honest sentiment overviews. An evaluation of the system revealed that users were able to reconstruct meaningful summaries of events in a small amount of time. An interview with a Pulitzer Prize-winning journalist suggested that the system would be especially useful for understanding a long-running event and for identifying eyewitnesses. Quantitatively, our system can identify 80-100% of manually labeled peaks, facilitating a relatively complete view of each event studied.
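
The abstract does not spell out how the streaming peak detector works, so the following is only a generic sketch of one common approach: keep exponentially weighted running estimates of the mean and mean deviation of per-bin tweet counts, and flag a bin when its count exceeds the running mean by more than a chosen multiple of the deviation. The parameters `alpha`, `tau` and `warmup`, and the sample counts, are assumptions for illustration.

```python
from typing import Iterable, List, Optional


def detect_peaks(
    counts: Iterable[int],
    alpha: float = 0.125,  # smoothing factor for the running estimates (assumed)
    tau: float = 3.0,      # how many deviations above the mean count as a peak (assumed)
    warmup: int = 5,       # bins to observe before flagging anything (assumed)
) -> List[int]:
    """Return indices of time bins whose count spikes above the running trend."""
    peaks: List[int] = []
    mean: Optional[float] = None
    dev = 0.0
    for i, c in enumerate(counts):
        if mean is None:
            mean = float(c)  # initialise from the first bin
            continue
        # Test the new bin against estimates built from earlier bins only.
        if i >= warmup and dev > 0 and (c - mean) > tau * dev:
            peaks.append(i)
        # Update the running deviation first, then the running mean.
        dev = (1 - alpha) * dev + alpha * abs(c - mean)
        mean = (1 - alpha) * mean + alpha * c
    return peaks


# Hypothetical tweets-per-minute counts with one burst at index 7.
counts = [10, 12, 11, 9, 10, 11, 10, 60, 12, 10]
peak_bins = detect_peaks(counts)
```

The detector processes one bin at a time and stores only two running values, so it suits a stream. Labelling a detected peak, for example with its most frequent terms, would be a separate step.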

    Video Highlight Prediction Using Audience Chat Reactions

    Sports channel video portals offer an exciting domain for research on multimodal, multilingual analysis. We present methods addressing the problem of automatic video highlight prediction based on joint visual features and textual analysis of the real-world audience discourse with complex slang, in both English and traditional Chinese. We present a novel dataset based on League of Legends championships recorded from North American and Taiwanese Twitch.tv channels (will be released for further research), and demonstrate strong results on these using multimodal, character-level CNN-RNN model architectures. Comment: EMNLP 201

    Twitter Event Summarization Using Phrase Reinforcement Algorithm and NLP Features

    Social networking sites now deliver news to users faster than newspapers and television. Twitter is one of many such sites; it allows a large number of users to share their views and ideas on any particular event. According to a recent survey, 340 million tweets on a variety of topics are sent on Twitter daily, and only 4% of posts contain relevant news data. No human can read all of these posts to extract meaningful information about a specific event. One solution to this problem is to apply summarization techniques. In this paper, we use an algorithm based on frequency counts, together with some NLP features, to summarize an event specified by the user. This automatic summarization algorithm handles the numerous, short, dissimilar, and noisy nature of tweets. We believe our novel approach helps users as well as researchers. DOI: 10.17762/ijritcc2321-8169.15020
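
To make the frequency-count idea concrete, here is a minimal extractive sketch: each tweet is scored by the average collection-wide frequency of its non-stopword terms, and the top-scoring tweets are kept. This is a simplified stand-in, not the Phrase Reinforcement algorithm named in the title and not the paper's NLP features. The stopword list, the tokenizer and the sample tweets are assumptions for illustration.

```python
import re
from collections import Counter
from typing import List

# Tiny illustrative stopword list (assumed, not from the paper).
STOPWORDS = {"the", "a", "an", "is", "are", "to", "of", "and",
             "in", "on", "for", "at", "it", "this", "that", "rt"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep word-like tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z0-9#@']+", text.lower())
            if w not in STOPWORDS]


def summarize(tweets: List[str], k: int = 2) -> List[str]:
    """Return the k tweets whose terms are most frequent across the collection."""
    # Count each term once per tweet so that long tweets do not dominate.
    freq = Counter(w for t in tweets for w in set(tokenize(t)))

    def score(tweet: str) -> float:
        words = tokenize(tweet)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(tweets)), key=lambda i: score(tweets[i]), reverse=True)
    # Emit the selected tweets in their original order.
    return [tweets[i] for i in sorted(ranked[:k])]


# Hypothetical tweets about a single event.
tweets = [
    "Goal for the home team!",
    "What a goal by the home team",
    "I love pizza",
    "Home team scores a goal again",
]
summary = summarize(tweets, k=1)
```

Averaging rather than summing the term frequencies keeps long tweets from winning by length alone, which matters when posts are short and noisy.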