482 research outputs found

    Social media analytics: a survey of techniques, tools and platforms

    This paper is written for (social science) researchers seeking to analyze the wealth of social media now available. It presents a comprehensive review of software tools for social networking media, wikis, really simple syndication feeds, blogs, newsgroups, chat and news feeds. For completeness, it also includes introductions to social media scraping, storage, data cleaning and sentiment analysis. Although principally a review, the paper also provides a methodology and a critique of social media tools. Analyzing social media, in particular Twitter feeds for sentiment analysis, has become a major research and business activity due to the availability of web-based application programming interfaces (APIs) provided by Twitter, Facebook and news services. This has led to an ‘explosion’ of data services, software tools for scraping and analysis, and social media analytics platforms. It is also a research area undergoing rapid change and evolution due to commercial pressures and the potential for using social media data for computational (social science) research. Using a simple taxonomy, this paper provides a review of leading software tools and how to use them to scrape, cleanse and analyze the spectrum of social media. In addition, it discusses the requirements of an experimental computational environment for social media research and presents, as an illustration, the system architecture of a social media (analytics) platform built by University College London. The principal contribution of this paper is to provide an overview (including code fragments) for scientists seeking to utilize social media scraping and analytics in either their research or their business. The data retrieval techniques presented in this paper are valid at the time of writing (June 2014), but they are subject to change since social media data scraping APIs are rapidly changing.

    TwitInfo: Aggregating and Visualizing Microblogs for Event Exploration

    Microblogs are a tremendous repository of user-generated content about world events. However, for people trying to understand events by querying services like Twitter, a chronological log of posts makes it very difficult to get a detailed understanding of an event. In this paper, we present TwitInfo, a system for visualizing and summarizing events on Twitter. TwitInfo allows users to browse a large collection of tweets using a timeline-based display that highlights peaks of high tweet activity. A novel streaming algorithm automatically discovers these peaks and labels them meaningfully using text from the tweets. Users can drill down to subevents, and explore further via geolocation, sentiment, and popular URLs. We contribute a recall-normalized aggregate sentiment visualization to produce more honest sentiment overviews. An evaluation of the system revealed that users were able to reconstruct meaningful summaries of events in a small amount of time. An interview with a Pulitzer Prize-winning journalist suggested that the system would be especially useful for understanding a long-running event and for identifying eyewitnesses. Quantitatively, our system can identify 80-100% of manually labeled peaks, facilitating a relatively complete view of each event studied.
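
The abstract above mentions a recall-normalized aggregate sentiment view without giving details. The following is a minimal sketch of one plausible reading, assuming that raw per-class counts are divided by the classifier's recall for that class before the proportions are computed. The function name `recall_normalized_shares` and the recall figures in the usage note are hypothetical, not taken from the paper.

```python
def recall_normalized_shares(pos_count, neg_count, pos_recall, neg_recall):
    """Return (positive_share, negative_share) after recall correction.

    Assumption: a class the classifier under-detects (low recall) is
    under-counted, so each raw count is scaled by 1 / recall before the
    shares are computed.
    """
    if not (0 < pos_recall <= 1 and 0 < neg_recall <= 1):
        raise ValueError("recall values must lie in (0, 1]")

    # Estimate how many tweets of each class there really were.
    adjusted_pos = pos_count / pos_recall
    adjusted_neg = neg_count / neg_recall

    total = adjusted_pos + adjusted_neg
    if total == 0:
        return 0.0, 0.0
    return adjusted_pos / total, adjusted_neg / total
```

For example, with 50 tweets labeled positive and 50 labeled negative, a hypothetical positive recall of 0.5 and negative recall of 1.0 give adjusted counts of 100 and 50, so the displayed split becomes two thirds positive rather than an even 50/50.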

    Sentiment analysis in geo social streams by using machine learning technique

    Dissertation submitted in partial fulfilment of the requirements for the degree of Master of Science in Geospatial Technologies. Massive amounts of sentiment-rich data are generated on social media in the form of tweets, status updates, blog posts, reviews, etc. Different people and organizations use this user-generated content for decision making. Symbolic (knowledge-based) approaches and machine learning techniques are the two main techniques used to analyze sentiment in text. The rapid increase in the volume of sentiment-rich data on the web has resulted in increased interaction among researchers regarding sentiment analysis and opinion (Kaushik & Mishra, 2014). However, limited research has been conducted that considers location as another dimension alongside the sentiment-rich data. In this work, we analyze the sentiments of Geotweets (tweets containing latitude and longitude coordinates) and visualize the results on a map in real time. We collect tweets from Twitter using its Streaming API, filtered by English language and location (bounding box). Tweets that lack geographic coordinates are geocoded using the geocoder from GeoPy. TextBlob, an open source Python library, was used to calculate the sentiments of Geotweets. Map visualization was implemented using Leaflet, with plugins for clusters, heat maps and real-time updates. The visualization gives insight into the sentiment of locations.
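
The collection and scoring steps described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the dissertation's code: the `Tweet` record, the (west, south, east, north) box layout, and `toy_polarity` are hypothetical stand-ins. In a real pipeline the polarity function could be swapped for TextBlob's `TextBlob(text).sentiment.polarity`, and the geocoding of tweets without coordinates (done with GeoPy in the abstract) is omitted here.

```python
from typing import Callable, Iterable, Iterator, NamedTuple, Optional, Tuple


class Tweet(NamedTuple):
    """Hypothetical minimal tweet record; coordinates may be missing."""
    text: str
    lon: Optional[float]
    lat: Optional[float]


# Assumed box layout: (west, south, east, north) in decimal degrees.
Box = Tuple[float, float, float, float]


def in_bounding_box(lon: float, lat: float, box: Box) -> bool:
    """True when the point lies inside the box (edges included)."""
    west, south, east, north = box
    return west <= lon <= east and south <= lat <= north


def label(polarity: float) -> str:
    """Map a signed polarity score to a coarse sentiment label."""
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"


def toy_polarity(text: str) -> float:
    """Stand-in scorer: positive word count minus negative word count."""
    positive = {"great", "good", "love", "happy"}
    negative = {"terrible", "bad", "hate", "sad"}
    words = text.lower().split()
    return float(sum(w in positive for w in words)
                 - sum(w in negative for w in words))


def score_geotweets(
    tweets: Iterable[Tweet],
    box: Box,
    polarity_fn: Callable[[str], float],
) -> Iterator[Tuple[float, float, str]]:
    """Yield (lon, lat, label) for tweets with coordinates inside the box."""
    for tweet in tweets:
        if tweet.lon is None or tweet.lat is None:
            continue  # would be geocoded in the full pipeline
        if in_bounding_box(tweet.lon, tweet.lat, box):
            yield tweet.lon, tweet.lat, label(polarity_fn(tweet.text))
```

Each yielded triple is the kind of point a map layer could plot, with the label driving marker colour or heat map weight.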

    Processing and visualizing the data in tweets

    Microblogs such as Twitter provide a valuable stream of diverse user-generated data. While the data extracted from Twitter is generally timely and accurate, the process by which developers extract structured data from the tweet stream is ad hoc and requires reimplementation of common data manipulation primitives. In this paper, we present two systems for querying and extracting structure from Twitter-embedded data. The first, TweeQL, provides a streaming SQL-like interface to the Twitter API, making common tweet processing tasks simpler. The second, TwitInfo, shows how end-users can interact with and understand aggregated data from the tweet stream, in addition to showcasing the power of the TweeQL language. Together these systems show the richness of content that can be extracted from Twitter.

    CrimeTelescope: crime hotspot prediction based on urban and social media data fusion

    Crime is a complex social issue impacting a considerable number of individuals within a society. Preventing and reducing crime is a top priority in many countries. Given limited policing and crime reduction resources, it is often crucial to identify effective strategies to deploy the available resources. Towards this goal, crime hotspot prediction has previously been suggested. Crime hotspot prediction leverages past data in order to identify geographical areas where crimes are likely to occur in the future. However, most existing techniques in crime hotspot prediction use only historical crime records to identify crime hotspots, ignoring the predictive power of other data such as urban or social media data. In this paper, we propose CrimeTelescope, a platform that predicts and visualizes crime hotspots based on a fusion of different data types. Our platform continuously collects crime data as well as urban and social media data on the Web. It then extracts key features from the collected data based on both statistical and linguistic analysis. Finally, it identifies crime hotspots by leveraging the extracted features, and offers visualizations of the hotspots on an interactive map. Based on real-world data collected from New York City, we show that combining different types of data can effectively improve crime hotspot prediction accuracy (by up to 5.2%) compared to classical approaches based on historical crime records only. In addition, we demonstrate the usability of our platform through a System Usability Scale (SUS) survey on a full prototype of CrimeTelescope.