
    Social media analytics: a survey of techniques, tools and platforms

    This paper is written for (social science) researchers seeking to analyze the wealth of social media now available. It presents a comprehensive review of software tools for social networking media, wikis, really simple syndication feeds, blogs, newsgroups, chat and news feeds. For completeness, it also includes introductions to social media scraping, storage, data cleaning and sentiment analysis. Although principally a review, the paper also provides a methodology and a critique of social media tools. Analyzing social media, in particular Twitter feeds for sentiment analysis, has become a major research and business activity due to the availability of web-based application programming interfaces (APIs) provided by Twitter, Facebook and News services. This has led to an ‘explosion’ of data services, software tools for scraping and analysis, and social media analytics platforms. It is also a research area undergoing rapid change and evolution due to commercial pressures and the potential for using social media data for computational (social science) research. Using a simple taxonomy, this paper provides a review of leading software tools and how to use them to scrape, cleanse and analyze the spectrum of social media. In addition, it discusses the requirements of an experimental computational environment for social media research and presents as an illustration the system architecture of a social media (analytics) platform built by University College London. The principal contribution of this paper is to provide an overview (including code fragments) for scientists seeking to utilize social media scraping and analytics either in their research or business. The data retrieval techniques presented in this paper are valid at the time of writing (June 2014), but they are subject to change, since social media data scraping APIs are rapidly changing.
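    As a hedged illustration of the kind of code fragment the survey provides, the sketch below scores tweet-like text with a tiny hand-made sentiment lexicon. The word lists and the scoring rule are illustrative assumptions, not the method of any specific tool the paper reviews.

```python
# Minimal lexicon-based sentiment scorer, in the spirit of the
# sentiment-analysis tools the survey discusses. The tiny word
# lists here are illustrative assumptions, not a published lexicon.

POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "poor", "sad"}

def sentiment_score(text):
    """Return (#positive - #negative) / #tokens for a piece of text."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    if not tokens:
        return 0.0
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return (pos - neg) / len(tokens)

print(sentiment_score("I love this great phone"))        # positive score
print(sentiment_score("terrible battery, bad screen"))   # negative score
```

    Production tools replace the hand-made lexicon with large curated word lists or trained classifiers, but the pipeline shape (tokenise, look up, aggregate) is the same.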

    Crowd-sourced Photographic Content for Urban Recreational Route Planning

    Routing services are able to provide travel directions for users of all modes of transport. Most of them focus on functional journeys (i.e. journeys linking a given origin and destination with minimum cost) while paying less attention to recreational trips, in particular leisure walks in an urban context. These walks are additionally constrained by time or distance, and as their purpose is the process of walking itself, the attractiveness of the areas passed through can be an important factor in route selection. This factor is hard to formalise and requires a reliable source of information covering the entire street network. Previous research shows that crowd-sourced data available from photo-sharing services has the potential to serve as a measure of space attractiveness, and thus to become the basis for a routing system that suggests leisure walks; ongoing PhD research aims to build such a system. This paper demonstrates findings on four investigated data sources (Flickr, Panoramio, Picasa and Geograph) in Central London and discusses the requirements for the algorithm that is going to be implemented in the second half of this PhD research. Visual analytics was chosen as a method for understanding and comparing the obtained datasets, which contain hundreds of thousands of records. Interactive software was developed to identify a number of problems, as well as to estimate the suitability of the sources in general. It was concluded that Picasa and Geograph have problems that make them less suitable for further research, while Panoramio and Flickr require filtering to remove photographs that do not contribute to the understanding of local attractiveness. Based on this analysis, a number of filtering methods were proposed in order to improve the quality of the datasets and thus provide a more reliable measure to support urban recreational routing.
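    One simple way to turn geotagged photo records into an attractiveness measure, of the kind the abstract describes, is to bin photo coordinates into a regular grid and count photos per cell. The cell size and the toy coordinates below are illustrative assumptions, not the thesis's actual method or data.

```python
# Sketch: grid-binning geotagged photos as a proxy for area
# attractiveness. Cell size and sample points are assumptions.

from collections import Counter

def photo_density(photos, cell_size=0.001):
    """Bin (lat, lon) photo coordinates into a regular grid and
    count photos per cell; dense cells suggest attractive areas."""
    grid = Counter()
    for lat, lon in photos:
        cell = (int(lat / cell_size), int(lon / cell_size))
        grid[cell] += 1
    return grid

# Hypothetical geotagged photos around Central London.
photos = [(51.5074, -0.1278), (51.5075, -0.1279), (51.5200, -0.1000)]
density = photo_density(photos)
print(density.most_common(1))  # the densest cell and its photo count
```

    A routing algorithm could then weight street segments by the density of the cells they cross, after the filtering steps the paper proposes have removed non-contributing photographs.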

    Intelligent Management and Efficient Operation of Big Data

    This chapter details how Big Data can be used and implemented in networking and computing infrastructures. Specifically, it addresses three main aspects: the timely extraction of relevant knowledge from heterogeneous, and very often unstructured, large data sources; the enhancement of the performance of processing and networking (cloud) infrastructures, which are the most important foundational pillars of Big Data applications and services; and novel ways to efficiently manage network infrastructures with high-level composed policies for supporting the transmission of large amounts of data with distinct requirements (video vs. non-video). A case study involving an intelligent management solution to route data traffic with diverse requirements in a wide area Internet Exchange Point is presented, discussed in the context of Big Data, and evaluated. Comment: In book Handbook of Research on Trends and Future Directions in Big Data and Web Intelligence, IGI Global, 201
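    The idea of high-level composed policies that treat video and non-video traffic differently can be sketched as a small classification table mapping application types to forwarding behaviour. The policy names, application types and path labels below are illustrative assumptions, not the authors' rule set.

```python
# Sketch: policy-based treatment of flows with distinct requirements
# (video vs. non-video). All names and thresholds are assumptions.

POLICIES = {
    "video":     {"path": "low-latency", "min_bandwidth_mbps": 25},
    "non-video": {"path": "best-effort", "min_bandwidth_mbps": 1},
}

VIDEO_APPS = {"streaming", "conference"}

def route_flow(app_type):
    """Return the forwarding policy for a flow, defaulting to best effort."""
    klass = "video" if app_type in VIDEO_APPS else "non-video"
    return POLICIES[klass]

print(route_flow("streaming"))  # video flows get the low-latency path
print(route_flow("email"))      # everything else is best effort
```

    A real deployment would express such rules in a policy language and push them to switches or routers; the table above only illustrates the classify-then-forward structure.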

    Social information landscapes: automated mapping of large multimodal, longitudinal social networks

    Purpose – This article presents a Big Data solution as a methodological approach to the automated collection, cleaning, collation and mapping of multimodal, longitudinal datasets from social media. The article constructs Social Information Landscapes. Design/methodology/approach – The research presented here adopts a Big Data methodological approach for mapping user-generated content in social media. The methodology and algorithms presented are generic, and can be applied to diverse types of social media or user-generated content involving user interactions, such as blogs, comments on product pages and other forms of media, so long as the formal data structure proposed here can be constructed. Findings – The limited, sequential presentation of content listings within social media and Web 2.0 pages, as viewed in Web browsers or on mobile devices, does not necessarily reveal or make obvious a hidden property of the medium: that every participant, from content producers to consumers, followers and subscribers, together with the content they produce or subscribe to, is intrinsically connected in a hidden but massive network. Such networks, when mapped, can be quantitatively analysed using social network analysis (e.g., centralities), and the semantics and sentiments can equally reveal valuable information with appropriate analytics. The difficulty lies in the traditional approach to collecting, cleaning, collating and mapping such datasets into a sample large enough to yield important insights into community structure and the direction and polarity of interaction on diverse topics. This research solves this particular strand of the problem. Research limitations/implications – The automated mapping of extremely large networks involving hundreds of thousands to millions of nodes over a long period of time could assist in proving or even disproving theories. The goal of this article is to demonstrate the feasibility of using automated approaches for acquiring massive, connected datasets for academic inquiry in the social sciences. Practical implications – The methods presented in this article, and the Big Data architecture presented here, have great practical value to individuals and institutions with low budgets. The software-hardware integrated architecture uses open-source software, and the social information landscapes mapping algorithms are not difficult to implement. Originality/value – The majority of research in the literature uses a traditional approach to collecting social network data. The traditional approach is slow, tedious and does not yield a sample large enough for the data to be significant for analysis. Whilst the traditional approach collects only a small percentage of the data, the methods presented here could possibly collect entire datasets from social media, owing to their scalability and automated mapping techniques.
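    The core idea of mapping user interactions into a network that can then be analysed with centrality measures can be sketched in a few lines. The toy interaction list is an illustrative assumption; the article's pipeline operates on far larger, automatically collected datasets.

```python
# Sketch: build a directed interaction network (e.g. replies or
# subscriptions) and compute a simple in-degree centrality.
# The sample interactions are illustrative assumptions.

from collections import defaultdict

def build_network(interactions):
    """interactions: (source_user, target_user) pairs.
    Returns an adjacency mapping: user -> set of targets."""
    adj = defaultdict(set)
    for src, dst in interactions:
        adj[src].add(dst)
    return adj

def in_degree_centrality(adj):
    """Count incoming edges per user: how often others interact with them."""
    indeg = defaultdict(int)
    for targets in adj.values():
        for dst in targets:
            indeg[dst] += 1
    return dict(indeg)

interactions = [("alice", "bob"), ("carol", "bob"), ("bob", "alice")]
print(in_degree_centrality(build_network(interactions)))  # bob has in-degree 2
```

    At the scale the article targets (hundreds of thousands to millions of nodes), the same adjacency structure would live in a distributed store, but the mapping step is conceptually identical.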