Search CORE

4,582 research outputs found

Social media analytics: a survey of techniques, tools and platforms

Author: Batrinca B
Treleaven PC
Publication venue
Publication date: 01/02/2015
Field of study

This paper is written for (social science) researchers seeking to analyze the wealth of social media now available. It presents a comprehensive review of software tools for social networking media, wikis, really simple syndication feeds, blogs, newsgroups, chat and news feeds. For completeness, it also includes introductions to social media scraping, storage, data cleaning and sentiment analysis. Although principally a review, the paper also provides a methodology and a critique of social media tools. Analyzing social media, in particular Twitter feeds for sentiment analysis, has become a major research and business activity due to the availability of web-based application programming interfaces (APIs) provided by Twitter, Facebook and News services. This has led to an ‘explosion’ of data services, software tools for scraping and analysis and social media analytics platforms. It is also a research area undergoing rapid change and evolution due to commercial pressures and the potential for using social media data for computational (social science) research. Using a simple taxonomy, this paper provides a review of leading software tools and how to use them to scrape, cleanse and analyze the spectrum of social media. In addition, it discussed the requirement of an experimental computational environment for social media research and presents as an illustration the system architecture of a social media (analytics) platform built by University College London. The principal contribution of this paper is to provide an overview (including code fragments) for scientists seeking to utilize social media scraping and analytics either in their research or business. The data retrieval techniques that are presented in this paper are valid at the time of writing this paper (June 2014), but they are subject to change since social media data scraping APIs are rapidly changing

UCL Discovery

The Road Towards Reproducibility in Science: The Case of Data Citation

Author: Ferro Nicola
Silvello Gianmaria
Publication venue
Publication date: 01/01/2017
Field of study

Archivio istituzionale della ricerca - Università di Padova

Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

Author: Alam Mansaf
Ali Syed Arshad
Khan Samiya
Liu Xiufeng
Publication venue
Publication date: 01/01/2019
Field of study

Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

arXiv.org e-Print Archive

Online Research Database In Technology

The LAB@FUTURE Project - Moving Towards the Future of E-Learning

Author: Baudin Veronique
Faust Martin
Kaufmann Hannes
Litsa Vivian
Mwanza Daisy
Pierre Arnaud
Totter Alexandra
Publication venue
Publication date: 22/08/2004
Field of study

This paper presents Lab@Future, an advanced e-learning platform that uses novel Information and Communication Technologies to support and expand laboratory teaching practices. For this purpose, Lab@Future uses real and computer-generated objects that are interfaced using mechatronic systems, augmented reality, mobile technologies and 3D multi user environments. The main aim is to develop and demonstrate technological support for practical experiments in the following focused subjects namely: Fluid Dynamics - Science subject in Germany, Geometry - Mathematics subject in Austria, History and Environmental Awareness â€“ Arts and Humanities subjects in Greece and Slovenia. In order to pedagogically enhance the design and functional aspects of this e-learning technology, we are investigating the dialogical operationalisation of learning theories so as to leverage our understanding of teaching and learning practices in the targeted context of deployment

Crossref

Open Research Online (The Open University)

The University of Manchester - Institutional Repository

PRODUCT LIFECYCLE DATA SHARING AND VISUALISATION: WEB-BASED APPROACHES

Author: DS Kelly
Enrico Vezzetti
KP Cheng
M Salmela
P Figueroa
R Wagner
S Tornincasa
WD Li
Y Kim
Publication venue: Springer
Publication date: 01/01/2008
Field of study

Both product design and manufacturing are intrinsically collaborative processes. From conception and design to project completion and ongoing maintenance, all points in the lifecycle of any product involve the work of fluctuating teams of designers, suppliers and customers. That is why companies are involved in the creation of a distributed design and a manufacturing environment which could provide an effective way to communicate and share information throughout the entire enterprise and the supply chain. At present, the technologies that support such a strategy are based on World Wide Web platforms and follow two different paths. The first one focuses on 2D documentation improvement and introduces 3D interactive information in order to add knowledge to drawings. The second one works directly on 3D models and tries to extend the life of 3D data moving these design information downstream through the entire product lifecycle. Unfortunately the actual lack of a unique 3D Web-based standard has stimulated the growing up of many different proprietary and open source standards and, as a consequence, a production of an incompatible information exchange over the WEB. This paper proposes a structured analysis of Web-based solutions, trying to identify the most critical aspects to promote a unique 3D digital standard model capable of sharing product and manufacturing data more effectively—regardless of geographic boundaries, data structures, processes or computing environmen

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Online Analysis of Dynamic Streaming Data

Author: Kühn Eileen
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2018
Field of study

Die Arbeit zum Thema "Online Analysis of Dynamic Streaming Data" beschäftigt sich mit der Distanzmessung dynamischer, semistrukturierter Daten in kontinuierlichen Datenströmen um Analysen auf diesen Datenstrukturen bereits zur Laufzeit zu ermöglichen. Hierzu wird eine Formalisierung zur Distanzberechnung für statische und dynamische Bäume eingeführt und durch eine explizite Betrachtung der Dynamik von Attributen einzelner Knoten der Bäume ergänzt. Die Echtzeitanalyse basierend auf der Distanzmessung wird durch ein dichte-basiertes Clustering ergänzt, um eine Anwendung des Clustering, einer Klassifikation, aber auch einer Anomalieerkennung zu demonstrieren. Die Ergebnisse dieser Arbeit basieren auf einer theoretischen Analyse der eingeführten Formalisierung von Distanzmessungen für dynamische Bäume. Diese Analysen werden unterlegt mit empirischen Messungen auf Basis von Monitoring-Daten von Batchjobs aus dem Batchsystem des GridKa Daten- und Rechenzentrums. Die Evaluation der vorgeschlagenen Formalisierung sowie der darauf aufbauenden Echtzeitanalysemethoden zeigen die Effizienz und Skalierbarkeit des Verfahrens. Zudem wird gezeigt, dass die Betrachtung von Attributen und Attribut-Statistiken von besonderer Bedeutung für die Qualität der Ergebnisse von Analysen dynamischer, semistrukturierter Daten ist. Außerdem zeigt die Evaluation, dass die Qualität der Ergebnisse durch eine unabhängige Kombination mehrerer Distanzen weiter verbessert werden kann. Insbesondere wird durch die Ergebnisse dieser Arbeit die Analyse sich über die Zeit verändernder Daten ermöglicht

KITopen

Computer Science's Digest Volume 1

Author: Castillo Juan C.
Guerrero Jairo
Insuasti Jesús
Publication venue: Editorial Universitaria
Publication date: 01/01/2015
Field of study

This series is dedicated to the students of the Systems Department, to give them reading material related to computer science in a second language. This book covers the Introduction to Computer Science, Computer Communications, Networking and Web Applications

Universidad de Nariño

Theory and Practice of Data Citation

Author: Silvello Gianmaria
Publication venue: 'Wiley'
Publication date: 24/06/2017
Field of study

Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has. The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality of traditional scientific products. Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining the principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation. The current panorama is many-faceted and an overall view that brings together diverse aspects of this topic is still missing. Therefore, this paper aims to describe the lay of the land for data citation, both from the theoretical (the why and what) and the practical (the how) angle.Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association for Information Science and Technology (JASIST), 201

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Università di Padova