Search CORE

17,194 research outputs found

DARIAH and the Benelux

Author: Backes Marianne
Chambers Sally
Hoogerwerf Maarten
Van der West Jan
Publication venue: Department of Applied Linguistics, Translators and Interpreters, University of Antwerp
Publication date: 01/01/2015
Field of study

Scraping the Social? Issues in live social research

Author: Back L.
Becker H.
Bollier D.
Bowker G. C.
Danowski J. A.
Duhem P. M. M.
Esther Weltevrede
Haraway D.
Knorr-Cetina K.
Latour B.
Latour B.
Latour B.
Marres N.
Marres N.
Moens M. F.
Noortje Marres
Rheinberger H. J.
Rogers R.
Rogers R.
Woolgar S.
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2013
Field of study

What makes scraping methodologically interesting for social and cultural research? This paper seeks to contribute to debates about digital social research by exploring how a ‘medium-specific’ technique for online data capture may be rendered analytically productive for social research. As a device that is currently being imported into social research, scraping has the capacity to re-structure social research, and this in at least two ways. Firstly, as a technique that is not native to social research, scraping risks to introduce ‘alien’ methodological assumptions into social research (such as an pre-occupation with freshness). Secondly, to scrape is to risk importing into our inquiry categories that are prevalent in the social practices enabled by the media: scraping makes available already formatted data for social research. Scraped data, and online social data more generally, tend to come with ‘external’ analytics already built-in. This circumstance is often approached as a ‘problem’ with online data capture, but we propose it may be turned into virtue, insofar as data formats that have currency in the areas under scrutiny may serve as a source of social data themselves. Scraping, we propose, makes it possible to render traffic between the object and process of social research analytically productive. It enables a form of ‘real-time’ social research, in which the formats and life cycles of online data may lend structure to the analytic objects and findings of social research. By way of a conclusion, we demonstrate this point in an exercise of online issue profiling, and more particularly, by relying on Twitter to profile the issue of ‘austerity’. Here we distinguish between two forms of real-time research, those dedicated to monitoring live content (which terms are current?) and those concerned with analysing the liveliness of issues (which topics are happening?)

Goldsmiths Research Online

Crossref

International Migration, Integration and Social Cohesion online publications

The Archives Unleashed Project: Technology, Process, and Community to Improve Scholarly Access to Web Archives

Author: Fritz Samantha
Lin Jimmy
Milligan Ian
Ruest Nick
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/01/2020
Field of study

The Archives Unleashed project aims to improve scholarly access to web archives through a multi-pronged strategy involving tool creation, process modeling, and community building -- all proceeding concurrently in mutually --reinforcing efforts. As we near the end of our initially-conceived three-year project, we report on our progress and share lessons learned along the way. The main contribution articulated in this paper is a process model that decomposes scholarly inquiries into four main activities: filter, extract, aggregate, and visualize. Based on the insight that these activities can be disaggregated across time, space, and tools, it is possible to generate "derivative products", using our Archives Unleashed Toolkit, that serve as useful starting points for scholarly inquiry. Scholars can download these products from the Archives Unleashed Cloud and manipulate them just like any other dataset, thus providing access to web archives without requiring any specialized knowledge. Over the past few years, our platform has processed over a thousand different collections from over two hundred users, totaling around 300 terabytes of web archives.This research was supported by the Andrew W. Mellon Foundation, the Social Sciences and Humanities Research Council of Canada, as well as Start Smart Labs, Compute Canada, the University of Waterloo, and York University. We’d like to thank Jeremy Wiebe, Ryan Deschamps, and Gursimran Singh for their contributions

arXiv.org e-Print Archive

Crossref

YorkSpace

Astrolabe: Curating, Linking and Computing Astronomy's Dark Data

Author: Heidorn P. Bryan
Stahlman Gretchen R.
Steffen Julie
Publication venue: 'American Astronomical Society'
Publication date: 26/02/2018
Field of study

Where appropriate repositories are not available to support all relevant astronomical data products, data can fall into darkness: unseen and unavailable for future reference and re-use. Some data in this category are legacy or old data, but newer datasets are also often uncurated and could remain "dark". This paper provides a description of the design motivation and development of Astrolabe, a cyberinfrastructure project that addresses a set of community recommendations for locating and ensuring the long-term curation of dark or otherwise at-risk data and integrated computing. This paper also describes the outcomes of the series of community workshops that informed creation of Astrolabe. According to participants in these workshops, much astronomical dark data currently exist that are not curated elsewhere, as well as software that can only be executed by a few individuals and therefore becomes unusable because of changes in computing platforms. Astronomical research questions and challenges would be better addressed with integrated data and computational resources that fall outside the scope of existing observatory and space mission projects. As a solution, the design of the Astrolabe system is aimed at developing new resources for management of astronomical data. The project is based in CyVerse cyberinfrastructure technology and is a collaboration between the University of Arizona and the American Astronomical Society. Overall the project aims to support open access to research data by leveraging existing cyberinfrastructure resources and promoting scientific discovery by making potentially-useful data in a computable format broadly available to the astronomical community.Comment: Accepted for publication in the Astrophysical Journal Supplement Series, 22 pages, 2 figure

arXiv.org e-Print Archive

Crossref

The University of Arizona

Survey and Analysis of Production Distributed Computing Infrastructures

Author: Jha Shantenu
Katz Daniel S.
Parashar Manish
Rana Omer
Weissman Jon
Publication venue
Publication date: 13/08/2012
Field of study

This report has two objectives. First, we describe a set of the production distributed infrastructures currently available, so that the reader has a basic understanding of them. This includes explaining why each infrastructure was created and made available and how it has succeeded and failed. The set is not complete, but we believe it is representative. Second, we describe the infrastructures in terms of their use, which is a combination of how they were designed to be used and how users have found ways to use them. Applications are often designed and created with specific infrastructures in mind, with both an appreciation of the existing capabilities provided by those infrastructures and an anticipation of their future capabilities. Here, the infrastructures we discuss were often designed and created with specific applications in mind, or at least specific types of applications. The reader should understand how the interplay between the infrastructure providers and the users leads to such usages, which we call usage modalities. These usage modalities are really abstractions that exist between the infrastructures and the applications; they influence the infrastructures by representing the applications, and they influence the ap- plications by representing the infrastructures

arXiv.org e-Print Archive

FigShare

Century-of-Information Research – a Strategy for Research and Innovation in the Century of Information

Author: Atkinson M. P.
Britton D
De Roure DE
Garnett N
Geddes N
Gurney R
Haines K
Hughes L
Jeffreys P
Perrott R
Procter RN
Trefethen A. E.
Publication venue: No publisher name
Publication date: 01/01/2008
Field of study

The University of Manchester - Institutional Repository