
    Transformative Effects of NDIIPP, the Case of the Henry A. Murray Archive

    This article comprises reflections on the changes to the Henry A. Murray Research Archive catalyzed by involvement in the National Digital Information Infrastructure and Preservation Program (NDIIPP) partnership and by the accompanying introduction of next-generation digital library software. Founded in 1976 at Radcliffe, the Henry A. Murray Research Archive is the endowed, permanent repository for quantitative and qualitative research data at the Institute for Quantitative Social Science at Harvard University. The Murray preserves in perpetuity all types of data of interest to the research community, including numerical data, video, audio, interview notes, and other materials. The center is unique among data archives in the United States in the extent of its holdings in quantitative, qualitative, and mixed quantitative-qualitative research. The Murray took part in an NDIIPP-funded collaboration with four other archival partners, Data-PASS, to identify and acquire data at risk and to jointly develop best practices for shared stewardship, preservation, and exchange of these data. During this time, the Dataverse Network (DVN) software was introduced, facilitating the creation of virtual archives. The combination of institutional collaboration and new technology led the Murray to re-engineer its entire acquisition process; completely rewrite its ingest, dissemination, and other licensing agreements; and adopt a new model for ingest, discovery, access, and presentation of its collections. Through the Data-PASS project, the Murray has acquired a number of important data collections. The resulting changes within the Murray have been dramatic, including a fourfold increase in its overall rate of acquisitions and far more rapid dissemination of acquisitions. Furthermore, the new licensing and processing procedures allow a previously undreamed-of level of interoperability and collaboration with partner archives, facilitating integrated discovery and presentation services and joint stewardship of collections.

    High-resolution computed tomography reconstructions of invertebrate burrow systems

    The architecture of biogenic structures can be highly influential in determining species contributions to major soil and sediment processes, but detailed 3-D characterisations are rare and descriptors of form and complexity are lacking. Here we provide replicate high-resolution micro-focus computed tomography (μ-CT) data for the complete burrow systems of three co-occurring, but functionally contrasting, sediment-dwelling inter-tidal invertebrates assembled alone, and in combination, in representative model aquaria. These data (≤2,000 raw image slices per aquarium, isotropic voxel resolution of 81 μm) provide reference models that can be used for the development of novel structural analysis routines that will be of value within the fields of ecology, pedology, geomorphology, palaeobiology, ichnology and mechanical engineering. We also envisage opportunities for those investigating transport networks, vascular systems, plant rooting systems, or neuron connectivity patterns, and for those developing image analysis or statistics related to pattern or shape recognition. The dataset will allow investigators to develop or test novel methodology and ideas without the need to generate a complete three-dimensional computation of exemplar architecture.
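    A dataset of this kind is typically consumed by reassembling the ordered 2-D image slices into a 3-D volume. Below is a minimal sketch in Python of one way to do that, assuming the slices for a scan are stored as sequentially numbered TIFF files in a single directory; the directory name, file format, and the handling of the 81 μm voxel size are illustrative assumptions rather than details taken from the dataset itself.

        # Minimal sketch: rebuild a 3-D volume from a stack of CT slice images.
        # Assumes numbered TIFF slices in one directory (hypothetical layout).
        from pathlib import Path

        import numpy as np
        import tifffile  # third-party: pip install tifffile

        VOXEL_SIZE_UM = 81.0  # isotropic voxel edge length reported for the dataset

        def load_volume(slice_dir: str) -> np.ndarray:
            """Read slice images in filename order and stack them along the z-axis."""
            slice_paths = sorted(Path(slice_dir).glob("*.tif"))
            if not slice_paths:
                raise FileNotFoundError(f"no .tif slices found in {slice_dir}")
            slices = [tifffile.imread(p) for p in slice_paths]
            return np.stack(slices, axis=0)  # shape: (z, y, x)

        if __name__ == "__main__":
            volume = load_volume("burrow_scan_slices")  # hypothetical directory
            depth_mm = volume.shape[0] * VOXEL_SIZE_UM / 1000.0
            print(f"volume shape {volume.shape}, scanned depth ~ {depth_mm:.1f} mm")

    Stacking in filename order assumes the slices were exported with zero-padded indices; a natural-sort step would be needed otherwise.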

    Theory and Practice of Data Citation

    Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as of directing investments in science. Science is increasingly becoming "data-intensive": large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent, and established way of knowing how it has been used over time, who contributed to its curation, what results it has yielded, or what value it has. The development of a theory and practice of data citation is fundamental for treating data as first-class research objects with the same relevance and centrality as traditional scientific products. Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation. The current panorama is many-faceted, and an overall view that brings together the diverse aspects of this topic is still missing. Therefore, this paper aims to describe the lay of the land for data citation, both from the theoretical (the why and what) and the practical (the how) angle.
    Comment: 24 pages, 2 tables, pre-print accepted in the Journal of the Association for Information Science and Technology (JASIST), 201

    e-Science Infrastructure for the Social Sciences

    When the term "e-Science" became popular, it was frequently taken to mean "enhanced science" or "electronic science". More telling is the definition "e-Science is about global collaboration in key areas of science and the next generation of infrastructure that will enable it" (Taylor, 2001). The question arises: to what extent can the social sciences profit from recent developments in e-Science infrastructure? While computing, storage, and network capacities have so far been sufficient to host and access social science databases, new capacities and technologies support new types of research, e.g. linking and analysing transactional or audio-visual data. Collaborative work by researchers in distributed networks is increasingly well supported, and new resources are available for e-learning. Whether these new developments become transformative or merely helpful will very much depend on whether their full potential is recognized and creatively integrated into new research designs by theoretically innovative scientists. Progress in e-Science has been closely linked to the vision of the Grid as "a software infrastructure that enables flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions and resources" and of virtually unlimited computing capacities (Foster et al., 2000). In the social sciences there has been considerable progress in using modern IT technologies for multilingual access to virtual distributed research databases across Europe and beyond (e.g. NESSTAR, the CESSDA Portal), in data portals for access to statistical offices, and in linking access to data, literature, project, expert, and other databases (e.g. digital libraries, VASCODA/SOWIPORT). Whether future developments will require Grid-enabling of social science databases or can be carried forward with Web 2.0 support is currently an open question. The challenges here are seamless integration and interoperability of databases, a requirement also driven by internationalisation and trans-disciplinary research. This goes along with the need for standards and harmonisation of data and metadata. Progress powered by e-infrastructure depends, among other things, on regulatory frameworks and on human capital well trained in both data science and research methods. It also depends on a sufficient critical mass of institutional infrastructure to efficiently support a dynamic research community that wants to "take the lead without catching up".

    A Model for Data Citation in Astronomical Research using Digital Object Identifiers (DOIs)

    Standardizing and incentivizing the use of digital object identifiers (DOIs) to aggregate and identify both data analyzed and data generated by a research project will advance the field of astronomy to match best practices in other research fields such as the geosciences and medicine. Increased use of DOIs will prepare the discipline for changing expectations among funding agencies and publishers, who increasingly expect accurate and thorough data citation to accompany scientific outputs. The use of DOIs ensures a robust, sustainable, and interoperable approach to data citation in which due credit is given to the researchers and institutions who produce and maintain the primary data. In this work we describe the advantages of DOIs for data citation and best practices for integrating a DOI service into an astronomical archive. We report on a pilot project carried out in collaboration with the AAS Journals. During the course of the 1.5-year pilot, over 75% of submitting authors opted to use the integrated DOI service to clearly identify data analyzed during their research project when prompted at the time of paper submission.
    Comment: 13 pages, 3 figures. Accepted on Dec 19, 2017 for publication in the Astrophysical Journal Supplement Series
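    As a concrete illustration of why DOIs make data citation machine-actionable, the sketch below resolves a dataset DOI through the doi.org proxy and requests a formatted citation via HTTP content negotiation, a service the DOI registration agencies document; the DOI string and citation style used here are placeholders rather than identifiers from the pilot described above.

        # Minimal sketch: turn a dataset DOI into a formatted text citation via
        # doi.org content negotiation. The DOI below is a placeholder, not one
        # minted in the pilot described in the abstract.
        import requests  # third-party: pip install requests

        def citation_for_doi(doi: str, style: str = "apa") -> str:
            """Ask the DOI proxy for a plain-text citation in the given CSL style."""
            response = requests.get(
                f"https://doi.org/{doi}",
                headers={"Accept": f"text/x-bibliography; style={style}"},
                timeout=30,
            )
            response.raise_for_status()
            return response.text.strip()

        if __name__ == "__main__":
            print(citation_for_doi("10.5281/zenodo.1234567"))  # hypothetical DOI

    Because the proxy redirects to the registration agency's metadata service, the same DOI can also be dereferenced to structured metadata by changing the Accept header, which is what makes DOI-based citations both human- and machine-readable.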

    AsterixDB: A Scalable, Open Source BDMS

    AsterixDB is a new, full-function BDMS (Big Data Management System) with a feature set that distinguishes it from other platforms in today's open source Big Data ecosystem. Its features make it well-suited to applications such as web data warehousing, social data storage and analysis, and other Big Data use cases. AsterixDB has a flexible NoSQL-style data model; a query language that supports a wide range of queries; a scalable runtime; partitioned, LSM-based data storage and indexing (including B+-tree, R-tree, and text indexes); support for external as well as natively stored data; a rich set of built-in types; support for fuzzy, spatial, and temporal types and queries; a built-in notion of data feeds for ingestion of data; and transaction support akin to that of a NoSQL store. Development of AsterixDB began in 2009 and led to a mid-2013 initial open source release. This paper is the first complete description of the resulting open source AsterixDB system. Covered herein are the system's data model, its query language, and its software architecture. Also included are a summary of the current status of the project and a first glimpse into how AsterixDB performs when compared to alternative technologies, including a parallel relational DBMS, a popular NoSQL store, and a popular Hadoop-based SQL data analytics platform, for tasks that both AsterixDB and the alternative system can perform. Also included is a brief description of some initial trials that the system has undergone and the lessons learned (and plans laid) based on those early "customer" engagements.
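    For readers unfamiliar with the system, queries are normally submitted to AsterixDB's HTTP query service and results are returned as JSON. The following is a minimal, hedged sketch in Python: the endpoint and port reflect a typical default local deployment, and the dataverse, dataset, and field names in the SQL++ statement are invented for illustration and would need to exist in a real cluster.

        # Minimal sketch: run a SQL++ statement against AsterixDB's HTTP query
        # service. Endpoint/port assume a default local install; the dataverse,
        # dataset, and field names are hypothetical.
        import requests  # third-party: pip install requests

        ASTERIXDB_QUERY_URL = "http://localhost:19002/query/service"  # assumed default

        def run_query(statement: str) -> list:
            """POST a SQL++ statement and return the 'results' array from the reply."""
            response = requests.post(
                ASTERIXDB_QUERY_URL,
                data={"statement": statement},
                timeout=60,
            )
            response.raise_for_status()
            return response.json()["results"]

        if __name__ == "__main__":
            # Hypothetical dataverse and dataset, used purely for illustration.
            rows = run_query(
                "USE SocialData; "
                "SELECT VALUE msg FROM Messages msg WHERE msg.authorId = 42 LIMIT 5;"
            )
            print(rows)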

    Unproceedings of the Fourth .Astronomy Conference (.Astronomy 4), Heidelberg, Germany, July 9-11 2012

    The goal of the .Astronomy conference series is to bring together astronomers, educators, developers and others interested in using the Internet as a medium for astronomy. Attendance at the event is limited to approximately 50 participants, and days are split into mornings of scheduled talks, followed by 'unconference' afternoons, where sessions are defined by participants during the course of the event. Participants in unconference sessions are discouraged from giving formal presentations; discussion, workshop-style formats, and informal practical tutorials are encouraged instead. The conference also designates one day as a 'hack day', in which attendees collaborate in groups on day-long projects for presentation the following morning. These hacks are often a way of concentrating effort, learning new skills, and exploring ideas in a practical fashion. The emphasis on informal, focused interaction makes recording proceedings more difficult than for a normal meeting. While the first .Astronomy conference is preserved formally in a book, more recent iterations are not documented. We therefore, in the spirit of .Astronomy, report 'unproceedings' from .Astronomy 4, which was held in Heidelberg in July 2012.
    Comment: 11 pages, 1 figure, .Astronomy 4, #dotastr