46 research outputs found

Sustaining Collection Value: Managing Collection/Item Metadata Relationships

Author: Dubin David
Palmer Carole L.
Renear Allen H.
Urban Richard J.
Wickett Karen M.
Publication date: 01/06/2008

Many aspects of managing collection/item metadata relationships are critical to sustaining collection value over time. Collection-level metadata not only provides context for finding, understanding, and using the items in the collection, but is often essential to the particular research and scholarly activities the collection is designed to support. Contemporary retrieval systems, which search across collections, usually ignore collection-level metadata. Alternative approaches, informed by collection-level information, will require an understanding of the various kinds of relationships that can obtain between collection-level and item-level metadata. This paper outlines the problem and describes a project that is developing a logic-based framework for classifying collection-level/item-level metadata relationships. This framework will support (i) metadata specification developers defining metadata elements, (ii) metadata librarians describing objects, and (iii) system designers implementing systems that help users take advantage of collection-level metadata. Institute for Museum and Library Services (Grant #LG06070020). Published or submitted for publication; peer reviewed.

Illinois Digital Environment for Access to Learning and Scholarship Repository

Will Formal Preservation Models Require Relative Identity?

Author: Renear Allen H.
Sacchi Simone
Wickett Karen M.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2012

The problem of identifying and re-identifying data puts the notion of "same data" at the very heart of preservation, integration and interoperability, and many other fundamental data curation activities. However, it is also a profoundly challenging notion because the concept of data itself clearly lacks a precise and univocal definition. When science is conducted in small communicating groups with homogeneous data, these ambiguities seldom create problems, and solutions can be negotiated in casual real-time conversations. However, when the data is heterogeneous in encoding, content and management practices, these problems can produce costly inefficiencies and lost opportunities. We consider here the relative identity view, which apparently provides the most natural interpretation of common identity statements about digitally-encoded data. We show how this view conflicts with the curatorial and management practice of "data" objects, in terms of their modeling, and common knowledge representation strategies.

Columbia University Academic Commons

Definitions of Dataset in the Scientific and Technical Literature

Author: Renear Allen H.
Sacchi Simone
Wickett Karen M.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2010

The integration of heterogeneous data in varying formats and from diverse communities requires an improved understanding of the concept of a dataset, and of key related concepts, such as format, encoding, and version. Ultimately, a normative formal framework of such concepts will be needed to support the effective curation, integration, and use of shared multi-disciplinary scientific data. To prepare for the development of this framework we reviewed the definitions of dataset found in technical documentation and the scientific literature. Four basic features can be identified as common to most definitions: grouping, content, relatedness, and purpose. In this summary of our results we describe each of these features, indicating the directions a more formal analysis might take

Columbia University Academic Commons

A Framework for Applying the Concept of Significant Properties to Datasets

Author: Dubin David
Renear Allen H.
Sacchi Simone
Wickett Karen M.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2011

The concept of significant properties, properties that must be identified and preserved in any successful digital object preservation, is now common in data curation. Although this notion has clearly demonstrated its usefulness in cultural heritage domains, its application to the preservation of scientific datasets is not as well developed. One obstacle to this application is that the familiar preservation models are not sufficiently explicit to identify the relevant entities, properties, and relationships involved in dataset preservation. We present a logic-based formal framework of dataset concepts that provides the levels of abstraction necessary to identify and correctly assign significant properties to their appropriate entities. A distinctive feature of this model is that it recognizes that a typed symbol structure is a unique requirement for datasets, but not for other information objects.

Columbia University Academic Commons

One Thing is Missing or Two Things are Confused: An Analysis of OAIS Representation Information.

Author: Dubin David
Renear Allen H.
Sacchi Simone
Wickett Karen M.
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2011

We describe two alternative interpretations of OAIS Representation Information (CCSDS, 2002), and show that both are flawed. The first is insufficient to formalize a model of preservation, and the second leads to category mistakes in conceptualizing the nature of digital artifacts. This analysis is based on earlier work developing a framework for the application of significant properties to datasets (Sacchi et al, 2011)

Columbia University Academic Commons

Fully Digital: Policy and Process Implications for the AAS

Author: A Odlyzko
A Renear
A Warnock
D Osterbrock
M Kurtz
R Netz
R-D Heuer
S Chandrasekhar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 30/04/2012

Over the past two decades, every scholarly publisher has migrated at least the mechanical aspects of their journal publishing so that they utilize digital means. The academy was comfortable with that for a while, but publishers are under increasing pressure to adapt further. At the American Astronomical Society (AAS), we think that means bringing our publishing program to the point of being fully digital, by establishing procedures and policies that regard the digital objects of publication primarily. We have always thought about our electronic journals as databases of digital articles, from which we can publish and syndicate articles one at a time, and we must now put flesh on those bones by developing practices that are consistent with the realities of article-at-a-time publication online. As a learned society that holds the long-term rights to the literature, we have actively taken responsibility for the preservation of the digital assets that constitute our journals, and in so doing we have not forsaken the legacy pre-digital assets. All of us who serve as the long-term stewards of scholarship must begin to evolve into fully digital publishers.

arXiv.org e-Print Archive

Crossref

Foundations of Data Curation: The Pedagogy and Practice of "Purposeful Work" with Research Data

Author: Muñoz Trevor
Palmer Carole
Renear Allen H.
Weber Nicholas M.
Publication venue: Archives Journal
Publication date: 01/01/2013

Increased interest in large-scale, publicly accessible data collections has made data curation critical to the management, preservation, and improvement of research data in the social and natural sciences, as well as the humanities. This paper explicates an approach to data curation education that integrates traditional notions of curation with principles and expertise from library, archival, and computer science. We begin by tracing the emergence of data curation as both a concept and a field of practice related to, but distinct from, both digital curation and data stewardship. This historical account, while far from definitive, considers perspectives from both the sciences and the humanities. Alongside traditional LIS and archival science practices, unique aspects of curation have informed our concept of "purposeful work" with data and, in turn, our pedagogical approach to data curation for the sciences and the humanities.

Illinois Digital Environment for Access to Learning and Scholarship Repository

A Vision for User-Defined Semantic Markup

Author: Iorio Angelo Di
Peroni Silvio
Renear Allen H.
Rice Stanley
Sperberg-McQueen C. M.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 23/09/2019

Typesetting systems, such as LaTeX, permit users to define custom markup and corresponding formatting to simplify authoring, ensure the consistent presentation of domain-specific recurring elements and, potentially, enable further processing, such as the generation of an index of such elements. In XML-based and similar systems, the separation of content and form is also reflected in the processing pipeline: while document authors can define custom markup, they cannot define its semantics. This could be said to be intentional to ensure structural integrity of documents, but at the same time it limits the expressivity of markup. The latter is particularly true for so-called lightweight markup languages like Markdown, which only define very limited sets of generic elements. This vision paper sketches an approach for user-defined semantic markup that could permit authors to define the semantics of elements by formally describing the relations between its constituent parts and to other elements, and to define a formatting intent that would ensure that a default presentation is always available

Crossref

Serveur académique lausannois

Ontologies in Quantitative Biology: A Basis for Comparison, Integration, and Discovery

Author: A. G Murzin
A. H Renear
B. R Zeeberg
C Perez-Iratxeta
C. A Orengo
D. A Hosack
D. R Swanson
E. L Sonnhammer
F Al-Shahrour
G Joshi-Tope
H Ogata
J Schulz
K. D Dahlquist
L Montecchi-Palazzi
L. J Jensen
L. J Lu
Lars J. Jensen
M Campillos
M Selbach
M. E Aranguren
M. V Blagosklonny
N. L Washington
Peer Bork
R Hoehndorf
R. L Tatusov
S Kerrien
S. W Doniger
T Attwood
T. R Gruber
W. R Taylor
Publication venue: Public Library of Science
Publication date: 01/05/2010

As biology is becoming a data-driven discipline, ontologies become increasingly important for systematically capturing the existing knowledge. This essay discusses current trends and how ontologies can also be used for discovery

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Copenhagen University Research Information System

MDC Repository

Theoretical and technological building blocks for an innovation accelerator

The scientific system that we use today was devised centuries ago and is inadequate for our current ICT-based society: the peer review system encourages conservatism, journal publications are monolithic and slow, data is often not available to other scientists, and the independent validation of results is limited. Building on the Innovation Accelerator paper by Helbing and Balietti (2011), this paper takes the initial global vision and reviews the theoretical and technological building blocks that can be used for implementing an innovation (in the first place: science) accelerator platform driven by re-imagining the science system. The envisioned platform would rest on four pillars: (i) Redesign the incentive scheme to reduce behavior such as conservatism, herding and hyping; (ii) Advance scientific publications by breaking up the monolithic paper unit and introducing other building blocks such as data, tools, experiment workflows, resources; (iii) Use machine-readable semantics for publications, debate structures, provenance etc. in order to include the computer as a partner in the scientific process; and (iv) Build an online platform for collaboration, including a network of trust and reputation among the different types of stakeholders in the scientific system: scientists, educators, funding agencies, policy makers, students and industrial innovators among others. Any such improvements to the scientific system must support the entire scientific process (unlike current tools that chop up the scientific process into disconnected pieces), must facilitate and encourage collaboration and interdisciplinarity (again unlike current tools), must facilitate the inclusion of intelligent computing in the scientific process, and must not only facilitate the core scientific process but also accommodate other stakeholders such as science policy makers, industrial innovators, and the general public.

arXiv.org e-Print Archive

Crossref

VU Research Portal

EDP Sciences OAI-PMH repository (1.2.0)

Springer - Publisher Connector

Edinburgh Research Explorer

Leiden University Scholary Publications

The University of Manchester - Institutional Repository

Access to Research at National University of Ireland, Galway