Theory and Practice of Data Citation
Citations are the cornerstone of knowledge propagation and the primary means
of assessing the quality of research, as well as directing investments in
science. Science is increasingly becoming "data-intensive", where large volumes
of data are collected and analyzed to discover complex patterns through
simulations and experiments, and most scientific reference works have been
replaced by online curated datasets. Yet, given a dataset, there is no
quantitative, consistent and established way of knowing how it has been used
over time, who contributed to its curation, what results have been yielded or
what value it has.
The development of a theory and practice of data citation is fundamental for
considering data as first-class research objects with the same relevance and
centrality as traditional scientific products. Many works in recent years have
discussed data citation from different viewpoints: illustrating why data
citation is needed, defining the principles and outlining recommendations for
data citation systems, and providing computational methods for addressing
specific issues of data citation.
The current panorama is many-faceted and an overall view that brings together
diverse aspects of this topic is still missing. Therefore, this paper aims to
describe the lay of the land for data citation, both from the theoretical (the
why and what) and the practical (the how) angle.
Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association
for Information Science and Technology (JASIST), 201
A visual exploration workflow as enabler for the exploitation of Linked Open Data
Semantically annotating and interlinking Open Data results in Linked Open Data, which concisely and unambiguously describes a knowledge domain. However, the uptake of Linked Data depends on its usefulness to non-Semantic Web experts. Failing to help data consumers understand the added value of Linked Data and the possible exploitation opportunities could inhibit its diffusion. In this paper, we propose an interactive visual workflow for discovering and exploring Linked Open Data. We implemented the workflow considering academic library metadata and carried out a qualitative evaluation. We assessed the workflow's potential impact on data consumers, which bridges the offer (published Linked Open Data) and the demand, expressed as requests for (i) higher quality data and (ii) more applications that re-use data. More than 70% of the 34 test users agreed that the workflow fulfills its goal: it helps non-Semantic Web experts understand the potential of Linked Open Data.
Hypermedia-based discovery for source selection using low-cost linked data interfaces
Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often only marginally discussed, even though it has a strong impact on selecting sources that contribute to the query results. Therefore, the authors introduce a discovery approach for Linked Data interfaces based on hypermedia links and controls, and apply it to federated query execution with Triple Pattern Fragments. In addition, the authors identify quantitative metrics to evaluate this discovery approach. This article describes generic evaluation measures and results for their concrete approach. With low-cost data summaries as seed, interfaces to eight large real-world datasets can discover each other within 7 minutes. Hypermedia-based client-side querying shows a promising gain of up to 50% in execution time, but demands algorithms that visit a higher number of interfaces to improve result completeness.
Discovering Scholarly Orphans Using ORCID
Archival efforts such as (C)LOCKSS and Portico are in place to ensure the
longevity of traditional scholarly resources like journal articles. At the same
time, researchers are depositing a broad variety of other scholarly artifacts
into emerging online portals that are designed to support web-based
scholarship. These web-native scholarly objects are largely neglected by
current archival practices and hence they become scholarly orphans. We
therefore argue for a novel paradigm that is tailored towards archiving these
scholarly orphans. We are investigating the feasibility of using Open
Researcher and Contributor ID (ORCID) as a supporting infrastructure for the
process of discovery of web identities and scholarly orphans for active
researchers. We analyze ORCID in terms of coverage of researchers, subjects,
and location and assess the richness of its profiles in terms of web identities
and scholarly artifacts. We find that ORCID currently falls short in all considered
aspects and hence can only be considered in conjunction with other discovery
sources. However, ORCID is growing fast, so there is potential that it could
achieve a satisfactory level of coverage and richness in the near future.
Comment: 10 pages, 5 figures, 5 tables accepted for publication at JCDL 201
From Artifacts to Aggregations: Modeling Scientific Life Cycles on the Semantic Web
In the process of scientific research, many information objects are
generated, all of which may remain valuable indefinitely. However, artifacts
such as instrument data and associated calibration information may have little
value in isolation; their meaning is derived from their relationships to each
other. Individual artifacts are best represented as components of a life cycle
that is specific to a scientific research domain or project. Current cataloging
practices do not describe objects at a sufficient level of granularity nor do
they offer the globally persistent identifiers necessary to discover and manage
scholarly products with World Wide Web standards. The Open Archives
Initiative's Object Reuse and Exchange data model (OAI-ORE) meets these
requirements. We demonstrate a conceptual implementation of OAI-ORE to
represent the scientific life cycles of embedded networked sensor applications
in seismology and environmental sciences. By establishing relationships between
publications, data, and contextual research information, we illustrate how to
obtain a richer and more realistic view of scientific practices. That view can
facilitate new forms of scientific research and learning. Our analysis is
framed by studies of scientific practices in a large, multi-disciplinary,
multi-university science and engineering research center, the Center for
Embedded Networked Sensing (CENS).
Comment: 28 pages. To appear in the Journal of the American Society for
Information Science and Technology (JASIST
How are topics born? Understanding the research dynamics preceding the emergence of new areas
The ability to promptly recognise new research trends is strategic for many stakeholders, including universities, institutional funding bodies, academic publishers and companies. While the literature describes several approaches which aim to identify the emergence of new research topics early in their lifecycle, these rely on the assumption that the topic in question is already associated with a number of publications and consistently referred to by a community of researchers. Hence, detecting the emergence of a new research area at an embryonic stage, i.e., before the topic has been consistently labelled by a community of researchers and associated with a number of publications, is still an open challenge. In this paper, we begin to address this challenge by performing a study of the dynamics preceding the creation of new topics. This study indicates that the emergence of a new topic is anticipated by a significant increase in the pace of collaboration between relevant research areas, which can be seen as the ‘parents’ of the new topic. These initial findings (i) confirm our hypothesis that it is possible in principle to detect the emergence of a new topic at the embryonic stage, (ii) provide new empirical evidence supporting relevant theories in Philosophy of Science, and also (iii) suggest that new topics tend to emerge in an environment in which weakly interconnected research areas begin to cross-fertilise.