Science Models as Value-Added Services for Scholarly Information Systems
The paper introduces scholarly Information Retrieval (IR) as a further
dimension that should be considered in the science modeling debate. The IR use
case is seen as a validation model of the adequacy of science models in
representing and predicting structure and dynamics in science. Particular
conceptualizations of scholarly activity and structures in science are used as
value-added search services to improve retrieval quality: a co-word model
depicting the cognitive structure of a field (used for query expansion), the
Bradford law of information concentration, and a model of co-authorship
networks (both used for re-ranking search results). An evaluation of
retrieval quality with science-model-driven services in use showed that
the proposed models do improve retrieval quality.
From an IR perspective, the models studied are therefore verified as expressive
conceptualizations of central phenomena in science. Thus, it could be shown
that the IR perspective can significantly contribute to a better understanding
of scholarly structures and activities.
Comment: 26 pages, to appear in Scientometrics
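The re-ranking by the Bradford law mentioned in this abstract can be illustrated with a minimal sketch. It assumes, as a simplification not spelled out in the abstract, that a "core" journal is one contributing many documents to the current result set, and that documents from such journals are moved to the top. The function name `bradfordize` and the sample data are hypothetical.

```python
from collections import Counter

def bradfordize(results):
    """Re-rank (doc_id, journal) pairs so that documents from journals
    contributing the most hits to this result set come first.

    Assumption: journal productivity within the result set stands in
    for Bradford's core/periphery zones.
    """
    journal_counts = Counter(journal for _, journal in results)
    # sorted() is stable, so the original order is kept among ties.
    return sorted(results, key=lambda r: -journal_counts[r[1]])

# Hypothetical result list in its original retrieval order.
hits = [
    ("d1", "Minor Journal"),
    ("d2", "Core Journal"),
    ("d3", "Core Journal"),
    ("d4", "Other Journal"),
    ("d5", "Core Journal"),
]
reranked = bradfordize(hits)
reranked_ids = [doc_id for doc_id, _ in reranked]
```

Here the three documents from the most productive journal move ahead of the rest, while the original order is preserved inside each group.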
Quantifying the consistency of scientific databases
Science is a social process with far-reaching impact on our modern society.
In recent years, it has become possible for the first time to study science
itself scientifically. This is enabled by massive amounts of data on scientific
publications that is increasingly becoming available. The data is contained in
several databases such as Web of Science or PubMed, maintained by various
public and private entities. Unfortunately, these databases are not always
consistent, which considerably hinders this study. Relying on the powerful
framework of complex networks, we conduct a systematic analysis of the
consistency among six major scientific databases. We found that identifying a
single "best" database is far from easy. Nevertheless, our results indicate
appreciable differences in mutual consistency of different databases, which we
interpret as recipes for future bibliometric studies.
Comment: 20 pages, 5 figures, 4 tables
Biodiversity informatics: the challenge of linking data and the role of shared identifiers
A major challenge facing biodiversity informatics is integrating data stored in widely distributed databases. Initial efforts have relied on taxonomic names as the shared identifier linking records in different databases. However, taxonomic names have limitations as identifiers, being neither stable nor globally unique, and the pace of molecular taxonomic and phylogenetic research means that a lot of information in public sequence databases is not linked to formal taxonomic names. This review explores the use of other identifiers, such as specimen codes and GenBank accession numbers, to link otherwise disconnected facts in different databases. The structure of these links can also be exploited using the PageRank algorithm to rank the results of searches on biodiversity databases. The key to rich integration is a commitment to deploy and reuse globally unique, shared identifiers (such as DOIs and LSIDs), and the implementation of services that link those identifiers.
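As a rough illustration of ranking linked records with PageRank, the sketch below runs a plain power iteration over a small link graph. The record identifiers, the link structure, and the parameter values (damping factor 0.85, 50 iterations) are assumptions chosen for the example, not taken from the review.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank.

    links maps each record identifier to the identifiers it links to.
    Records without outgoing links spread their rank evenly over all
    records, so the ranks always sum to 1.
    """
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank

# Hypothetical records: two sequences and a paper all cite one specimen.
links = {
    "sequence:A": ["specimen:X"],
    "sequence:B": ["specimen:X"],
    "paper:P": ["sequence:A", "sequence:B", "specimen:X"],
    "specimen:X": [],
}
ranks = pagerank(links)
top_record = max(ranks, key=ranks.get)
```

In this toy graph the specimen record, which every other record points to, ends up ranked highest, which is the behaviour a search service would use to order its results.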
How to Compare the Scientific Contributions between Research Groups
We present a method to analyse the scientific contributions between research
groups. Given multiple research groups, we construct their journal/proceeding
graphs and then compute the similarity/gap between them using network analysis.
This analysis can be used for measuring similarity/gap of the topics/qualities
between research groups' scientific contributions. We demonstrate the
practicality of our method by comparing the scientific contributions of Korean
researchers with those of researchers worldwide in information security from
2006 to 2008. The empirical analysis shows that current security research in
South Korea has been isolated from the global research trend.
The NASA Astrophysics Data System: Architecture
The powerful discovery capabilities available in the ADS bibliographic
services are possible thanks to the design of a flexible search and retrieval
system based on a relational database model. Bibliographic records are stored
as a corpus of structured documents containing fielded data and metadata, while
discipline-specific knowledge is segregated in a set of files independent of
the bibliographic data itself.
The creation and management of links to both internal and external resources
associated with each bibliography in the database is made possible by
representing them as a set of document properties and their attributes.
To improve global access to the ADS data holdings, a number of mirror sites
have been created by cloning the database contents and software on a variety of
hardware and software platforms.
The procedures used to create and manage the database and its mirrors have
been written as a set of scripts that can be run in either an interactive or
unsupervised fashion.
The ADS can be accessed at http://adswww.harvard.edu
Comment: 25 pages, 8 figures, 3 tables