A new methodology for constructing a publication-level classification system of science
Classifying journals or publications into research areas is an essential
element of many bibliometric analyses. Classification usually takes place at
the level of journals, where the Web of Science subject categories are the most
popular classification system. However, journal-level classification systems
have two important limitations: They offer only a limited amount of detail, and
they have difficulties with multidisciplinary journals. To avoid these
limitations, we introduce a new methodology for constructing classification
systems at the level of individual publications. In the proposed methodology,
publications are clustered into research areas based on citation relations. The
methodology is able to deal with very large numbers of publications. We present
an application in which a classification system is produced that includes
almost ten million publications. Based on an extensive analysis of this
classification system, we discuss the strengths and the limitations of the
proposed methodology. Important strengths are the transparency and relative
simplicity of the methodology and its fairly modest computing and memory
requirements. The main limitation of the methodology is its exclusive reliance
on direct citation relations between publications. The accuracy of the
methodology can probably be increased by also taking into account other types
of relations, for instance relations based on bibliographic coupling.
A systematic empirical comparison of different approaches for normalizing citation impact indicators
We address the question of how citation-based bibliometric indicators can best
be normalized to ensure fair comparisons between publications from different
scientific fields and different years. In a systematic large-scale empirical
analysis, we compare a traditional normalization approach based on a field
classification system with three source normalization approaches. We pay
special attention to the selection of the publications included in the
analysis. Publications in national scientific journals, popular scientific
magazines, and trade magazines are not included. Unlike earlier studies, we use
algorithmically constructed classification systems to evaluate the different
normalization approaches. Our analysis shows that a source normalization
approach based on the recently introduced idea of fractional citation counting
does not perform well. Two other source normalization approaches generally
outperform the classification-system-based normalization approach that we
study. Our analysis therefore offers considerable support for the use of
source-normalized bibliometric indicators.
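As a rough illustration of the classification-system-based normalization mentioned in this abstract, the sketch below divides each publication's citation count by the mean citation count of publications in the same field and year. This is a minimal sketch under stated assumptions, not the procedure used in the study: the function name `normalized_citation_scores`, the record layout, and the toy data are all hypothetical.

```python
from collections import defaultdict


def normalized_citation_scores(pubs):
    """Return {publication id: citations / mean citations of its (field, year) group}.

    `pubs` is a list of dicts with keys 'id', 'field', 'year', 'citations'
    (a hypothetical record layout chosen for this illustration).
    """
    # Accumulate the citation sum and publication count per (field, year) group.
    totals = defaultdict(lambda: [0, 0])
    for p in pubs:
        group = totals[(p["field"], p["year"])]
        group[0] += p["citations"]
        group[1] += 1

    scores = {}
    for p in pubs:
        total, count = totals[(p["field"], p["year"])]
        mean = total / count
        # A group with no citations at all gets a score of 0.0 by convention here.
        scores[p["id"]] = p["citations"] / mean if mean > 0 else 0.0
    return scores


# Toy data, invented for illustration only.
pubs = [
    {"id": "a", "field": "math", "year": 2010, "citations": 2},
    {"id": "b", "field": "math", "year": 2010, "citations": 6},
    {"id": "c", "field": "bio", "year": 2010, "citations": 20},
    {"id": "d", "field": "bio", "year": 2010, "citations": 60},
]
scores = normalized_citation_scores(pubs)
```

In this toy example, publications "a" and "c" receive the same normalized score even though their raw citation counts differ by a factor of ten, which is the kind of cross-field comparability that normalization aims for.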
"Seed+Expand": A validated methodology for creating high quality publication oeuvres of individual researchers
The study of science at the individual micro-level frequently requires the
disambiguation of author names. The creation of authors' publication oeuvres
involves matching the list of unique author names to names used in publication
databases. Despite recent progress in the development of unique author
identifiers, e.g., ORCID, VIVO, or DAI, author disambiguation remains a key
problem when it comes to large-scale bibliometric analysis using data from
multiple databases. This study introduces and validates a new methodology
called seed+expand for semi-automatic bibliographic data collection for a given
set of individual authors. Specifically, we identify the oeuvre of a set of
Dutch full professors during the period 1980-2011. In particular, we combine
author records from the National Research Information System (NARCIS) with
publication records from the Web of Science. Starting with an initial list of
8,378 names, we identify "seed publications" for each author using five
different approaches. Subsequently, we "expand" the set of publications using three
different approaches. The different approaches are compared and the resulting
oeuvres are evaluated on precision and recall using a "gold standard" dataset
of authors for which verified publications in the period 2001-2010 are
available.
Comment: Paper accepted for the ISSI 2013, small changes in the text due to
referee comments, one figure added (Fig 3)
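The evaluation on precision and recall against a "gold standard" described in this abstract can be sketched as follows. This is only an illustrative sketch: the function name `precision_recall` and the publication identifiers are hypothetical and do not come from the study.

```python
def precision_recall(retrieved, gold):
    """Compare a retrieved oeuvre with a verified (gold standard) oeuvre.

    Both arguments are iterables of publication identifiers.
    Returns a (precision, recall) tuple.
    """
    retrieved, gold = set(retrieved), set(gold)
    true_positives = len(retrieved & gold)
    # Precision: share of retrieved publications that are verified as correct.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of verified publications that were actually retrieved.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only.
retrieved = ["p1", "p2", "p3", "p4"]
gold = ["p2", "p3", "p4", "p5", "p6"]
precision, recall = precision_recall(retrieved, gold)
```

Here three of the four retrieved publications are verified, and three of the five verified publications were found, so precision is higher than recall for this toy author.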
Reusability in software engineering
This paper surveys recent work concerning reusability in software engineering. The current directions in software reusability are discussed, and the two major approaches of reusable building blocks and reusable patterns are studied. An extensive bibliography, parts of which are annotated, is included.
An overview of decision table literature 1982-1995.
This report gives an overview of the literature on decision tables over the past 15 years. As far as possible, an author-supplied abstract, a number of keywords, and a classification are provided for each reference. In some cases our own comments are added. The purpose of these comments is to show where, how, and why decision tables are used. The literature is classified according to application area, theoretical versus practical character, year of publication, country of origin (not necessarily country of publication), and the language of the document. After a description of the scope of the overview, the classification results and the classification by topic are presented. The main body of the paper is the ordered list of publications with abstract, classification, and comments.