2,312 research outputs found

    The Symbiotic Relationship Between Information Retrieval and Informetrics

    Informetrics and information retrieval (IR) represent fundamental areas of study within information science. Historically, researchers have not fully capitalized on the potential research synergies between these two areas. Data sources used in traditional informetrics studies have their analogues in IR, with similar types of empirical regularities found in IR system content and use. Methods for data collection and analysis used in informetrics can help to inform IR system development and evaluation. Areas of application have included automatic indexing, index term weighting, and understanding user query and session patterns through the quantitative analysis of user transaction logs. Similarly, developments in database technology have made the study of informetric phenomena less cumbersome, and recent innovations used in IR research, such as language models and ranking algorithms, provide new tools that may be applied to research problems of interest to informetricians. Building on the author's previous work (Wolfram 2003), this paper reviews a sample of relevant literature published primarily since 2000 to highlight how each area of study may help to inform and benefit the other.
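The index term weighting mentioned in this abstract is commonly operationalized as tf-idf. A minimal sketch (the toy corpus below is invented for illustration, not taken from the paper):

```python
import math

def tf_idf(term, doc, corpus):
    """Classic tf-idf: the term's frequency within the document,
    scaled by its inverse document frequency across the corpus."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    idf = math.log(len(corpus) / df) if df else 0.0
    return tf * idf

corpus = [
    ["citation", "analysis", "of", "journals"],
    ["retrieval", "of", "documents"],
    ["citation", "networks"],
]
# "of" occurs in two of the three documents, so it receives a lower
# idf (and hence a lower weight) than the rarer term "analysis".
print(tf_idf("analysis", corpus[0], corpus))
print(tf_idf("of", corpus[0], corpus))
```

The same frequency-based reasoning underlies the transaction-log regularities the review discusses.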

    An evaluation of Bradfordizing effects

    The purpose of this paper is to apply and evaluate the bibliometric method Bradfordizing for information retrieval (IR) experiments. Bradfordizing is used for generating core document sets for subject-specific questions and for reordering result sets from distributed searches. The method is applied and tested in a controlled scenario of scientific literature databases from the social and political sciences, economics, psychology, and medical science (SOLIS, SoLit, USB Köln Opac, CSA Sociological Abstracts, World Affairs Online, Psyndex and Medline) and 164 standardized topics. An evaluation of the method and its effects is carried out in two laboratory-based information retrieval experiments (CLEF and KoMoHe) using a controlled document corpus and human relevance assessments. The results show that Bradfordizing is a very robust method for re-ranking the main document types (journal articles and monographs) in today's digital libraries (DL). The IR tests show that relevance distributions after re-ranking improve significantly if articles in the core are compared with articles in the succeeding zones. The items in the core are significantly more often assessed as relevant than items in zone 2 (z2) or zone 3 (z3). The improvements between the zones are statistically significant based on the Wilcoxon signed-rank test and the paired t-test.
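The core of Bradfordizing can be sketched as follows (journal names and counts are invented for illustration): rank the journals in a result set by the number of relevant articles they contribute, then cut the ranked list into three zones each holding roughly one third of the articles; documents from core-zone journals are then ranked first.

```python
def bradford_zones(journal_counts, n_zones=3):
    """Partition journals, ranked by article count, into zones that
    each hold roughly an equal share of the total articles."""
    ranked = sorted(journal_counts.items(), key=lambda kv: -kv[1])
    total = sum(journal_counts.values())
    target = total / n_zones
    zones, current, acc = [], [], 0
    for journal, count in ranked:
        current.append(journal)
        acc += count
        # Close a zone once it holds its share, keeping the last zone open
        if acc >= target and len(zones) < n_zones - 1:
            zones.append(current)
            current, acc = [], 0
    zones.append(current)
    return zones

counts = {"J1": 60, "J2": 25, "J3": 10, "J4": 8, "J5": 7,
          "J6": 5, "J7": 5, "J8": 5, "J9": 5}
zones = bradford_zones(counts)
# zone 1 (the core) is a single highly productive journal; the zones
# grow in journal count while holding similar article counts
```

Note the Bradford pattern in the output: the core zone contains one journal, the next zone four, the outer zone four more, each covering roughly a third of the 130 articles.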

    Science Models as Value-Added Services for Scholarly Information Systems

    The paper introduces scholarly Information Retrieval (IR) as a further dimension that should be considered in the science modeling debate. The IR use case is seen as a validation model of the adequacy of science models in representing and predicting structure and dynamics in science. Particular conceptualizations of scholarly activity and structures in science are used as value-added search services to improve retrieval quality: a co-word model depicting the cognitive structure of a field (used for query expansion), the Bradford law of information concentration, and a model of co-authorship networks (both used for re-ranking search results). An evaluation of retrieval quality when these science-model-driven services are used shows that the proposed models do provide beneficial effects on retrieval quality. From an IR perspective, the models studied are therefore verified as expressive conceptualizations of central phenomena in science. Thus, it could be shown that the IR perspective can significantly contribute to a better understanding of scholarly structures and activities. (Comment: 26 pages, to appear in Scientometrics)
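Co-word-based query expansion, one of the value-added services described here, can be sketched as follows. The co-occurrence counts below are invented placeholders: terms that co-occur most often with the query terms are appended to the query.

```python
from collections import Counter

def expand_query(query_terms, cooccurrence, k=2):
    """Append the k terms that co-occur most frequently
    with any of the original query terms."""
    scores = Counter()
    for term in query_terms:
        for neighbor, count in cooccurrence.get(term, {}).items():
            if neighbor not in query_terms:
                scores[neighbor] += count
    return list(query_terms) + [t for t, _ in scores.most_common(k)]

# Invented co-word counts from a hypothetical social-science corpus
cooc = {
    "unemployment": {"labor market": 12, "poverty": 7, "inflation": 3},
    "youth": {"labor market": 5, "education": 9},
}
expanded = expand_query(["unemployment", "youth"], cooc)
```

The expanded query then retrieves documents that use related vocabulary the searcher did not type, which is the retrieval-quality effect the paper evaluates.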

    Scopus's Source Normalized Impact per Paper (SNIP) versus a Journal Impact Factor based on Fractional Counting of Citations

    Impact factors (and similar measures such as the SCImago Journal Rankings) suffer from two problems: (i) citation behavior varies among fields of science and therefore leads to systematic differences, and (ii) there are no statistics to inform us whether differences are significant. The recently introduced SNIP indicator of Scopus tries to remedy the first of these two problems, but a number of normalization decisions are involved, which makes it impossible to test for significance. Using fractional counting of citations, based on the assumption that impact is proportionate to the number of references in the citing documents, citations can be contextualized at the paper level and the aggregated impacts of sets can be tested for their significance. It can be shown that the weighted impact of Annals of Mathematics (0.247) is not so much lower than that of Molecular Cell (0.386), despite a five-fold difference between their impact factors (2.793 and 13.156, respectively).
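The fractional counting described here can be sketched in a few lines: each citation is weighted by 1/R, where R is the number of references in the citing document, so a citation from a reference-heavy paper counts for less. The numbers below are invented for illustration, not the paper's data.

```python
def fractional_impact(citing_ref_counts, n_papers):
    """Average fractionally counted citation impact: each citation
    contributes 1/R, with R the reference count of the citing
    document, summed and divided by the size of the cited set."""
    weighted = sum(1.0 / r for r in citing_ref_counts)
    return weighted / n_papers

# A cited set of 2 papers receives three citations, from citing
# papers with 10, 25, and 50 references respectively.
impact = fractional_impact([10, 25, 50], n_papers=2)
```

Because reference-list lengths differ systematically between fields (short in mathematics, long in cell biology), this weighting normalizes field differences at the level of individual citations, which is what allows the paper-level significance tests the abstract mentions.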

    Editorial for the First Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics

    The workshop "Mining Scientific Papers: Computational Linguistics and Bibliometrics" (CLBib 2015), co-located with the 15th International Society of Scientometrics and Informetrics Conference (ISSI 2015), brought together researchers in Bibliometrics and Computational Linguistics in order to study the ways Bibliometrics can benefit from large-scale text analytics and sense mining of scientific papers, thus exploring the interdisciplinarity of Bibliometrics and Natural Language Processing (NLP). The goals of the workshop were to answer questions like: How can we enhance author network analysis and Bibliometrics using data obtained by text analytics? What insights can NLP provide on the structure of scientific writing, on citation networks, and on in-text citation analysis? This workshop is a first step toward fostering reflection on this interdisciplinarity and on the benefits that the two disciplines, Bibliometrics and Natural Language Processing, can derive from it. (Comment: 4 pages, Workshop on Mining Scientific Papers: Computational Linguistics and Bibliometrics at ISSI 2015)

    DataCite as a novel bibliometric source: Coverage, strengths and limitations

    This paper explores the characteristics of DataCite to determine its possibilities and potential as a new bibliometric data source for analyzing the scholarly production of open data. Open science and the increasing data sharing requirements from governments, funding bodies, institutions, and scientific journals have led to a pressing demand for the development of data metrics. As a very first step towards reliable data metrics, we need to better comprehend the limitations and caveats of the information provided by sources of open data. In this paper, we critically examine records downloaded from DataCite's OAI API and elaborate a series of recommendations regarding the use of this source for bibliometric analyses of open data. We highlight issues related to metadata incompleteness, lack of standardization, and ambiguous definitions of several fields. Despite these limitations, we emphasize DataCite's value and potential to become one of the main sources for data metrics development. (Comment: Paper accepted for publication in Journal of Informetrics)
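The kind of completeness audit described can be sketched as below, once records have been downloaded. The field names and sample records are invented placeholders for illustration, not DataCite's actual schema.

```python
def field_completeness(records, fields):
    """Share of records with a non-empty value for each field,
    the basic measure behind a metadata-incompleteness audit."""
    n = len(records)
    return {f: sum(1 for r in records if r.get(f)) / n for f in fields}

# Hypothetical harvested records with gaps in two fields
records = [
    {"doi": "10.1234/a", "creator": "Doe, J.", "resourceType": "Dataset"},
    {"doi": "10.1234/b", "creator": "", "resourceType": "Dataset"},
    {"doi": "10.1234/c", "creator": "Roe, R.", "resourceType": ""},
]
report = field_completeness(records, ["doi", "creator", "resourceType"])
```

Running such a report per field across a full harvest is one concrete way to quantify the incompleteness and standardization issues the paper highlights.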

    Usage History of Scientific Literature: Nature Metrics and Metrics of Nature Publications

    In this study, we analyze the dynamic usage history of Nature publications over time using Nature metrics data. We conduct the analysis from two perspectives. On the one hand, we examine how long it takes before an article's downloads reach 50%/80% of the total; on the other hand, we compare the percentage of total downloads in the 7 days, 30 days, and 100 days after publication. In general, papers are downloaded most frequently within a short time period right after their publication. We also find that, compared with non-Open Access papers, readers' attention to Open Access publications is more enduring. Based on the usage data of a newly published paper, regression analysis can predict its expected total usage counts. (Comment: 11 pages, 5 figures and 4 tables)
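The regression step can be sketched with ordinary least squares, using early downloads (say, the first 7 days) as the predictor and eventual total downloads as the response. The download counts below are invented for illustration; the paper's actual model and data may differ.

```python
def ols(xs, ys):
    """Ordinary least squares fit of y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return a, b

# Invented data: first-week downloads vs. eventual totals for six papers
week1 = [120, 300, 80, 450, 200, 60]
totals = [400, 950, 260, 1400, 640, 210]
a, b = ols(week1, totals)
predicted_total = a + b * 150  # forecast for a paper with 150 early downloads
```

Because most downloads happen shortly after publication, even a crude linear fit on early counts gives a usable forecast of lifetime usage, which is the paper's point.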

    A comparison of two techniques for bibliometric mapping: Multidimensional scaling and VOS

    VOS is a new mapping technique that can serve as an alternative to the well-known technique of multidimensional scaling. We present an extensive comparison between the use of multidimensional scaling and the use of VOS for constructing bibliometric maps. In our theoretical analysis, we show the mathematical relation between the two techniques. In our experimental analysis, we use the techniques for constructing maps of authors, journals, and keywords. Two commonly used approaches to bibliometric mapping, both based on multidimensional scaling, turn out to produce maps that suffer from artifacts. Maps constructed using VOS turn out not to have this problem. We conclude that, in general, maps constructed using VOS provide a more satisfactory representation of a data set than maps constructed using well-known multidimensional scaling approaches.
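The mathematical relation the authors analyze can be summarized by the two objective functions, in the notation commonly used for these methods (a sketch of the standard formulations, not a quotation from the paper): with $s_{ij}$ the similarity between items $i$ and $j$, $x_i$ an item's location, and $\delta_{ij}$ the input dissimilarities,

```latex
% Multidimensional scaling minimizes a (weighted) stress function,
% fitting inter-item distances to the dissimilarities:
\sigma(x_1,\dots,x_n) = \sum_{i<j} w_{ij}\,\bigl(\delta_{ij} - \|x_i - x_j\|\bigr)^2
% VOS instead minimizes similarity-weighted squared distances,
V(x_1,\dots,x_n) = \sum_{i<j} s_{ij}\,\|x_i - x_j\|^2
% subject to the constraint that the average distance equals one:
\frac{2}{n(n-1)} \sum_{i<j} \|x_i - x_j\| = 1
```

The squared-distance objective pulls strongly related items together while the normalization constraint keeps the map from collapsing, which is why VOS maps avoid the circular boundary artifacts of the MDS-based approaches the paper examines.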