13,319 research outputs found

    How to improve the prediction based on citation impact percentiles for years shortly after the publication date?

    Full text link
    The findings of Bornmann, Leydesdorff, and Wang (in press) revealed that the consideration of journal impact improves the prediction of long-term citation impact. This paper further explores the possibility of improving citation impact measurements on the base of a short citation window by the consideration of journal impact and other variables, such as the number of authors, the number of cited references, and the number of pages. The dataset contains 475,391 journal papers published in 1980 and indexed in Web of Science (WoS, Thomson Reuters), and all annual citation counts (from 1980 to 2010) for these papers. As an indicator of citation impact, we used percentiles of citations calculated using the approach of Hazen (1914). Our results show that citation impact measurement can really be improved: If factors generally influencing citation impact are considered in the statistical analysis, the explained variance in the long-term citation impact can be much increased. However, this increase is only visible when using the years shortly after publication but not when using later years.Comment: Accepted for publication in the Journal of Informetrics. arXiv admin note: text overlap with arXiv:1306.445

    Investigating the interplay between fundamentals of national research systems: performance, investments and international collaborations

    Get PDF
    We discuss, at the macro-level of nations, the contribution of research funding and rate of international collaboration to research performance, with important implications for the science of science policy. In particular, we cross-correlate suitable measures of these quantities with a scientometric-based assessment of scientific success, studying both the average performance of nations and their temporal dynamics in the space defined by these variables during the last decade. We find significant differences among nations in terms of efficiency in turning (financial) input into bibliometrically measurable output, and we confirm that growth of international collaboration positively correlate with scientific success, with significant benefits brought by EU integration policies. Various geo-cultural clusters of nations naturally emerge from our analysis. We critically discuss the possible factors that potentially determine the observed patterns

    Evaluating a Departmentā€™s Research: Testing the Leiden Methodology in Business and Management

    Get PDF
    The Leiden methodology (LM), also sometimes called the ā€œcrown indicatorā€, is a quantitative method for evaluating the research quality of a research group or academic department based on the citations received by the group in comparison to averages for the field. There have been a number of applications but these have mainly been in the hard sciences where the data on citations, provided by the ISI Web of Science (WoS), is more reliable. In the social sciences, including business and management, many journals and books are not included within WoS and so the LM has not been tested here. In this research study the LM has been applied on a dataset of over 3000 research publications from three UK business schools. The results show that the LM does indeed discriminate between the schools, and has a degree of concordance with other forms of evaluation, but that there are significant limitations and problems within this discipline

    Cell line name recognition in support of the identification of synthetic lethality in cancer from text

    Get PDF
    Motivation: The recognition and normalization of cell line names in text is an important task in biomedical text mining research, facilitating for instance the identification of synthetically lethal genes from the literature. While several tools have previously been developed to address cell line recognition, it is unclear whether available systems can perform sufficiently well in realistic and broad-coverage applications such as extracting synthetically lethal genes from the cancer literature. In this study, we revisit the cell line name recognition task, evaluating both available systems and newly introduced methods on various resources to obtain a reliable tagger not tied to any specific subdomain. In support of this task, we introduce two text collections manually annotated for cell line names: the broad-coverage corpus Gellus and CLL, a focused target domain corpus. Results: We find that the best performance is achieved using NERsuite, a machine learning system based on Conditional Random Fields, trained on the Gellus corpus and supported with a dictionary of cell line names. The system achieves an F-score of 88.46% on the test set of Gellus and 85.98% on the independently annotated CLL corpus. It was further applied at large scale to 24 302 102 unannotated articles, resulting in the identification of 5 181 342 cell line mentions, normalized to 11 755 unique cell line database identifiers

    Geodesics in Heat

    Full text link
    We introduce the heat method for computing the shortest geodesic distance to a specified subset (e.g., point or curve) of a given domain. The heat method is robust, efficient, and simple to implement since it is based on solving a pair of standard linear elliptic problems. The method represents a significant breakthrough in the practical computation of distance on a wide variety of geometric domains, since the resulting linear systems can be prefactored once and subsequently solved in near-linear time. In practice, distance can be updated via the heat method an order of magnitude faster than with state-of-the-art methods while maintaining a comparable level of accuracy. We provide numerical evidence that the method converges to the exact geodesic distance in the limit of refinement; we also explore smoothed approximations of distance suitable for applications where more regularity is required
    • ā€¦
    corecore