73 research outputs found

    The h-index as an almost-exact function of some basic statistics

    As is known, the h-index, h, is an exact function of the citation pattern. At the same time, and more generally, it is recognized that h is "loosely" related to the values of some basic statistics, such as the number of publications and the number of citations. In the present study we introduce a formula that expresses the h-index as an almost-exact function of four basic statistics. On the basis of an empirical study, in which we consider citation data obtained from two different lists of journals from two quite different scientific fields, we provide evidence that our ready-to-use formula is able to predict the h-index very accurately (at least for practical purposes). For comparison, alternative estimators of the h-index have been considered and their performance evaluated on the same dataset. We conclude that, in addition to its own interest as an effective proxy representation of the h-index, the formula introduced may provide new insights into the "factors" determining the value of the h-index, and how they interact with each other.
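    The statement that h is an exact function of the citation pattern can be illustrated with a minimal sketch. It computes the h-index directly from a list of per-paper citation counts, using the standard definition (the largest h such that h papers have at least h citations each). The function name `h_index` is illustrative and does not come from the paper; the paper's almost-exact formula is not reproduced here.

```python
def h_index(citations):
    """Return the h-index of a list of per-paper citation counts.

    Standard definition: the largest h such that at least h papers
    have at least h citations each.
    """
    ranked = sorted(citations, reverse=True)  # most cited paper first
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have >= rank citations
        else:
            break  # counts only decrease from here, so h cannot grow
    return h


if __name__ == "__main__":
    print(h_index([10, 8, 5, 4, 3]))
```

    Sorting first keeps the logic transparent; a counting-based version would avoid the sort but is harder to read.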

    A new bibliometric index based on the shape of the citation distribution

    In order to improve the h-index in terms of its accuracy and sensitivity to the form of the citation distribution, we propose a new bibliometric index. The basic idea is to define, for any author with a given number of citations, an “ideal” citation distribution which represents a benchmark in terms of number of papers and number of citations per publication, and to obtain an index which increases its value when the real citation distribution approaches its ideal form. The method is very general because the ideal distribution can be defined differently according to the main objective of the index. In this paper we propose to define it by a “squared-form” distribution: this is consistent with many popular bibliometric indices, which reach their maximum value when the distribution is basically a “square”. This approach generally rewards the more regular and reliable researchers, and it seems to be especially suitable for dealing with common situations such as applications for academic positions. To show the advantages of the proposed index, some mathematical properties are proved and an application to real data is presented.

    A theoretical model of the relationship between the h-index and other simple citation indicators

    Of the existing theoretical formulas for the h-index, those recently suggested by Burrell (J Informetr 7: 774-783, 2013b) and by Bertoli-Barsotti and Lando (J Informetr 9(4): 762-776, 2015) have proved very effective in estimating the actual value of the h-index (Hirsch, Proc Natl Acad Sci USA 102: 16569-16572, 2005), at least at the level of the individual scientist. These approaches lead (or may lead) to two slightly different formulas, being based, respectively, on a "standard" and a "shifted" version of the geometric distribution. In this paper, we review the genesis of these two formulas, which we shall call the "basic" and "improved" Lambert-W formula for the h-index, and compare their effectiveness with that of a number of instances taken from the well-known Glänzel-Schubert class of models for the h-index (based, instead, on a Paretian model) by means of an empirical study. All the formulas considered in the comparison are "ready-to-use", i.e., functions of simple citation indicators such as: the total number of publications; the total number of citations; the total number of cited papers; the number of citations of the most cited paper. The empirical study is based on citation data obtained from two different sets of journals belonging to two different scientific fields: more specifically, 231 journals from the area of "Statistics and Mathematical Methods" and 100 journals from the area of "Economics, Econometrics and Finance", totaling almost 100,000 and 20,000 publications, respectively. The citation data refer to different publication/citation time windows, different types of "citable" documents, and alternative approaches to the analysis of the citation process ("prospective" and "retrospective"). We conclude that, especially in its improved version, the Lambert-W formula for the h-index provides a quite robust and effective ready-to-use rule that should be preferred to other known formulas if one's goal is (simply) to derive a reliable estimate of the h-index.

    Comparison of two bias reduction techniques for the Rasch model

    This study examines the effect of two different techniques of bias reduction in the case of the fixed persons-fixed items formulation of the Rasch model. A first approach can be considered “corrective”, because it consists simply in correcting ex-post the joint maximum likelihood estimates by a factor (m-1)/m, where m represents the number of items and/or persons. A second approach, which is an application of a quite general formula for reducing the maximum likelihood estimation bias, can be considered “preventive”, because it arises from a modification of the score function. A comparative study of these two techniques was done using simulated data.
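    The "corrective" approach described above amounts to a simple rescaling, which can be sketched as follows. The function name `correct_jml_estimates` and the list-based interface are assumptions made for illustration. The abstract does not specify which value of m (items or persons) applies to which set of parameters, so that choice is left to the caller. The "preventive" approach is not sketched because its formula is not given here.

```python
def correct_jml_estimates(estimates, m):
    """Shrink joint maximum likelihood estimates by the factor (m - 1) / m.

    `estimates` is a list of already-computed JML parameter estimates;
    `m` is the number of items and/or persons (chosen by the caller,
    since the source does not say which applies to which parameters).
    """
    if m < 2:
        raise ValueError("m must be at least 2 for the correction to be useful")
    factor = (m - 1) / m
    return [factor * value for value in estimates]


if __name__ == "__main__":
    # Hypothetical item difficulty estimates from a 5-item test.
    print(correct_jml_estimates([1.0, -0.5, 0.25, -1.0, 0.25], m=5))
```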

    A modified minimum divergence estimator: some preliminary results for the Rasch model

    Since its introduction, the joint maximum likelihood (JML) has been widely used as an estimation method for Rasch measurement models. As is well known, when the JML method is used, all item and person parameters are regarded as unknowns to be estimated. In this paper we focus on some drawbacks of the JML for the Rasch model: viz. i) the occasional non-existence of estimates, and ii) the bias of item parameter estimates. We propose a new estimation method which is based on the Minimum Divergence Estimation approach and consists in appropriately modifying the empirical distribution function. We provide empirical evidence that this method can solve the problem of the non-existence of the estimates and, at the same time, can reduce the bias of item parameter estimates compared to those obtained with both traditional JML estimation and the (k − 1)/k correction factor (where k is the number of items) commonly applied in JML software.

    An order-preserving property of the maximum likelihood estimates for the Rasch model

    The paper proves that the maximum likelihood estimates of item and person parameters of the Rasch model preserve the order of item and person total scores. The result is valid for conditional and unconditional maximum likelihood estimation.
    Keywords: Rasch model; Maximum likelihood estimation; Conditional maximum likelihood; Arrangement increasing functions; Monotone likelihood ratio

    Measuring the citation impact of journals with generalized Lorenz curves

    To improve comparisons of journals, which are typically based on single-value indicators, such as the journal impact factor (JIF), this paper proposes a functional approach. We discuss and interpret three progressively finer dominance relations. The first one corresponds to a comparison between the quantile functions of the citation distributions. The second one consists in comparing the integrals of the quantile functions, namely the generalized Lorenz curves (GLCs). The third one consists in comparing the integrals of the GLCs, where the integration is designed to emphasize the role of the "central body" of the articles of the journal. Although dominance relations are generally not complete orders, we demonstrate with an empirical analysis that it is possible to increase significantly the proportion of pairs of journals that are comparable by moving from the first to the second criterion, and then from the second to the third. Because, in practical applications, it may be convenient to reduce such a functional comparison to a scalar comparison between indicators, we follow an axiomatic approach to identify classes of indicators that are isotonic with the criteria introduced. We demonstrate that the established JIF may be usefully improved if it is corrected simply by multiplying it by one minus the Gini coefficient. The resulting index, defined as stabilized-JIF, has many attractive features and it is isotonic with all the dominance relations introduced.
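    The correction described above (JIF multiplied by one minus the Gini coefficient) can be sketched as follows. Two assumptions are made here that the abstract does not confirm: the JIF is approximated as the mean number of citations per article in the supplied list, and the Gini coefficient is computed with the common rank-weighted sample formula. The function names `gini` and `stabilized_jif` are illustrative.

```python
def gini(values):
    """Gini coefficient via the rank-weighted sample formula.

    With values sorted ascending as x_1..x_n:
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    Returns 0.0 for an empty or all-zero input (a convention chosen here).
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def stabilized_jif(citations):
    """Mean citations per article (assumed stand-in for the JIF) times (1 - Gini)."""
    if not citations:
        raise ValueError("at least one article is required")
    jif = sum(citations) / len(citations)
    return jif * (1 - gini(citations))


if __name__ == "__main__":
    even = [3, 3, 3, 3]      # citations spread evenly across articles
    skewed = [0, 0, 0, 12]   # same mean, all citations on one article
    print(stabilized_jif(even), stabilized_jif(skewed))
```

    The two hypothetical journals in the usage example have the same mean, but the one whose citations are concentrated in a single article receives a lower corrected value.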

    How mean rank and mean size may determine the generalised Lorenz curve: With application to citation analysis

    Within the wide framework of information production processes, we present a conversion formula that expresses the generalised Lorenz (GL) curve of a size-frequency distribution as a function of the corresponding rank-size distribution using a fully discrete modelling approach. Based on this conversion formula, we introduce a somewhat universal model for the GL curve of the empirical size-frequency distribution. This study's approach to determining the GL curve is indirect, as we obtain our model for the size-frequency framework by modelling the rank-size distribution and not by directly modelling the size distribution or the GL curve itself, as is usually done. Our GL curve model is particularly appealing because it provides a simple and economical description of the distribution that depends on only three quantities: the (i) mean size, (ii) mean rank, and (iii) maximal rank. The model's performance in predicting the shape of the empirical GL curve is illustrated through a case study involving citation analysis.
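    For orientation, the empirical GL curve that such a model aims to predict can be computed directly from the data. The sketch below uses the usual definition, in which the ordinate at p = k/n is the sum of the k smallest sizes divided by n, so the curve ends at the mean size. It does not implement the paper's conversion formula or its three-quantity model, neither of which is given in the abstract; the function name `generalized_lorenz` is illustrative.

```python
from itertools import accumulate


def generalized_lorenz(sizes):
    """Return the empirical generalised Lorenz curve as (p, GL(p)) points.

    For sizes sorted ascending, GL(k / n) is the sum of the k smallest
    sizes divided by n, so GL(0) = 0 and GL(1) equals the mean size.
    """
    xs = sorted(sizes)
    n = len(xs)
    if n == 0:
        raise ValueError("at least one observation is required")
    ordinates = [0.0] + [partial / n for partial in accumulate(xs)]
    return [(k / n, value) for k, value in enumerate(ordinates)]


if __name__ == "__main__":
    # Hypothetical citation counts for five papers.
    for p, value in generalized_lorenz([0, 1, 1, 4, 14]):
        print(f"p={p:.1f}  GL={value:.2f}")
```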

    Modelling Missingness with a Rasch-Type model

    In this paper we focus on a model-based approach to the treatment of missing data due to examinees' nonresponse, in the context of Item Response Theory (IRT). By a model-based approach we mean that item nonresponses are to be included in the analysis; indeed, we assume that nonresponses are caused by a specific latent trait, summarizing the response propensity of the examinee. Then, the idea is to postulate the existence of two latent traits: one for response propensity and the other for ability/proficiency. Different models have been proposed in the literature. In this paper, a new class of multidimensional IRT models, called Rasch-Rasch models, is introduced. The Rasch-Rasch model belongs to the wider class of Rasch models and, as a member of the exponential family, it can be viewed as a generalized linear mixed model. Real and artificial datasets are used to illustrate the characteristics of this new model.