1,220 research outputs found
The relation between Pearson's correlation coefficient r and Salton's cosine measure
The relation between Pearson's correlation coefficient and Salton's cosine
measure is revealed based on the different possible values of the division of
the L1-norm and the L2-norm of a vector. These different values yield a sheaf
of increasingly straight lines which form together a cloud of points, being the
investigated relation. The theoretical results are tested against the author
co-citation relations among 24 informetricians for whom two matrices can be
constructed, based on co-citations: the asymmetric occurrence matrix and the
symmetric co-citation matrix. Both examples completely confirm the theoretical
results. The results enable us to specify an algorithm which provides a
threshold value for the cosine above which none of the corresponding Pearson
correlations would be negative. Using this threshold value can be expected to
optimize the visualization of the vector space.
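As a minimal sketch of why such a threshold matters, the snippet below computes both measures for a pair of vectors. It is not the paper's threshold algorithm. It relies only on the standard identity that Pearson's r equals the cosine of the mean-centered vectors, and the two vectors are made-up illustration data.

```python
import math

def cosine(x, y):
    """Salton's cosine: dot product divided by the product of the L2-norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson's r, computed as the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])

# Hypothetical co-citation count vectors (not taken from the paper).
x = [3, 0, 1, 2]
y = [0, 2, 1, 0]
print(cosine(x, y), pearson(x, y))
```

For this pair the cosine is positive while Pearson's r is negative, which is the kind of case a cosine threshold is meant to exclude.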
The w-index: A significant improvement of the h-index
I propose a new measure, the w-index, as a particularly simple and useful way
to assess the integrated impact of a researcher's work, especially his or her
excellent papers. The w-index can be defined as follows: If w of a researcher's
papers have at least 10w citations each and the other papers have fewer than
10(w+1) citations, his/her w-index is w. It is a significant improvement of the
h-index.
Comment: 7 pages, 3 tables, small changes from v
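The definition above can be sketched in a few lines of Python. The function name `w_index` and the sample citation counts are illustrative assumptions, not taken from the paper.

```python
def w_index(citations):
    """Largest w such that w papers have at least 10*w citations each."""
    ranked = sorted(citations, reverse=True)
    w = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= 10 * rank:
            w = rank
        else:
            # Citations are sorted in descending order, so no later rank can qualify.
            break
    return w

# Hypothetical citation record: the top three papers reach 10, 20 and 30
# citations respectively, but the fourth paper has fewer than 40.
print(w_index([120, 45, 31, 29, 8]))
```

Sorting once and scanning by rank keeps the computation at O(n log n), the same approach commonly used for the h-index.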
A case study of the Hirsch index for 26 non-prominent physicists
The h index was introduced by Hirsch to quantify an individual's scientific
research output. It has been widely used in different fields to show the
relevance of the research work of prominent scientists. I have worked out 26
practical cases of physicists who are not so prominent. Therefore this case
study should be more relevant to discuss various features of the Hirsch index
which are interesting or disturbing or both for the more average situation. In
particular, I investigate quantitatively some pitfalls in the evaluation and
the influence of self-citations.
Comment: 13 pages, 3 figures, updated after extensive language editing, no
other changes to first version
Equalities between h-type indices and definitions of rational h-type indicators
Purpose: To show for which publication-citation arrays h-type indices are equal and to reconsider rational h-type indices. Results for these research questions fill some gaps in existing basic knowledge about h-type indices.
Design/methodology/approach: The results and introduction of new indicators are based on well-known definitions.
Findings: The research purpose has been reached: answers to the first questions are obtained and new indicators are defined.
Research limitations: h-type indices do not meet the Bouyssou-Marchant independence requirement.
Practical implications: On the one hand, more insight has been obtained for well-known indices such as the h- and the g-index and on the other hand, simple extensions of existing indicators have been added to the bibliometric toolbox. Relative rational h-type indices are more useful for individuals than the existing absolute ones.
Originality/value: Answers to basic questions such as "when are the values of two h-type indices equal" are provided. A new rational h-index is introduced.
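For readers unfamiliar with the two indices named above, here is a minimal sketch using their standard definitions: h is the largest number such that h papers have at least h citations each, and g is the largest number such that the top g papers together have at least g squared citations. The sketch assumes g is capped at the number of papers, and the citation arrays are made-up examples. The paper's new rational h-index is not implemented here.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    # In descending order the condition holds exactly for the first h ranks.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)

def g_index(citations):
    """Largest g such that the top g papers have at least g*g citations in total."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g

# Hypothetical arrays: the first gives different values, the second equal ones.
print(h_index([10, 8, 5, 4, 3]), g_index([10, 8, 5, 4, 3]))
print(h_index([3, 3, 3]), g_index([3, 3, 3]))
```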
The citation triad: an overview of a scientist's publication output based on Ferrers diagrams
In a recent work by Anderson, Hankin, and Killworth (2008), Ferrers diagrams and Durfee squares are used to represent the scientific output of a scientist and construct a new h-based bibliometric indicator, the tapered h-index (hT). In the first part of this paper we examine hT, identifying its main drawbacks and weaknesses: an arbitrary scoring system and an illusory increase in discrimination power compared to h. Subsequently, we propose a new bibliometric tool, the citation triad (CT), that better exploits the information contained in a Ferrers diagram, giving a synthetic overview of a scientist's publication output. The advantages of this new approach are discussed in detail. The argument is supported by several examples based on empirical data
Scientific impact evaluation and the effect of self-citations: mitigating the bias by discounting h-index
In this paper, we propose a measure to assess scientific impact that
discounts self-citations and does not require any prior knowledge of their
distribution among publications. This index can be applied to both researchers
and journals. In particular, we show that it fills the gap of h-index and
similar measures that do not take into account the effect of self-citations for
authors or journals impact evaluation. The paper provides two real-world
examples: in the former, we evaluate the research impact of the most productive
scholars in Computer Science (according to DBLP); in the latter, we revisit the
impact of the journals ranked in the 'Computer Science Applications' section of
SCImago. We observe how self-citations, in many cases, affect the rankings
obtained according to different measures (including h-index and ch-index), and
show how the proposed measure mitigates this effect.
Zipf's law and log-normal distributions in measures of scientific output across fields and institutions: 40 years of Slovenia's research as an example
Slovenia's Current Research Information System (SICRIS) currently hosts
86,443 publications with citation data from 8,359 researchers working on the
whole plethora of social and natural sciences from 1970 till present. Using
these data, we show that the citation distributions derived from individual
publications have Zipfian properties in that they can be fitted by a power law
$P(x) \sim x^{-\alpha}$, with $\alpha$ between 2.4 and 3.1 depending on the
institution and field of research. Distributions of indexes that quantify the
success of researchers rather than individual publications, on the other hand,
cannot be associated with a power law. We find that for Egghe's g-index and
Hirsch's h-index the log-normal form $P(x) \sim \exp[-a \ln x - b (\ln x)^2]$
applies best, with $a$ and $b$ depending moderately on the underlying set of
researchers. In special cases, particularly for institutions with a strongly
hierarchical constitution and research fields with high self-citation rates,
exponential distributions can be observed as well. Both indexes yield
distributions with equivalent statistical properties, which is a strong
indicator for their consistency and logical connectedness. At the same time,
differences in the assessment of citation histories of individual researchers
strengthen their importance for properly evaluating the quality and impact of
scientific output.
Comment: 8 pages, 3 figures; accepted for publication in Journal of
Informetrics [supplementary material available at
http://www.matjazperc.com/sicris/stats.html]
Growth and structure of Slovenia's scientific collaboration network
We study the evolution of Slovenia's scientific collaboration network from
1960 till present with a yearly resolution. For each year the network was
constructed from publication records of Slovene scientists, whereby two were
connected if, up to the given year inclusive, they have coauthored at least one
paper together. Starting with no more than 30 scientists with an average of 1.5
collaborators in the year 1960, the network to date consists of 7380
individuals that, on average, have 10.7 collaborators. We show that, in spite
of the broad myriad of research fields covered, the networks form "small
worlds" and that indeed the average path between any pair of scientists scales
logarithmically with size after the largest component becomes large enough.
Moreover, we show that the network growth is governed by near-linear
preferential attachment, giving rise to a log-normal distribution of
collaborators per author, and that the average starting year is roughly
inversely proportional to the number of collaborators eventually acquired.
Understandably, not all that became active early have till now gathered many
collaborators. We also give results for the clustering coefficient and the
diameter of the network over time, and compare our conclusions with those
reported previously.
Comment: 10 pages, 3 figures; accepted for publication in Journal of
Informetrics [related work available at http://arxiv.org/abs/1003.1018 and
http://www.matjazperc.com/sicris/stats.html]
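The cumulative network construction described in the abstract above (two scientists are linked if they have coauthored at least one paper up to and including a given year) can be sketched as follows. The record format, the function name `average_collaborators`, and the sample data are hypothetical. Counting authors of solo papers as nodes with zero collaborators is an assumption of this sketch.

```python
from itertools import combinations

def average_collaborators(papers, year):
    """Mean number of distinct coauthors per scientist, using papers up to `year` inclusive."""
    neighbours = {}
    for pub_year, authors in papers:
        if pub_year > year:
            continue
        for author in authors:
            # Every author becomes a node, even with no coauthors (assumption).
            neighbours.setdefault(author, set())
        for a, b in combinations(set(authors), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    if not neighbours:
        return 0.0
    return sum(len(n) for n in neighbours.values()) / len(neighbours)

# Hypothetical publication records: (year, list of authors).
papers = [
    (1960, ["A", "B"]),
    (1961, ["B", "C", "D"]),
]
print(average_collaborators(papers, 1960))
print(average_collaborators(papers, 1961))
```

Because links accumulate and are never removed, calling the function for successive years gives the yearly resolution mentioned in the abstract.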
Analysis of the Hirsch index's operational properties
The h-index is a relatively recent bibliometric indicator for assessing the research output of scientists, based on the publications and the corresponding citations. Due to the original characteristics of easy calculation and immediate intuitive meaning, this indicator has become very popular in the scientific community. Also, it received some criticism essentially because of its "low" accuracy. The contribution of this paper is to provide a detailed analysis of the h-index, from the point of view of the indicator operational properties. This work can be helpful to better understand the peculiarities and limits of h and avoid its misuse. Finally, we suggest an additional indicator (f) that complements h with the information related to the publication age, not compromising the original simplicity and immediacy of understanding
Hirsch-type equations and bundles
We define Hirsch-type equations and bundles being common generalizations of
the defining equations of e.g. Hirsch-bundles, g-bundles and Kosmulski-bundles.
In this way, common properties of all these bundles can be proved. The main
result proves basic inequalities for these bundles. They form the basis for
convergence results as well as for criteria for these bundles to be impact
bundles.