6 research outputs found

The relation between Pearson's correlation coefficient r and Salton's cosine measure

Author: Egghe Leo, Leydesdorff Loet
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

The relation between Pearson's correlation coefficient and Salton's cosine measure is revealed based on the different possible values of the division of the L1-norm and the L2-norm of a vector. These different values yield a sheaf of increasingly straight lines which together form a cloud of points, being the investigated relation. The theoretical results are tested against the author co-citation relations among 24 informetricians for whom two matrices can be constructed, based on co-citations: the asymmetric occurrence matrix and the symmetric co-citation matrix. Both examples completely confirm the theoretical results. The results enable us to specify an algorithm which provides a threshold value for the cosine above which none of the corresponding Pearson correlations would be negative. Using this threshold value can be expected to optimize the visualization of the vector space.
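As a rough illustration of the two measures this abstract compares, the sketch below computes Salton's cosine and Pearson's r for a pair of vectors, using the standard fact that Pearson's r equals the cosine of the mean-centered vectors. The vectors `x` and `y` are hypothetical counts chosen for illustration, not data from the paper, and the code does not attempt to reproduce the paper's threshold algorithm.

```python
import math

def cosine(x, y):
    """Salton's cosine: dot product divided by the product of the L2-norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson's r, computed as the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])

# Hypothetical co-citation counts for two authors (illustrative only).
x = [3, 0, 2, 5, 1]
y = [1, 4, 0, 2, 3]

cos_xy = cosine(x, y)   # positive for these non-negative vectors
r_xy = pearson(x, y)    # negative here, despite the positive cosine
```

For these example vectors the cosine is positive while Pearson's r is negative, which is the kind of disagreement a cosine threshold is meant to rule out.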

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

Quantitative aspects of the management of the modern (scientific) library

Author: Egghe Leo
Publication date: 01/01/2004

This paper and talk examine aspects of data collection for the management of a modern (scientific) library. We discuss: reports as a public relations and public awareness tool, norms and standards, data gathering and its problems in an electronic environment, indicators, complete and incomplete data (sampling) and their uses.

Institutional Repository Universiteit Antwerpen

Utrecht University Repository

Expansion of the field of informetrics: The second special issue

Author: L. Egghe
Publication venue: 'Elsevier BV'

Crossref

Expansion of the field of informetrics: Origins and consequences

Author: Adamic, Albert, Antelman, Bar-Ilan, Barabási, Björneborn, Blackert, Bossy, Bradford, Condon, Egghe, Estoup, Glänzel, He, Hood, Huberman, Ikpaahindi, L. Egghe, Lawani, Lipetz, Lotka, Nacke, Nalimov, Perneger, Pritchard, Rousseau, Schubert, Summers, Tague-Sutcliffe, Zipf
Publication venue: 'Elsevier BV'

Crossref

How to Normalize Co-Occurrence Data? An Analysis of Some Well-Known Similarity Measures

Author: Ahlgren, Anderberg, Baulieu, Borg, Boyack, Braam, Callon, Chung, Church, Cox, De Solla Price, Drasgow, Egghe, Glänzel, Gmür, Gower, Guilford, Hamers, Hardy, Heimeriks, Hinze, Hubálek, Janson, Jarneving, Jones, Klavans, Kopcsa, Kostoff, Law, Leclerc, Leydesdorff, Luukkonen, Manning, McCain, Morillo, Palmer, Peters, Qin, Rip, Rorvig, Rosenberg, Salton, Schneider, Schubert, Simmen, Small, Sokal, Tijssen, Van der Kloot, Van Eck, Van Raan, Vaughan, Waltman, White, Zegers, Zitt
Publication venue: Eck, N.J.P. (Nees Jan) van
Publication date: 01/01/2009

In scientometric research, the use of co-occurrence data is very common. In many cases, a similarity measure is employed to normalize the data. However, there is no consensus among researchers on which similarity measure is most appropriate for normalization purposes. In this paper, we theoretically analyze the properties of similarity measures for co-occurrence data, focusing in particular on four well-known measures: the association strength, the cosine, the inclusion index, and the Jaccard index. We also study the behavior of these measures empirically. Our analysis reveals that there exist two fundamentally different types of similarity measures, namely set-theoretic measures and probabilistic measures. The association strength is a probabilistic measure, while the cosine, the inclusion index, and the Jaccard index are set-theoretic measures. Both our theoretical and our empirical results indicate that co-occurrence data can best be normalized using a probabilistic measure. This provides strong support for the use of the association strength in scientometric research.
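The four measures named in this abstract can be sketched as follows, assuming their commonly stated forms: `c_ij` is the number of co-occurrences of items i and j, and `s_i`, `s_j` are their total occurrence counts. The function names and the example counts are hypothetical, and the association strength is given only up to a constant factor.

```python
import math

def association_strength(c_ij, s_i, s_j):
    """Probabilistic measure: co-occurrences relative to the product of totals."""
    return c_ij / (s_i * s_j)

def cosine(c_ij, s_i, s_j):
    """Set-theoretic measure: co-occurrences over the geometric mean of totals."""
    return c_ij / math.sqrt(s_i * s_j)

def inclusion_index(c_ij, s_i, s_j):
    """Set-theoretic measure: co-occurrences over the smaller of the two totals."""
    return c_ij / min(s_i, s_j)

def jaccard_index(c_ij, s_i, s_j):
    """Set-theoretic measure: co-occurrences over the size of the union."""
    return c_ij / (s_i + s_j - c_ij)

# Hypothetical counts: items i and j occur 20 and 50 times, 10 times together.
c_ij, s_i, s_j = 10, 20, 50
scores = {
    "association_strength": association_strength(c_ij, s_i, s_j),
    "cosine": cosine(c_ij, s_i, s_j),
    "inclusion_index": inclusion_index(c_ij, s_i, s_j),
    "jaccard_index": jaccard_index(c_ij, s_i, s_j),
}
```

All four functions take the same three counts, so they can be swapped freely when comparing normalizations of the same co-occurrence matrix.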

CiteSeerX

Crossref

EUR Research Repository

Erasmus University Digital Repository

Construction of weak and strong similarity measures for ordered sets of documents using fuzzy set techniques

Author: Egghe Leo, Michel C.
Publication date: 01/01/2003

International audience

HAL Descartes

HAL

Institutional Repository Universiteit Antwerpen