Citation models and research evaluation
Citations in science are studied from several perspectives. Approaches such as
scientometrics and the science of science take a more quantitative perspective.
In this chapter I briefly review some of the literature on citations, citation
distributions and models of citations. Citations also feature prominently in
another part of the literature, which deals with research evaluation and the
role of metrics and indicators in that process. Here I briefly review part of
the discussion in research evaluation, which also touches on how citations
relate to peer review. Finally, I try to integrate the two literatures, with the
aim of clarifying what I believe the two can learn from each other. The fundamental
problem in research evaluation is that research quality is unobservable. This
has consequences for conclusions that we can draw from quantitative studies of
citations and citation models. The term "indicators" is a relevant concept in
this context, which I try to clarify. Causality is important for properly
understanding indicators, especially when indicators are used in practice: when
we act on indicators, we enter causal territory. Even when an indicator might
have been valid, through its very use, the consequences of its use may
invalidate it. By combining citation models with proper causal reasoning and
acknowledging the fundamental problem about unobservable research quality, we
may hope to make progress.
Comment: This is a draft. The final version will be available in Handbook of
Computational Social Science edited by Taha Yasseri, forthcoming 2023, Edward
Elgar Publishing Ltd
Inferring the causal effect of journals on citations
Articles in high-impact journals are, on average, more frequently cited. But
are they cited more often because those articles are somehow more "citable"? Or
are they cited more often simply because they are published in a high-impact
journal? Although some evidence suggests the latter, the causal relationship is
not clear. We here compare citations of preprints to citations of the published
version to uncover the causal mechanism. We build on an earlier model of
citation dynamics to infer the causal effect of journals on citations. We find
that high-impact journals select articles that tend to attract more citations.
At the same time, we find that high-impact journals augment the citation rate
of published articles. Our results yield a deeper understanding of the role of
journals in the research system. The use of journal metrics in research
evaluation has been increasingly criticized in recent years and article-level
citations are sometimes suggested as an alternative. Our results show that
removing impact factors from evaluation does not negate the influence of
journals. This insight has important implications for changing practices of
research evaluation.
Significant Scales in Community Structure
Many complex networks show signs of modular structure, uncovered by community
detection. Although many methods succeed in revealing various partitions, it
remains difficult to detect at what scale some partition is significant. This
problem shows foremost in multi-resolution methods. We here introduce an
efficient method for scanning for resolutions in one such method. Additionally,
we introduce the notion of "significance" of a partition, based on subgraph
probabilities. Significance is independent of the exact method used, so could
also be applied in other methods, and can be interpreted as the gain in
encoding a graph by making use of a partition. Using significance, we can
determine "good" resolution parameters, which we demonstrate on benchmark
networks. Moreover, optimizing significance itself also shows excellent
performance. We demonstrate our method on voting data from the European
Parliament. Our analysis suggests the European Parliament has become
increasingly ideologically divided and that nationality plays no role.
Comment: To appear in Scientific Reports
Detecting communities using asymptotical Surprise
Nodes in real-world networks are repeatedly observed to form dense clusters,
often referred to as communities. Methods to detect these groups of nodes
usually maximize an objective function, which implicitly contains the
definition of a community. We here analyze a recently proposed measure called
surprise, which assesses the quality of the partition of a network into
communities. In its current form, the formulation of surprise is rather
difficult to analyze. We here therefore develop an accurate asymptotic
approximation. This allows for the development of an efficient algorithm for
optimizing surprise. Incidentally, this leads to a straightforward extension of
surprise to weighted graphs. Additionally, the approximation makes it possible
to analyze surprise more closely and compare it to other methods, especially
modularity. We show that surprise is (nearly) unaffected by the well-known
resolution limit, a particular problem for modularity. However, surprise may
tend to overestimate the number of communities, whereas they may be
underestimated by modularity. In short, surprise works well in the limit of
many small communities, whereas modularity works better in the limit of few
large communities. In this sense, surprise is more discriminative than
modularity, and may find communities where modularity fails to discern any
structure.
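A minimal Python sketch of the kind of approximation described above. It assumes (this formula is not stated in the abstract) that asymptotic surprise can be written as m times the binary Kullback-Leibler divergence between q, the observed fraction of edges inside communities, and the expected fraction under a uniform random graph. The ring-of-cliques test graph and all function names are illustrative choices, not taken from the paper.

```python
from itertools import combinations
from math import comb, log


def ring_of_cliques(num_cliques, clique_size):
    """Toy benchmark: cliques joined in a ring by single edges."""
    edges = []
    for c in range(num_cliques):
        first = c * clique_size
        edges.extend(combinations(range(first, first + clique_size), 2))
        # One edge from the last node of this clique to the next clique.
        next_first = ((c + 1) % num_cliques) * clique_size
        edges.append((first + clique_size - 1, next_first))
    return edges


def binary_kl(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p), with 0*log(0) = 0."""
    total = 0.0
    if q > 0:
        total += q * log(q / p)
    if q < 1:
        total += (1 - q) * log((1 - q) / (1 - p))
    return total


def asymptotic_surprise(edges, membership):
    """Assumed form: m * D(q || <q>); membership[i] is the community of node i."""
    n = len(membership)
    m = len(edges)
    internal = sum(1 for u, v in edges if membership[u] == membership[v])
    sizes = {}
    for community in membership:
        sizes[community] = sizes.get(community, 0) + 1
    # Number of node pairs that fall inside some community.
    possible_internal = sum(comb(size, 2) for size in sizes.values())
    q = internal / m
    expected_q = possible_internal / comb(n, 2)
    return m * binary_kl(q, expected_q)


edges = ring_of_cliques(30, 5)
singles = [node // 5 for node in range(150)]   # one community per clique
pairs = [node // 10 for node in range(150)]    # adjacent cliques merged
print(asymptotic_surprise(edges, singles), asymptotic_surprise(edges, pairs))
```

On this toy graph the one-community-per-clique partition scores higher than the merged partition, which is in line with the claim that surprise favours many small communities.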
Narrow scope for resolution-limit-free community detection
Detecting communities in large networks has drawn much attention over the
years. While modularity remains one of the more popular methods of community
detection, the so-called resolution limit remains a significant drawback. To
overcome this issue, it was recently suggested that instead of comparing the
network to a random null model, as is done in modularity, it should be compared
to a constant factor. However, it is unclear what is meant exactly by
"resolution-limit-free", that is, not suffering from the resolution limit.
Furthermore, the question remains what other methods could be classified as
resolution-limit-free. In this paper we suggest a rigorous definition and
derive some basic properties of resolution-limit-free methods. More
importantly, we are able to prove exactly which class of community detection
methods are resolution-limit-free. Furthermore, we analyze which methods are
not resolution-limit-free, suggesting there is only a limited scope for
resolution-limit-free community detection methods. Finally, we provide such a
natural formulation, and show it performs superbly.
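The resolution limit can be illustrated with a small Python sketch on a ring of cliques. It compares standard modularity with a quality function that compares each community to a constant factor gamma instead of a random null model. The pair-counting form used for the constant-factor quality, the value gamma = 0.5 and all names are assumptions made for this illustration, not the paper's exact formulation.

```python
from itertools import combinations
from math import comb


def ring_of_cliques(num_cliques, clique_size):
    """Toy benchmark: cliques joined in a ring by single edges."""
    edges = []
    for c in range(num_cliques):
        first = c * clique_size
        edges.extend(combinations(range(first, first + clique_size), 2))
        next_first = ((c + 1) % num_cliques) * clique_size
        edges.append((first + clique_size - 1, next_first))
    return edges


def modularity(edges, membership):
    """Q = sum_c [ e_c / m - (K_c / 2m)^2 ] for an unweighted graph."""
    m = len(edges)
    internal, degree = {}, {}
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
               for c in degree)


def constant_factor_quality(edges, membership, gamma):
    """Assumed form: sum_c [ e_c - gamma * (n_c choose 2) ]."""
    internal, sizes = {}, {}
    for community in membership:
        sizes[community] = sizes.get(community, 0) + 1
    for u, v in edges:
        if membership[u] == membership[v]:
            c = membership[u]
            internal[c] = internal.get(c, 0) + 1
    return sum(internal.get(c, 0) - gamma * comb(n_c, 2)
               for c, n_c in sizes.items())


edges = ring_of_cliques(30, 5)
singles = [node // 5 for node in range(150)]   # one community per clique
pairs = [node // 10 for node in range(150)]    # adjacent cliques merged

print("modularity:", modularity(edges, singles), modularity(edges, pairs))
print("constant factor:",
      constant_factor_quality(edges, singles, 0.5),
      constant_factor_quality(edges, pairs, 0.5))
```

In this example modularity rates the merged partition above the natural one, while the constant-factor quality prefers one community per clique. For the constant-factor quality, whether two cliques merge depends only on the edges between them and on gamma, not on the size of the rest of the network.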
Metrics and peer review agreement at the institutional level
In the past decades, many countries have started to fund academic
institutions based on the evaluation of their scientific performance. In this
context, peer review is often used to assess scientific performance.
Bibliometric indicators have been suggested as an alternative. A recurrent
question in this context is whether peer review and metrics tend to yield
similar outcomes. In this paper, we study the agreement between bibliometric
indicators and peer review at the institutional level. We also quantify the
internal agreement of peer review at the institutional level. We
find that the level of agreement is generally higher at the institutional level
than at the publication level. Overall, the agreement between metrics and peer
review is on par with the internal agreement between two reviewers for certain
fields of science. This suggests that for some fields, bibliometric indicators
may be considered as an alternative to peer review for national
research assessment exercises.
Community detection in networks with positive and negative links
Detecting communities in complex networks accurately is a prime challenge,
preceding further analyses of network characteristics and dynamics. Until now,
community detection took into account only positively valued links, while many
actual networks also feature negative links. We extend an existing Potts model
to incorporate negative links as well, resulting in a method similar to the
clustering of signed graphs, as dealt with in social balance theory, but more
general. To illustrate our method, we applied it to a network of international
alliances and disputes. Using data from 1993-2001, it turns out that the world
can be divided into six power blocs similar to Huntington's civilizations, with
some notable exceptions.
Comment: 7 pages, 2 figures. Revised version
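As a point of reference for the clustering of signed graphs mentioned above, the sketch below counts "frustrated" links: negative links inside a community and positive links between communities. This is the simple social-balance criterion, not the extended Potts model of the paper, and the four-country network is a made-up toy example.

```python
def frustration(signed_edges, membership):
    """Count links that contradict the partition.

    signed_edges: iterable of (u, v, sign) with sign +1 (alliance) or -1 (dispute).
    membership: dict mapping each node to its community label.
    """
    count = 0
    for u, v, sign in signed_edges:
        same_community = membership[u] == membership[v]
        if (sign < 0 and same_community) or (sign > 0 and not same_community):
            count += 1
    return count


# Hypothetical toy network: two alliances, three disputes across them.
toy_edges = [
    ("A", "B", +1),
    ("C", "D", +1),
    ("A", "C", -1),
    ("B", "D", -1),
    ("A", "D", -1),
]
two_blocs = {"A": 0, "B": 0, "C": 1, "D": 1}
one_bloc = {"A": 0, "B": 0, "C": 0, "D": 0}

print(frustration(toy_edges, two_blocs), frustration(toy_edges, one_bloc))
```

A partition with lower frustration fits the signed links better. Unlike this plain count, a Potts-type model can also weigh links against a null model.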
Router-level community structure of the Internet Autonomous Systems
The Internet is composed of routing devices connected to one another and
organized into independent administrative entities: the Autonomous Systems. The
existence of different types of Autonomous Systems (like large connectivity
providers, Internet Service Providers or universities), together with
geographical and economic constraints, turns the Internet into a complex
modular and hierarchical network. This organization is reflected in many
properties of the Internet topology, like its high degree of clustering and its
robustness.
In this work, we study the modular structure of the Internet router-level
graph in order to assess to what extent the Autonomous Systems satisfy some of
the known notions of community structure. We show that the modular structure of
the Internet is much richer than what can be captured by the current community
detection methods, which are severely affected by resolution limits and by the
heterogeneity of the Autonomous Systems. Here we overcome this issue by using a
multiresolution detection algorithm combined with a small sample of nodes. We
also discuss recent work on community structure in the light of our results.
Perspectives on scientific error
Theoretical arguments and empirical investigations indicate that a high proportion of published findings do not replicate and are likely false. The current position paper provides a broad perspective on scientific error, which may lead to replication failures. This broad perspective focuses on reform history and on opportunities for future reform. We organize our perspective along four main themes: institutional reform, methodological reform, statistical reform and publishing reform. For each theme, we illustrate potential errors by narrating the story of a fictional researcher during the research cycle. We discuss future opportunities for reform. The resulting agenda provides a resource to usher in an era that is marked by a research culture that is less error-prone and a scientific publication landscape with fewer spurious findings.