16 research outputs found

    Words by the tail: assessing lexical diversity in scholarly titles using frequency-rank distribution tail fits

    This research assesses the evolution of lexical diversity in scholarly titles using a new indicator based on Zipfian frequency-rank distribution tail fits. At the operational level, while both head and tail fits of Zipfian word distributions are more independent of corpus size than other lexical diversity indicators, the latter clearly outperforms the former in that regard. This benchmark-setting performance of Zipfian distribution tails proves extremely handy in distinguishing actual patterns in lexical diversity from the statistical noise that corpus size fluctuations generate in other indicators. From an empirical perspective, analysis of Web of Science (WoS) article titles from 1975 to 2014 shows that the lexical concentration of scholarly titles in Natural Sciences & Engineering (NSE) and Social Sciences & Humanities (SSH) articles increases by a little less than 8% over the whole period. With the exception of the already lexically concentrated Mathematics, Earth & Space, and Physics, NSE article titles all increased in lexical concentration, suggesting a probable convergence of concentration levels in the near future. As regards SSH disciplines, aggregation effects observed at the disciplinary group level suggest that, behind the stable concentration levels of SSH disciplines, a cross-disciplinary homogenization of the highest word frequency ranks may be at work. Overall, these trends suggest a progressive standardization of wording in scientific article titles, as titles come to be written using an increasingly restricted and cross-disciplinary set of words.
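    The tail-fit indicator can be illustrated with a minimal sketch: fit a power-law exponent to the low-frequency tail of a word frequency-rank distribution by least squares in log-log space. The toy corpus and the 50% tail cutoff are assumptions for illustration; the paper's actual fitting procedure may differ.

    ```python
    from collections import Counter
    import math

    def zipf_tail_exponent(tokens, tail_start=0.5):
        """Least-squares slope of the log-log frequency-rank curve,
        restricted to the low-frequency tail of the distribution."""
        freqs = sorted(Counter(tokens).values(), reverse=True)
        start = int(len(freqs) * tail_start)          # tail = last 50% of ranks
        pts = [(math.log(rank + 1), math.log(f))
               for rank, f in enumerate(freqs) if rank >= start]
        if len(pts) < 2:
            raise ValueError("tail too short to fit")
        mx = sum(x for x, _ in pts) / len(pts)
        my = sum(y for _, y in pts) / len(pts)
        slope = (sum((x - mx) * (y - my) for x, y in pts)
                 / sum((x - mx) ** 2 for x, _ in pts))
        return -slope   # larger exponent = faster decay = higher concentration
    ```

    A corpus whose titles reuse a small shared vocabulary yields a steeper (larger) tail exponent; tracking this value over time is the kind of trend the paper measures.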

    Scientific structures in context: identification and use of structures, context, and new developments in science

    The use and visualisation of structures in science (sets of related publications, authors, or words) are investigated in a number of applications. We hold that the common ground of a field can explain the use and applicability of these structures.

    Modern Problems of Scientometric Assessment of Publication Activity

    As is known, an objective assessment of scientific activity is one of the most difficult problems, both within the scientific community and in its relationship with society. For many decades, the significance of scientists’ contributions to the development of the corresponding branch of science was assessed by the scientific community only by meaningful qualitative criteria, and the principle and mechanism of such an assessment were in fact intuitive and defied quantitative description. That is why the urgent task arose of creating a system for evaluating scientific activity based on objective indicators of the activity of a particular scientist; in the search for such criteria, the term “citation index” appeared in the 1970s–1980s. Although close examination of this indicator revealed its limitations, and in a number of cases even its inadequacy for assessing scientific activity, it has nevertheless gained very wide popularity in the scientific community since the 1990s. This has contributed to the emergence of numerous works aimed at finding new and ideal indicators for assessing publication activity (so-called bibliometric indices). To date, several dozen such indices have been proposed, the most significant of which is the so-called Hirsch index, or h-index. Nevertheless, despite the significant advances in this specific area of sociology, the above problem is still far from resolved. In this regard, the key task of this Special Issue is to familiarize its readers with the latest achievements both in the search for new, more advanced bibliometric indicators and in the improvement of existing ones.
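    The h-index mentioned above has a simple operational definition: a scholar has index h if h of their papers have at least h citations each. A minimal sketch, using toy citation counts rather than data from this Special Issue:

    ```python
    def h_index(citations):
        """Largest h such that at least h papers have h or more citations each."""
        ranked = sorted(citations, reverse=True)
        h = 0
        for i, c in enumerate(ranked, start=1):
            if c >= i:
                h = i      # the i most-cited papers all have >= i citations
            else:
                break
        return h
    ```

    Note how insensitive the index is to outliers: a single paper with 25 citations raises h no more than one with 5, which is one of the limitations the literature discusses.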

    The Intellectual Organisation of History

    A tradition of scholarship discusses the characteristics of different areas of knowledge, in particular after modern academia compartmentalized them into disciplines. The academic approach is often called into question: are there two or more cultures? Is an ever-increasing specialization the only way to cope with information abundance, or are holistic approaches helpful too? What is happening with the digital turn? If these questions are well studied for the sciences, our understanding of how the humanities might differ in their own respect is far less advanced. In particular, modern academia might foster specific patterns of specialization in the humanities. Eventually, the recent rise in the application of digital methods to research, known as the digital humanities, might be introducing structural adaptations through the development of shared research technologies and the advent of organizational practices such as the laboratory. It therefore seems timely and urgent to map the intellectual organization of the humanities. This investigation depends on a few traits, such as the level of codification, the degree of agreement among scholars, and the level of coordination of their efforts. These characteristics can be studied by measuring their influence on the outcomes of scientific communication. In particular, this thesis focuses on history as a discipline, using bibliometric methods. In order to explore history in its complexity, an approach to creating collaborative citation indexes in the humanities is proposed, resulting in a new dataset comprising monographs, journal articles and citations to primary sources. Historians' publications were found to organize thematically and chronologically, sharing a limited set of core sources across small communities. Core sources act in two ways with respect to the intellectual organization: locally, by adding connectivity within communities, or globally as weak ties across communities.
Over recent decades, fragmentation has been on the rise in the intellectual networks of historians, and a comparison across a variety of specialisms from the human, natural and mathematical sciences revealed the fragility of such networks across the axes of citation and textual similarities. Humanists organize into more, smaller and more scattered topical communities than scientists. A characterisation of history is eventually proposed. Historians produce new historiographical knowledge with a focus on evidence or interpretation. The former aims at providing the community with an agreed-upon factual resource. Interpretive work is instead mainly focused on creating novel perspectives. A second axis refers to two modes of exploration of new ideas: in-breadth, where novelty relates to adding new, previously unknown pieces to the mosaic, or in-depth, where novelty happens by improving on previous results. While all combinations are possible, historians tend to focus on in-breadth interpretations, with the immediate consequence that growth accentuates intellectual fragmentation in the absence of further consolidating factors such as theory or technologies. Research on evidence might have a different impact by potentially scaling up in the digital space, and in so doing influence the modes of interpretation in turn. This process is not dissimilar to the gradual rise in importance of research technologies and collaborative competition in the mathematical and natural sciences. This is perhaps the promise of the digital humanities.

    Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

    Nowadays, open science is a hot topic at all levels and is one of the priorities of the European Research Area. Components commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies, and citizen science. Open science may have great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institutes.

    Machine-actionable assessment of research data products

    Research data management is a relevant topic for academic research, which is why many concepts and technologies have emerged to face the challenges involved, such as data growth, reproducibility, or the heterogeneity of tools, services, and standards. The basic concept of research data management is the research data product; it has three dimensions: the data, the metadata describing them, and the services providing both. Traditionally, the assessment of a research data product has been carried out either manually, via peer review by human experts, or automatically, by counting certain events. We present a novel mechanism to assess research data products. The current state of the art in machine-actionable assessment of research data products is based on the assumption that a product's quality, impact, or relevance is linked to the likelihood that peers or others will interact with it: event-based metrics include counting citations, social media interactions, or usage statistics. The shortcomings of event-based metrics are systematically discussed in this thesis; they include dependence on the date of publication and the impact of social effects. In contrast to event-based metrics, benchmarks for research data products simulate technical interactions with a research data product and check its compliance with best practices. Benchmarks operate on the assumption that the effort invested in producing a research data product increases the chances that its quality, impact, or relevance are high. This idea is translated into a software architecture and a step-by-step approach to creating benchmarks based on it. As a proof of concept, we use a prototypical benchmark on more than 795,000 research data products deposited in the Zenodo repository to showcase its effectiveness, even at that scale.
A comparison of the benchmark’s scores with event-based metrics indicates that benchmarks have the potential to complement event-based metrics and that both weakly correlate under certain circumstances. These findings provide the methodological basis for a new tool to answer scientometric questions and to support decision-making in the distribution of sparse resources. Future research can further explore those aspects of benchmarks that help improve the reproducibility of scientific findings.
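    The benchmark idea can be sketched as a set of automated best-practice checks that run the moment a product is deposited. The checks and field names below are illustrative assumptions, not the thesis's actual benchmark suite or Zenodo's metadata schema:

    ```python
    def benchmark_score(record):
        """Simulate technical interactions with a research data product and
        score its compliance with best practices (fraction of checks passed).
        Field names are hypothetical examples of such checks."""
        checks = [
            ("has_license",        bool(record.get("license"))),
            ("has_persistent_id",  bool(record.get("doi"))),
            ("described_in_depth", len(record.get("description", "")) >= 100),
            ("open_format",        record.get("format") in {"csv", "json", "xml"}),
        ]
        return sum(passed for _, passed in checks) / len(checks)
    ```

    Unlike a citation count, such a score is available on the day of deposit and does not depend on the social effects the thesis identifies as shortcomings of event-based metrics.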

    Computational Interdisciplinarity: A Study in the History of Science

    This dissertation develops a pluralistic approach to understanding and measuring interdisciplinarity at various scales, to further the study of the evolution of knowledge and innovation. Interdisciplinarity is considered an important research component and is closely linked to higher rates of innovation. If the goal is to create more innovative research, we must understand how interdisciplinarity operates. I begin by examining interdisciplinarity at a small scope, the research university. This study uses metadata to create co-authorship networks and to examine how a change in university policies to increase interdisciplinarity can be successful. The New American University Initiative (NAUI) at Arizona State University (ASU) set forth the goal of making ASU a world hub for interdisciplinary research. This kind of interdisciplinarity is produced by a deliberate, engineered reorganization of the individuals within the university and the knowledge they contain. Using a set of social network analysis measurements, I created an algorithm to measure the changes to the co-authorship networks that resulted from increased university support for interdisciplinary research. The second case study increases the scope of interdisciplinarity from individual universities to a single scientific discourse, the Anthropocene. The idea of the Anthropocene began as a proposal for a new geological epoch and underwent unsupervised interdisciplinary expansion as climate change integrated itself into the core of the discourse. In contrast to the NAUI, which was specifically engineered to increase interdisciplinarity, I use keyword co-occurrence networks to measure how the Anthropocene discourse increased its interdisciplinarity through unsupervised expansion after climate change became a core keyword within the network and an anchor point for new disciplines to connect and join the discourse.
The scope of interdisciplinarity increases again with the final case study, about the field of evolutionary medicine. Evolutionary medicine is a case of engineered interdisciplinary integration between evolutionary biology and medicine. The primary goal of evolutionary medicine is to better understand "why we get sick" through the lens of evolutionary biology. This makes it an excellent candidate for understanding large-scale interdisciplinarity. I show through multiple types of networks and metadata analyses that evolutionary medicine successfully integrates the concepts of evolutionary biology into medicine. By increasing our knowledge of interdisciplinarity at various scales and of how it behaves under different initial conditions, we are better able to understand the elusive nature of innovation. Interdisciplinarity can mean different things depending on how it is defined. I show that a pluralistic approach to defining and measuring interdisciplinarity is not only appropriate but necessary if our goal is to increase interdisciplinarity, the frequency of innovations, and our understanding of the evolution of knowledge.
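    A keyword co-occurrence network of the kind used for the Anthropocene case study can be sketched in a few lines. The toy keyword lists and the degree-based anchor measure are illustrative assumptions, not the dissertation's data or algorithm:

    ```python
    from collections import Counter
    from itertools import combinations

    def cooccurrence_edges(papers):
        """Weighted edges between keywords that appear in the same paper."""
        edges = Counter()
        for keywords in papers:
            for a, b in combinations(sorted(set(keywords)), 2):
                edges[(a, b)] += 1
        return edges

    def anchor_degree(edges, anchor):
        """Count of distinct keywords linked to an anchor term -- a rough
        proxy for its role as a point where new disciplines join a discourse."""
        return sum(1 for pair in edges if anchor in pair)
    ```

    Tracking an anchor term's degree over successive time slices of the corpus would show the kind of unsupervised expansion described above.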

    Unleashing the power of semantic text analysis: a complex systems approach

    In the present information era, a huge amount of machine-readable data is available regarding scientific publications. This unprecedented wealth of data offers the opportunity to investigate science itself as a complex interacting system by means of quantitative approaches. These kinds of studies have the potential to provide new insights into the large-scale organization of science and the driving mechanisms underlying its evolution. A particularly important aspect of these data is the semantic information present within publications, as it grants access to the concepts scientists use to describe their findings. Nevertheless, the presence of so-called buzzwords, i.e., terms that are not specific and are used indistinctly in many contexts, hinders the emergence of the thematic organization of scientific articles. In this thesis, I present my original contribution to the problem of leveraging the semantic information contained in a corpus of documents. Specifically, I have developed an information-theoretic measure, based on the maximum entropy principle, to quantify the information content of scientific concepts. This measure provides an objective and powerful way to identify generic concepts that act as buzzwords and increase the noise present in the semantic similarity between articles. I prove that the removal of generic concepts is beneficial in terms of the sparsity of the similarity network, thus allowing the detection of communities of articles that are related to more specific themes. The same effect is observed when describing the corpus of articles in terms of topics, namely clusters of concepts that compose the papers as a mixture. Moreover, I applied the method to a collection of web documents, obtaining a similar effect despite their differences from scientific articles.
Regarding scientific knowledge, another important aspect I examine is the temporal evolution of concept generality, as it may reveal typical patterns in the evolution of concepts that highlight the way they are consumed over time.
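    One simple way to make the notion of a "generic concept" operational is to score how evenly a concept spreads across documents. The normalized-entropy measure below is a simplified stand-in for the thesis's maximum-entropy-based measure, used here only to illustrate the idea:

    ```python
    import math

    def concept_generality(doc_counts):
        """Normalized Shannon entropy of a concept's occurrences across
        documents: buzzwords spread evenly (score near 1), while specific
        terms concentrate in few documents (score near 0)."""
        total = sum(doc_counts)
        probs = [c / total for c in doc_counts if c > 0]
        if len(probs) <= 1:
            return 0.0                      # appears in one document at most
        h = -sum(p * math.log(p) for p in probs)
        return h / math.log(len(probs))     # divide by maximum entropy
    ```

    Pruning concepts whose score exceeds a threshold before computing article-article similarity sparsifies the network, which is the effect the thesis reports.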

    The Evolutionary Dynamics of Discursive Knowledge

    This open access book addresses three themes which have been central to Leydesdorff's research: (1) the dynamics of science, technology, and innovation; (2) the scientometric operationalization of these concepts; and (3) the elaboration in terms of a Triple Helix of university-industry-government relations. In this study, I discuss the relations among these themes. Using Luhmann's social-systems theory for modelling meaning processing and Shannon's theory for information processing, I show that synergy can add new options to an innovation system as redundancy. The capacity to develop new options is more important for innovation than past performance. Entertaining a model of possible future states makes a knowledge-based system increasingly anticipatory. The trade-off between the incursion of future states and historical developments can be measured using the Triple-Helix synergy indicator. This is shown, for example, for the Italian national and regional systems of innovation.
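    The Triple-Helix synergy indicator is mutual information in three dimensions over university-industry-government categories. A minimal sketch over toy category triples (the book computes this from large publication and firm datasets; the helper names are mine):

    ```python
    from collections import Counter
    import math

    def entropy(counts):
        """Shannon entropy in bits of a frequency distribution."""
        total = sum(counts)
        return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

    def triple_helix_synergy(events):
        """T = H_u + H_i + H_g - H_ui - H_ug - H_ig + H_uig, computed over
        (university, industry, government) category triples."""
        def H(project):
            return entropy(Counter(project(e) for e in events).values())
        return (H(lambda e: e[0]) + H(lambda e: e[1]) + H(lambda e: e[2])
                - H(lambda e: (e[0], e[1])) - H(lambda e: (e[0], e[2]))
                - H(lambda e: (e[1], e[2])) + H(lambda e: e))
    ```

    Three identical binary variables give T = +1 bit, while a XOR-like configuration, in which any two variables jointly determine the third, gives T = -1 bit; in the Triple-Helix literature, negative T is read as synergy at the system level.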

    The Psychology of Fake News

    This volume examines the phenomenon of fake news by bringing together leading experts from different fields within psychology and related areas, and explores what has become a prominent feature of public discourse since the first Brexit referendum and the 2016 US election campaign. Dealing with misinformation is important in many areas of daily life, including politics, the marketplace, health communication, journalism, education, and science. In a general climate where facts and misinformation blur, and are intentionally blurred, this book asks what determines whether people accept and share (mis)information, and what can be done to counter it. Both questions need to be understood in the context of online social networks, which have fundamentally changed the way information is produced, consumed, and transmitted. The contributions within this volume summarize the most up-to-date empirical findings, theories, and applications, discuss cutting-edge ideas and future directions for interventions to counter fake news, and provide guidance on how to handle misinformation in an age of “alternative facts”. This is fascinating and vital reading for students and academics in psychology, communication, and political science, and for professionals including policy makers and journalists.