28 research outputs found

    The impact of author name disambiguation on knowledge discovery from large-scale scholarly data

    In this study, I demonstrate that the choice of disambiguation methods for resolving author name ambiguity can adversely affect our understanding of scholarly collaboration patterns and coauthorship network structures extracted from large-scale scholarly data. By utilizing large-scale bibliometric data, scholars in many fields have gleaned knowledge for use in scholarly evaluation, collaborator recommendations, research policy evaluation, and network-evolution modeling. A common challenge has been that author names in bibliometric data are not properly disambiguated: authors may share the same name (i.e., different authors are sometimes misrepresented to be a single author which can lead to a “merging of identities”). In addition, one author may use name variations (i.e., an author may be represented as two or more different authors which can lead to a “splitting of identities”). When faced with these challenges, most scholars have pre-processed bibliometric data using simple heuristics (e.g., if two author names share the same surname and given name initials, they are presumed to represent the same author identity) and assumed that their findings are robust to errors due to author name ambiguity. I test this long-held assumption in bibliometrics by measuring the impact of author name ambiguity on network properties. I accomplish this under varying conditions, including network size and cumulative time window (from 1991 to 2009) using four large-scale bibliometric datasets that cover: biomedicine, computer science, psychology and neuroscience, and one nation’s entire domestic publication output. For this task, I collate the statistical properties of coauthorship networks constructed from algorithmically disambiguated data (i.e., close to clean data) against those that come from the same networks, but are compromised by misidentified authors via first-initial and all-initials disambiguation methods. 
In addition, I simulate the levels of merging and splitting incrementally using those empirical datasets. My findings show that initial-based name disambiguation methods can severely distort our understanding of given networks and that such distortion worsens over time. Moreover, the distortion sometimes leads to biased or false knowledge of coauthorship network formation and evolution mechanisms, such as preferential attachment generating the power-law distribution of vertex degree, and to false validation of theories about the choice of collaborators in scientific research. This may result in ill-informed decisions about research policy and resource allocation. Besides measuring the impact of name ambiguity on network properties, I also test how name ambiguity can be estimated using simple heuristics such as dataset size and how merged author identities can be detected via an author’s ego-network properties, in order to provide practical guidance for corrective measures. My research calls for further study of the effects of author name ambiguity on coauthorship network properties and is expected to help scholars establish better practices for knowledge discovery from large-scale scholarly data.
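
The two initial-based heuristics named in the abstract above can be illustrated with a minimal sketch. This is not the study's code: the function names (`first_initial_key`, `all_initials_key`, `group_by_key`) and the sample names are hypothetical, chosen only to show how such keys can merge distinct authors and split one author's name variants.

```python
from collections import defaultdict


def first_initial_key(surname, given_names):
    """Key = surname + first given-name initial (hypothetical helper)."""
    given = given_names.strip()
    initial = given[0].lower() if given else ""
    return (surname.strip().lower(), initial)


def all_initials_key(surname, given_names):
    """Key = surname + initials of every given-name part (hypothetical helper)."""
    parts = given_names.replace(".", " ").replace("-", " ").split()
    initials = "".join(part[0].lower() for part in parts)
    return (surname.strip().lower(), initials)


def group_by_key(names, key_fn):
    """Group (surname, given_names) pairs that the heuristic treats as one author."""
    groups = defaultdict(list)
    for surname, given in names:
        groups[key_fn(surname, given)].append((surname, given))
    return dict(groups)


# Invented sample records, for illustration only.
names = [
    ("Lee", "Minho"),
    ("Lee", "Mary H."),
    ("Lee", "M."),
    ("Lee", "Min Ho"),
]

by_first = group_by_key(names, first_initial_key)
by_all = group_by_key(names, all_initials_key)

# First-initial matching collapses every record into one identity (merging).
print(len(by_first))
# All-initials matching keeps "Mary H." with "Min Ho" (merging) while
# separating "Minho" from "Min Ho" (splitting of one possible identity).
print(len(by_all))
```

In this toy input, first-initial matching yields a single group, while all-initials matching yields two groups that still mix different people, which is the kind of merging and splitting error the abstract describes.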

    The Geography of Scientific Collaboration

    Science is increasingly defined by multidimensional collaborative networks. Despite the unprecedented growth of scientific collaboration around the globe – the collaborative turn – geography still matters for the cognitive enterprise. This book explores how geography conditions scientific collaboration and how collaboration affects the spatiality of science. This book offers a complex analysis of the spatial aspects of scientific collaboration, addressing the topic at a number of levels: individual, organizational, urban, regional, national, and international. Spatial patterns of scientific collaboration are analysed along with their determinants and consequences. By combining a vast array of approaches, concepts, and methodologies, the volume offers a comprehensive theoretical framework for the geography of scientific collaboration. The examples of scientific collaboration policy discussed in the book are taken from the European Union, the United States, and China. Through a number of case studies the authors analyse the background, development and evaluation of these policies. This book will be of interest to researchers in diverse disciplines such as regional studies, scientometrics, R&D policy, socio-economic geography and network analysis. It will also be of interest to policymakers, and to managers of research organisations

    Collaboration - changing the global landscape of science: proceedings of 10th International Conference on Webometrics, Informetrics and Scientometrics & 15th COLLNET Meeting 2014, September 3 - 5, 2014, Technische Universität Ilmenau, Germany

    The 10th WIS encourages continued investigation into the field of applied scientometrics. The broad focus of the conference is on collaboration and communication in science and technology, science policy, quantitative aspects of science, and the combination and integration of qualitative and quantitative approaches in the study of scientific practices. The conference thus aims to contribute to evidence-based and informed knowledge about scientific research and practices, which in turn may provide input to institutional, regional, national and international research and innovation policy making.

    Study on open science: The general state of the play in Open Science principles and practices at European life sciences institutes

    Open science is now a prominent topic at all levels and is one of the priorities of the European Research Area. Components commonly associated with open science are open access, open data, open methodology, open source, open peer review, open science policies and citizen science. Open science may have great potential to connect and influence the practices of researchers, funding institutions and the public. In this paper, we evaluate the level of openness based on public surveys at four European life sciences institutes.

    Topics and institutions in the reproduction of intersectional inequalities in science

    Walking the Talk: Toward a Values-Aligned Academy

    Walking the Talk: Toward a Values-Aligned Academy is the culmination of 18 months of research interviews across the Big Ten Academic Alliance (BTAA). Conducted by the HuMetricsHSS Initiative as an extension of their previous work on values-enacted scholarly practice, the interviews focused on current systems of evaluation within BTAA institutions, the potential problems and inequalities of those processes, the kinds of scholarly work that could be better recognized and rewarded, and the contexts and pressures evaluators are under, including, as the process progressed, the onset and ongoing conditions of COVID-19. The interviews focused primarily on the reappointment, promotion, and tenure (RPT) process. Interviewees outlined a number of issues to be addressed, including toxicity in evaluation, scholars’ increased alienation from the work they are passionate about, and a high-level virtue-signaling of values by institutions without the infrastructure or resources to support the enactment of those values. Based on these conversations, this white paper offers a set of recommendations for making wide-scale change to address systematic injustice, erasure, and devaluation of academic labor in order to strengthen the positive public impact of scholarship

    Systematic Analysis of the Factors Contributing to the Variation and Change of the Microbiome

    Understanding changes and trends in biomedical knowledge is crucial for individuals, groups, and institutions as biomedicine improves people’s lives, supports national economies, and facilitates innovation. However, as knowledge changes, what evidence illustrates those changes? In the case of the microbiome, a multi-dimensional concept from biomedicine, there are significant increases in publications, citations, funding, collaborations, and other explanatory variables or contextual factors. What is observed in the microbiome, as in any historical evolution of a scientific field or of scientific knowledge, is that these changes are related to changes in knowledge, but what is not understood is how to measure and track changes in knowledge. This investigation highlights how contextual factors from the language and social context of the microbiome are related to changes in the usage, meaning, and scientific knowledge of the microbiome. Two interconnected studies are presented that integrate qualitative and quantitative evidence to examine the variation and change of the microbiome. First, the concepts microbiome, metagenome, and metabolome are compared to determine the boundaries of the microbiome concept in relation to other concepts whose conceptual boundaries have been cited as overlapping. A collection of publications for each concept or corpus is presented, with a focus on how to create, collect, curate, and analyze large data collections. This study concludes with suggestions on how to analyze biomedical concepts using a hybrid approach that combines results from the larger language context and individual words. Second, the results of a systematic review that describes the variation and change of microbiome research, funding, and knowledge are examined. A corpus of approximately 28,000 articles on the microbiome is characterized, and a spectrum of microbiome interpretations is suggested based on differences related to context.
The collective results suggest the microbiome is a separate concept from the metagenome and metabolome, and the variation and change to the microbiome concept was influenced by contextual factors. These results provide insight into how concepts with extensive resources behave within biomedicine and suggest the microbiome is possibly representative of conceptual change or a preview of new dynamics within science that are expected in the future.
    Dissertation/Thesis. Doctoral Dissertation. Biology 201

    Theories of Informetrics and Scholarly Communication

    Scientometrics has become an essential element in the practice and evaluation of science and research, including both the evaluation of individuals and national assessment exercises. Yet researchers and practitioners in this field have lacked clear theories to guide their work. As early as 1981, then doctoral student Blaise Cronin published "The need for a theory of citing", a call to arms for the fledgling scientometric community to produce foundational theories upon which the work of the field could be based. More than three decades later, the time has come to reach out to the field again and ask how it has responded to this call. This book compiles the foundational theories that guide informetrics and scholarly communication research. It is a much-needed compilation by leading scholars in the field that gathers together the theories that guide our understanding of authorship, citing, and impact.