
    Peer review and citation data in predicting university rankings, a large-scale analysis

    Most Performance-based Research Funding Systems (PRFS) draw on peer review and bibliometric indicators, two different methodologies which are sometimes combined. A common argument against the use of indicators in such research evaluation exercises is their low correlation at the article level with peer review judgments. In this study, we analyse 191,000 papers from 154 higher education institutes which were peer reviewed in a national research evaluation exercise. We combine these data with 6.95 million citations to the original papers. We show that when citation-based indicators are applied at the institutional or departmental level, rather than at the level of individual papers, surprisingly large correlations with peer review judgments can be observed, up to r = 0.802 (n = 37, p < 0.001) for some disciplines. In our evaluation of ranking prediction performance based on citation data, we show we can reduce the mean rank prediction error by 25% compared to previous work. This suggests that citation-based indicators are sufficiently aligned with peer review results at the institutional level to be used to lessen the overall burden of peer review on national evaluation exercises, leading to considerable cost savings.
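
    The key move described above is aggregation before correlation: per-paper indicator scores correlate poorly with peer review, but institution-level means line up. A minimal sketch of that aggregation step in Python, assuming toy per-paper data (all institutions, scores, and counts below are illustrative, not the study's data):

    from collections import defaultdict
    from scipy.stats import pearsonr

    # (institution, peer_review_score, citation_count) per paper -- toy data.
    papers = [
        ("A", 3.2, 41), ("A", 2.8, 25), ("B", 3.9, 88),
        ("B", 3.5, 60), ("C", 2.1, 10), ("C", 2.4, 18),
    ]

    # Group per-paper values by institution.
    by_inst = defaultdict(lambda: ([], []))
    for inst, review, cites in papers:
        by_inst[inst][0].append(review)
        by_inst[inst][1].append(cites)

    # Aggregate to institution-level means before correlating, as the study does.
    insts = sorted(by_inst)
    mean_review = [sum(by_inst[i][0]) / len(by_inst[i][0]) for i in insts]
    mean_cites = [sum(by_inst[i][1]) / len(by_inst[i][1]) for i in insts]

    r, p = pearsonr(mean_review, mean_cites)
    print(f"institution-level correlation: r={r:.3f}, p={p:.3f}")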

    Microsoft Academic: is the Phoenix getting wings?

    In this article, we compare the publication and citation coverage of the new Microsoft Academic with all other major sources of bibliometric data: Google Scholar, Scopus, and the Web of Science, using a sample of 145 academics in five broad disciplinary areas: Life Sciences, Sciences, Engineering, Social Sciences, and Humanities. When using the more conservative linked citation counts for Microsoft Academic, this data source provides higher citation counts than both Scopus and the Web of Science for Engineering, the Social Sciences, and the Humanities, whereas citation counts for the Life Sciences and the Sciences are fairly similar across these three databases. Google Scholar still reports the highest citation counts for all disciplines. When using the more liberal estimated citation counts for Microsoft Academic, its average citation counts are higher than both Scopus and the Web of Science for all disciplines. For the Life Sciences, Microsoft Academic estimated citation counts are even higher than Google Scholar counts, whereas for the Sciences they are almost identical. For Engineering, Microsoft Academic estimated citation counts are 14% lower than Google Scholar citation counts, and for the Social Sciences they are 23% lower. Only for the Humanities are they substantially (69%) lower than Google Scholar citation counts. Overall, this first large-scale comparative study suggests that the new incarnation of Microsoft Academic presents an excellent alternative for citation analysis. We therefore conclude that the Microsoft Academic Phoenix is undeniably growing wings; it might be ready to fly off and start its adult life in the field of research evaluation soon.

    Microsoft Academic automatic document searches: accuracy for journal articles and suitability for citation analysis

    Microsoft Academic is a free academic search engine and citation index that is similar to Google Scholar but can be queried automatically. Its data is potentially useful for bibliometric analysis if it is possible to search effectively for individual journal articles. This article compares different methods of finding journal articles in its index by searching for a combination of title, authors, publication year, and journal name, and uses the results for the largest correlation analysis of Microsoft Academic citation counts for journal articles published so far. Based on 126,312 articles from 323 Scopus subfields in 2012, the optimal strategy for finding articles with DOIs is to search for them by title and filter out those with incorrect DOIs. This finds 90% of journal articles. For articles without DOIs, the optimal strategy is to search for them by title and then filter out matches with dissimilar metadata. This finds 89% of journal articles, with an additional 1% incorrect matches. The remaining articles seem mainly to be either not indexed by Microsoft Academic or indexed under a different-language version of their title. Across the matches, Scopus citation counts and Microsoft Academic citation counts have an average Spearman correlation of 0.95, with the lowest for any single field being 0.63. Thus, Microsoft Academic citation counts are almost universally equivalent to Scopus citation counts for articles that are not recent, although there are national biases in the results.
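
    The two filtering strategies described above can be sketched as a single accept/reject rule over title-search hits: require an exact DOI match when the known record has a DOI, and fall back to metadata similarity otherwise. The record layout and the 0.75 threshold below are illustrative assumptions, not the authors' exact implementation:

    from difflib import SequenceMatcher

    def similarity(a: str, b: str) -> float:
        """Crude string similarity in [0, 1]."""
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    def accept_match(known: dict, candidate: dict, threshold: float = 0.75) -> bool:
        """Accept a title-search hit if DOIs agree exactly; for records
        without a DOI, require similar author/journal metadata and the
        same publication year."""
        if known.get("doi"):
            return candidate.get("doi", "").lower() == known["doi"].lower()
        score = (similarity(known["authors"], candidate.get("authors", ""))
                 + similarity(known["journal"], candidate.get("journal", ""))) / 2
        return score >= threshold and known["year"] == candidate.get("year")

    # Hypothetical usage: one known record without a DOI, one search hit.
    known = {"doi": "", "authors": "Smith, J.; Lee, K.",
             "journal": "Scientometrics", "year": 2012}
    hit = {"authors": "J. Smith; K. Lee", "journal": "Scientometrics",
           "year": 2012}
    print(accept_match(known, hit))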

    Mining Academic Publications to Predict Automation

    This paper proposes a novel framework of predicting future technological change. Using abstracts of academic publications available in the Microsoft Academic graph, co-occurrence matrices are generated to indicate how often occupation and technological terms are referenced together. This matrices are used in linear regression models to predict future co-occurrence of occupations and technologies with a relatively high degree of accuracy as measured through the mean squared error of the models. While this work is unable to link the co-occurrences found in academic publications to automation in the labor force due to a dearth of automation data, future work conducted when such data is available could apply a similar approach with the aim of predicting automation from trends in academic research and publications
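
    A minimal sketch of the described pipeline, with toy abstracts and term lists standing in for the Microsoft Academic Graph data: count occupation/technology co-occurrences per time period, then regress the next period's counts on the current ones and score the model by mean squared error:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    occupations = ["driver", "cashier"]
    technologies = ["autonomous vehicle", "computer vision"]

    def cooccurrence(abstracts):
        """Count abstracts mentioning each (occupation, technology) pair."""
        m = np.zeros((len(occupations), len(technologies)))
        for text in abstracts:
            text = text.lower()
            for i, occ in enumerate(occupations):
                for j, tech in enumerate(technologies):
                    if occ in text and tech in text:
                        m[i, j] += 1
        return m

    abstracts_t0 = ["Autonomous vehicle trials and driver behaviour ...",
                    "Computer vision at the checkout: the cashier's role ..."]
    abstracts_t1 = ["Driver displacement by autonomous vehicle fleets ...",
                    "Computer vision replacing the cashier ..."]

    X = cooccurrence(abstracts_t0).reshape(-1, 1)  # period-t counts
    y = cooccurrence(abstracts_t1).ravel()         # period-t+1 counts

    model = LinearRegression().fit(X, y)
    print("MSE:", np.mean((model.predict(X) - y) ** 2))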

    The lost academic home: institutional affiliation links in Google Scholar Citations

    Purpose - Google Scholar Citations (GSC) provides an institutional affiliation link which groups together authors who belong to the same institution. The purpose of this paper is to ascertain whether this feature is able to identify and normalize all the institutions entered by the authors, and whether it is able to assign all researchers to their own institution correctly. Design/methodology/approach - Systematic queries to GSC's internal search box were performed in two different forms (institution name and institutional e-mail web domain) in September 2015. The whole Spanish academic system (82 institutions) was used as a test. Additionally, specific searches for companies (Google) and world-class universities were performed to identify and classify potential errors in the functioning of the feature. Findings - Although the affiliation tool works well for most institutions, it is unable to detect all existing institutions in the database, and it is not always able to create a unique standardized entry for each institution. Additionally, it also fails to group all the authors who belong to the same institution. A wide variety of errors have been identified and classified. Research limitations/implications - Even though the analyzed sample is good enough to answer the research questions empirically, a more comprehensive study should be performed to calibrate the real volume of the errors. Practical implications - The discovered affiliation link errors prevent institutions from accessing the profiles of all their authors through the institution lists offered by GSC. They also introduce a shortcoming in the navigation features of Google Scholar which may impair the web user experience. Social implications - Some institutions (mainly universities) are under-represented in the affiliation feature provided by GSC. This might jeopardize the visibility of institutions as well as the use of this feature in bibliometric or webometric analyses. Originality/value - This work demonstrates inconsistencies in the affiliation feature provided by GSC. A whole national university system is systematically analyzed, and several queries have been used to reveal errors in its functioning. The set of errors identified and the empirical data examined are the most exhaustive to date on this topic. Finally, recommendations are provided on how to correctly fill in the affiliation data (both for authors and institutions) and how to improve the feature.
    Orduña-Malea, E.; Ayllón, J.M.; Martín-Martín, A.; Delgado-López-Cózar, E. (2017). The lost academic home: institutional affiliation links in Google Scholar Citations. Online Information Review, 41(6), 762-781. doi:10.1108/OIR-10-2016-0302

    The coverage of Microsoft Academic: Analyzing the publication output of a university

    This is the first detailed study of the coverage of Microsoft Academic (MA). Based on the complete and verified publication list of a university, the coverage of MA was assessed and compared with two benchmark databases, Scopus and Web of Science (WoS), at the level of individual publications. Citation counts were analyzed, and issues related to data retrieval and data quality were examined. A Perl script was written to retrieve metadata from MA based on publication titles; the script is freely available on GitHub. We find that MA covers journal articles, working papers, and conference items to a substantial extent and indexes more document types than the benchmark databases (e.g., working papers, dissertations). MA clearly surpasses Scopus and WoS in covering book-related document types and conference items but falls slightly behind Scopus in journal articles. The coverage of MA is favorable for evaluative bibliometrics in most research fields, including economics/business, computer/information sciences, and mathematics. However, MA shows biases similar to Scopus and WoS with regard to the coverage of the humanities, non-English publications, and open-access publications. Rank correlations of citation counts are high between MA and the benchmark databases. We find that the publication year is correct for 89.5% of all publications and the number of authors is correct for 95.1% of the journal articles. Given the fast and ongoing development of MA, we conclude that MA is on the verge of becoming a bibliometric superpower. However, comprehensive studies on the quality of MA metadata are still lacking.
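
    The coverage assessment described above reduces to matching a verified title list against the database under test. The original study used a Perl script querying MA; the sketch below shows only the normalize-and-match step, in Python, with toy titles and assumed normalization rules:

    import re

    def norm(title: str) -> str:
        """Lowercase, replace punctuation with spaces, collapse whitespace."""
        return re.sub(r"\s+", " ", re.sub(r"[^\w\s]", " ", title.lower())).strip()

    # Verified publication list of the university (toy data).
    verified = ["The Coverage of Bibliometric Databases: A Case Study",
                "Working Papers in Economics, 2014 Edition",
                "An Unindexed Dissertation"]

    # Titles returned by the database being assessed (toy data).
    indexed = {norm(t) for t in
               ["The coverage of bibliometric databases -- a case study",
                "Working papers in economics (2014 edition)"]}

    hits = [t for t in verified if norm(t) in indexed]
    print(f"coverage: {len(hits)}/{len(verified)} = {len(hits)/len(verified):.0%}")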

    A Survey of Graph Pre-processing Methods: From Algorithmic to Hardware Perspectives

    Graph-related applications have experienced significant growth in academia and industry, driven by the powerful representational capabilities of graphs. However, efficiently executing these applications faces various challenges, such as load imbalance and random memory access. To address these challenges, researchers have proposed various acceleration systems, including software frameworks and hardware accelerators, all of which incorporate graph pre-processing (GPP). GPP serves as a preparatory step before the formal execution of applications, involving techniques such as sampling and reordering. However, GPP execution often remains overlooked, as the primary focus is directed towards enhancing graph applications themselves. This oversight is concerning, especially considering the explosive growth of real-world graph data, where GPP becomes essential and can even dominate overall system overhead. Furthermore, GPP methods vary significantly across devices and applications due to their high degree of customization. Unfortunately, no comprehensive work has systematically summarized GPP. To address this gap and foster a better understanding of GPP, we present a comprehensive survey dedicated to this area. We propose a two-level taxonomy of GPP, covering both algorithmic and hardware perspectives. By reviewing relevant works, we illustrate our taxonomy and conduct a thorough analysis and summary of diverse GPP techniques. Lastly, we discuss challenges in GPP and potential future directions.
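
    As a concrete instance of one GPP technique named above, the sketch below applies degree-based vertex reordering, a common locality-oriented heuristic that renumbers high-degree vertices first, to a toy adjacency list (the graph data is illustrative):

    # Toy undirected graph as an adjacency list.
    graph = {0: [3, 4], 1: [2, 3], 2: [1, 3, 4], 3: [0, 1, 2], 4: [0, 2]}

    # Rank vertices by descending degree and build an old-id -> new-id map.
    order = sorted(graph, key=lambda v: len(graph[v]), reverse=True)
    remap = {old: new for new, old in enumerate(order)}

    # Rewrite the graph under the new vertex numbering.
    reordered = {remap[v]: sorted(remap[u] for u in nbrs)
                 for v, nbrs in graph.items()}
    print(reordered)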

    Can Microsoft Academic help to assess the citation impact of academic books?

    Despite recent evidence that Microsoft Academic is an extensive source of citation counts for journal articles, it is not known whether the same is true for academic books. This paper fills this gap by comparing citations to 16,463 books from 2013 to 2016 in the Book Citation Index (BKCI) against automatically extracted citations from Microsoft Academic and Google Books in 17 fields. About 60% of the BKCI books had records in Microsoft Academic, varying by year and field. For books indexed by both sources, citation counts from Microsoft Academic were 1.5 to 3.6 times higher than those from BKCI in nine subject areas across all years. Microsoft Academic found more citations than BKCI because it indexes more scholarly publications and combines citations to different editions and chapters. In contrast, BKCI only found more citations than Microsoft Academic for books in three fields from 2013 to 2014. Microsoft Academic also found more citations than Google Books in six fields for all years. Thus, Microsoft Academic may be a useful source for the impact assessment of books when comprehensive coverage is not essential.
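
    The per-field comparison reported above comes down to a ratio of summed citation counts over the books indexed by both sources; a minimal sketch with toy field names and counts:

    from collections import defaultdict

    # (field, ma_citations, bkci_citations) for books found in both sources.
    books = [("History", 12, 5), ("History", 9, 3),
             ("Chemistry", 20, 14), ("Chemistry", 7, 4)]

    totals = defaultdict(lambda: [0, 0])
    for field, ma, bkci in books:
        totals[field][0] += ma
        totals[field][1] += bkci

    for field, (ma, bkci) in sorted(totals.items()):
        print(f"{field}: MA/BKCI citation ratio = {ma / bkci:.2f}")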