12 research outputs found

Identification of Indonesian Authors Using Deep Neural Networks

Author: Afrina Mira
Darmawahyuni Annisa
Fachrurrozi Muhammad
Fahreza Irvan
Firdaus Firdaus
Lestari Suci Dwi
Nurmaini Siti
Putra Bayu Wijaya
Rachmatullah Muhammad Naufal
Sapitri Ade Iriani
Publication venue: 'Faculty of Computer Science, Sriwijaya University'
Publication date: 01/02/2022

Author Name Disambiguation (AND) is a problem that occurs when a set of publications contains ambiguous author names: the same author may appear under different names in different papers (synonyms), or different authors may share the same name (homonyms). In this final project, we design a model with a Deep Neural Network (DNN) classifier. The dataset is primary data sourced from the Scopus website, and the research focuses on integrating data from Indonesian authors. Accuracy, sensitivity, and precision are the standard benchmarks for measuring the performance of methods that solve AND problems. The best DNN classification model achieves 99.9936% accuracy, 93.1433% sensitivity, and 94.3733% precision. The highest performance is measured for the Non Synonym-Homonym (SH) case, with 99.9967% accuracy, 96.7388% sensitivity, and 97.5102% precision.
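
The three benchmarks named in this abstract can be computed from a confusion matrix. The following is a minimal sketch for a single binary class, assuming labels are encoded as 0 and 1; the function name `binary_metrics` and the toy labels are illustrative and do not come from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, sensitivity (recall) and precision for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: share of actual positives that were found.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Precision: share of predicted positives that were correct.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the paper.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
    print(binary_metrics(y_true, y_pred))
```

When negatives greatly outnumber positives, accuracy can sit close to 100% while sensitivity and precision remain noticeably lower, which may be one reason the abstract reports all three figures.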

ComEngApp-Journal

Computer Engineering and Applications Journal (ComEngApp, Universitas Sriwijaya)

Deep Neural Network Structure to Improve Individual Performance based Author Classification

Author: Afrina Mira
Anshori Muhammad
Firdaus Firdaus
Nurmaini Siti
Raflesia Sarifah Putri
Zarkasi Ahmad
Publication venue: 'Faculty of Computer Science, Sriwijaya University'
Publication date: 01/02/2019

This paper proposes an improved method for the author name disambiguation problem, covering both homonyms and synonyms. The input data are the distances between each pair of authors' attributes, computed with the Levenshtein distance. Using Deep Neural Networks, we found large gains in performance. The results show an accuracy of 99.6% with a low number of hidden layers.
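
The pairwise distance features described in this abstract can be sketched as follows. This is a minimal illustration, assuming each record is a dictionary of string attributes; the field names (`name`, `affiliation`) and the helper `pair_features` are hypothetical, and the paper's actual attribute set is not given here.

```python
def levenshtein(a, b):
    """Edit distance: minimum insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def pair_features(record_a, record_b, fields=("name", "affiliation")):
    """One distance per attribute; this vector would feed a classifier."""
    return [levenshtein(record_a[f], record_b[f]) for f in fields]


if __name__ == "__main__":
    # Hypothetical records for illustration only.
    r1 = {"name": "S. Nurmaini", "affiliation": "Sriwijaya University"}
    r2 = {"name": "Siti Nurmaini", "affiliation": "Universitas Sriwijaya"}
    print(pair_features(r1, r2))
```

Raw edit distances grow with string length, so a real pipeline might normalize them (for example, by the length of the longer string) before training; that choice is an assumption, not something the abstract states.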

ComEngApp-Journal

Directory of Open Access Journals

Computer Engineering and Applications Journal (ComEngApp, Universitas Sriwijaya)

Author identification in bibliographic data using deep neural networks

Author: Darmawahyuni Annisa
Firdaus Firdaus
Juliano Andre Herviant
Malik Reza Firsandaya
Nugraha Tio Artha
Nurmaini Siti
Putra Varindo Ockta Keneddi
Rachmatullah Muhammad Naufal
Publication venue: 'Universitas Ahmad Dahlan'
Publication date: 01/06/2021

Author name disambiguation (AND) is a challenging task for scholars who mine bibliographic information for scientific knowledge. A constructive approach to resolving name ambiguity is to use computer algorithms to identify author names. Computer and data scientists have developed several algorithm-based disambiguation methods. Among them, supervised machine learning has been reported to produce decent to very accurate disambiguation results. This paper presents a combination of principal component analysis (PCA) for feature reduction and deep neural networks (DNNs) as a supervised algorithm for classifying AND problems. The raw data is grouped into four classes: synonyms, homonyms, homonyms-synonyms, and non-homonyms-synonyms. We tuned several hyperparameters, such as learning rate, batch size, and the number of neurons and hidden units, and analyzed their impact on the accuracy of the results. To the best of our knowledge, there are no previous studies with such a scheme. The proposed DNNs are compared with other ML techniques, namely Naïve Bayes, random forest (RF), and support vector machine (SVM), to produce a good classifier. Across all data, our proposed DNN classifier outperforms the other ML techniques, with accuracy, precision, recall, and F1-score of 99.98%, 97.98%, 97.86%, and 99.99%, respectively. In the future, this approach can be easily extended to any dataset and any bibliographic records provider.

Journal of Education and Learning (EduLearn)

TELKOMNIKA (Telecommunication Computing Electronics and Control)

UAD Journal Management System

Harnessing Historical Corrections to build Test Collections for Named Entity Disambiguation

Author: AA Ferreira
AF Santana
C Sun
D Shin
F Momeni
F Reitz
I Kang
M Levin
M Müller
X Fan
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/08/2018

Matching mentions of persons to the actual persons (the name disambiguation problem) is central for several digital library applications. Scientists have been working on algorithms to create this matching for decades without finding a universal solution. One problem is that test collections for this problem are often small and specific to a certain collection. In this work, we present an approach that can create large test collections from historical metadata with minimal extra cost. We apply this approach to the DBLP collection to generate two freely available test collections. One collection focuses on the properties of defects and one on the evaluation of disambiguation algorithms. Comment: Preprint of a paper accepted at TPDL 201

arXiv.org e-Print Archive

Crossref

Health warning: might contain multiple personalities - the problem of homonyms in Thomson Reuters Essential Science Indicators

Author: A Strotmann
Anne-Wil Harzing
AW Harzing
D Shin
H Wu
J Qiu
J Zhu
S Heeffer
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/08/2015

Author name ambiguity is a crucial problem in any type of bibliometric analysis. It arises when several authors share the same name, but also when one author expresses their name in different ways. This article focuses on the former, also called the “namesake” problem. In particular, we assess the extent to which this compromises the Thomson Reuters Essential Science Indicators (ESI) ranking of the top 1% most cited authors worldwide. We show that three demographic characteristics that should be unrelated to research productivity – name origin, uniqueness of one’s family name, and the number of initials used in publishing – in fact have a very strong influence on it. In contrast to what could be expected from Web of Science publication data, researchers with Asian names – and in particular Chinese and Korean names – appear to be far more productive than researchers with Western names. Furthermore, for any country, academics with common names and fewer initials also appear to be more productive than their more uniquely named counterparts. However, this appearance of high productivity is caused purely by the fact that these “academic superstars” are in fact composites of many individual academics with the same name. We thus argue that it is high time that Thomson Reuters starts taking name disambiguation in general, and non-Anglophone names in particular, more seriously
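
The "namesake" effect described in this abstract can be illustrated with a small sketch: when names are reduced to a family name plus a limited number of initials, distinct people collapse into one apparent author. The names below are invented, the helper names are hypothetical, and the code assumes the family name is written last, which does not hold for every naming convention.

```python
from collections import defaultdict


def name_key(full_name, max_initials=1):
    """Reduce a name to 'surname initials', assuming the surname comes last."""
    parts = full_name.split()
    surname = parts[-1].lower()
    initials = "".join(p[0].lower() for p in parts[:-1])[:max_initials]
    return f"{surname} {initials}".strip()


def group_by_key(names, max_initials=1):
    """Map each reduced key to the set of full names that share it."""
    groups = defaultdict(set)
    for name in names:
        groups[name_key(name, max_initials)].add(name)
    return dict(groups)


if __name__ == "__main__":
    # Invented names for illustration only.
    names = ["Jing Wang", "Jun Wang", "Jia Li Wang", "Maria Keller"]
    for limit in (1, 2):
        print(limit, group_by_key(names, max_initials=limit))
```

With one initial, the three invented Wang entries share a single key; allowing a second initial separates one of them. This mirrors the abstract's observation that common family names and fewer initials inflate apparent productivity.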

Crossref

Middlesex University Research Repository