
6 research outputs found

Author Matching Classification with Anomaly Detection Approach for Bibliomethric Repository Data

Author: Nurmaini Siti
Rini Dian Palupi
Yamani Zaqqi
Publication venue: 'Faculty of Computer Science, Sriwijaya University'
Publication date: 01/06/2020

Author name disambiguation (AND) is a complex problem in identifying authors in a digital library (DL). AND classification depends heavily on how the data are grouped and pre-processed before they reach the classifier. Pre-processing for author matching generally relies on pairwise comparison and similarity measures. For a sufficiently large dataset, the pairwise technique used in this study combines every attribute in the AND dataset and defines a binary class for each author-matching combination: different authors are labeled 0 and the same author is labeled 1. This produces highly imbalanced data, with class 0 making up 98.9% of the data and class 1 only 1.1%. Class 1 can therefore be treated as an anomaly within the whole dataset. Accordingly, this study uses anomaly detection, with the Isolation Forest algorithm as its classifier. The results are very satisfactory in terms of accuracy, which can reach 99.5%.
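The pairwise labeling step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the records, the field layout, and the `build_pairs` helper are all hypothetical, and the Isolation Forest stage is not shown.

```python
from itertools import combinations

# Hypothetical toy records: (record_id, true_author_id, name_as_printed).
records = [
    ("r1", "A", "J. Smith"),
    ("r2", "A", "John Smith"),
    ("r3", "B", "Jane Smith"),
    ("r4", "C", "A. Lee"),
    ("r5", "D", "M. Garcia"),
]

def build_pairs(records):
    """Return one (id_a, id_b, label) row per unordered pair of records.

    label is 1 when both records belong to the same author, 0 otherwise.
    """
    pairs = []
    for (id_a, author_a, _), (id_b, author_b, _) in combinations(records, 2):
        pairs.append((id_a, id_b, 1 if author_a == author_b else 0))
    return pairs

pairs = build_pairs(records)
positives = sum(label for _, _, label in pairs)
print(f"{len(pairs)} pairs, {positives} labeled 1")
```

Because the number of pairs grows quadratically while true matches stay rare, class 1 becomes a small minority even in this toy set, which is the imbalance the abstract reports.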

ComEngApp-Journal

Computer Engineering and Applications Journal (ComEngApp, Universitas Sriwijaya)

Deep Neural Network Structure to Improve Individual Performance based Author Classification

Author: Afrina Mira
Anshori Muhammad
Firdaus Firdaus
Nurmaini Siti
Raflesia Sarifah Putri
Zarkasi Ahmad
Publication venue: 'Faculty of Computer Science, Sriwijaya University'
Publication date: 01/02/2019

This paper proposes an improved method for the author name disambiguation problem, covering both homonyms and synonyms. The prepared data consist of the distances between each pair of authors' attributes, computed with the Levenshtein distance. Using deep neural networks, we found large performance gains. The results show an accuracy of 99.6% with a small number of hidden layers.
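The distance features mentioned in this abstract can be illustrated with a short sketch. The attribute names (`name`, `affiliation`, `title`) and the `pair_features` helper are assumptions made for illustration; the abstract does not list the attributes the paper uses, and the neural network itself is not shown.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def pair_features(rec_a, rec_b, attributes=("name", "affiliation", "title")):
    """One Levenshtein distance per attribute for a pair of records."""
    return [levenshtein(rec_a[key], rec_b[key]) for key in attributes]

# Hypothetical records for illustration only.
rec_a = {"name": "J. Smith", "affiliation": "Univ A", "title": "Graph methods"}
rec_b = {"name": "John Smith", "affiliation": "Univ A", "title": "Graph method"}
print(pair_features(rec_a, rec_b))  # [3, 0, 1]
```

Each pair of records becomes one fixed-length numeric vector, which is the kind of input a classifier such as a deep neural network can consume.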

ComEngApp-Journal

Directory of Open Access Journals

Computer Engineering and Applications Journal (ComEngApp, Universitas Sriwijaya)

Lightly supervised acquisition of named entities and linguistic patterns for multilingual text mining

Author: Iglesias Maqueda Ana María
Martínez Fernández Paloma
Pablo-Sánchez César de
Segura-Bedmar Isabel
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Named Entity Recognition and Classification (NERC) is an important component of applications like Opinion Tracking, Information Extraction, or Question Answering. When these applications need to work in several languages, NERC becomes a bottleneck because its development requires language-specific tools and resources like lists of names or annotated corpora. This paper presents a lightly supervised system that acquires lists of names and linguistic patterns from large raw text collections in western languages, starting with only a few seeds per class selected by a human expert. Experiments have been carried out with English and Spanish news collections and with the Spanish Wikipedia. Evaluation of NE classification on standard datasets shows that NE lists achieve high precision and reveals that contextual patterns increase recall significantly. Therefore, it would be helpful for applications where annotated NERC data are not available, such as those that have to deal with several western languages or information from different domains. This research work has been supported by the Regional Government of Madrid under the Research Network MA2VICMR (S2009/TIC-1542), by the Spanish Ministry of Education under the project MULTIMEDICA (TIN2010-20644-C03-01) and by the Spanish Center for Industry Technological Development (CDTI, Ministry of Industry, Tourism and Trade), through the BUSCAMEDIA Project (CEN-20091026).

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Universidad Carlos III de Madrid e-Archivo

A knowledge graph embeddings based approach for author name disambiguation using literals

Author: Alam M.
Gangemi A.
Gesese G. A.
Peroni S.
Sack H.
Santini C.
Publication date: 01/01/2022

Scholarly data is growing continuously, containing information about articles from a plethora of venues including conferences, journals, etc. Many initiatives have been taken to make scholarly data available in the form of Knowledge Graphs (KGs). These efforts to standardize these data and make them accessible have also led to many challenges such as exploration of scholarly articles, ambiguous authors, etc. This study more specifically targets the problem of Author Name Disambiguation (AND) on Scholarly KGs and presents a novel framework, Literally Author Name Disambiguation (LAND), which utilizes Knowledge Graph Embeddings (KGEs) using multimodal literal information generated from these KGs. This framework is based on three components: (1) multimodal KGEs, (2) a blocking procedure, and finally, (3) hierarchical Agglomerative Clustering. Extensive experiments have been conducted on two newly created KGs: (i) a KG containing information from Scientometrics Journal from 1978 onwards (OC-782K), and (ii) a KG extracted from a well-known benchmark for AND provided by AMiner (AMiner-534K). The results show that our proposed architecture outperforms our baselines by 8–14% in terms of F1 score and shows competitive performance on a challenging benchmark such as AMiner. The code and the datasets are publicly available through Github (https://github.com/sntcristian/and-kge) and Zenodo (https://doi.org/10.5281/zenodo.6309855) respectively.
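The blocking and clustering components named in this abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the LAND implementation: the 2-D vectors stand in for knowledge graph embeddings, and the blocking key (surname plus first initial), average linkage, cosine distance, and the 0.2 threshold are all assumed for the example.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical author mentions: (mention_id, surname, first_name, embedding).
mentions = [
    ("m1", "Smith", "John", (0.9, 0.1)),
    ("m2", "Smith", "J.", (0.85, 0.15)),
    ("m3", "Smith", "Jane", (0.1, 0.9)),
    ("m4", "Lee", "Anna", (0.5, 0.5)),
]

def block_key(surname, first_name):
    """Assumed blocking rule: lowercased surname plus first initial."""
    return (surname.lower(), first_name[0].lower())

def cosine_distance(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    norms = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return 1.0 - dot / norms

def agglomerate(items, threshold):
    """Average-linkage agglomerative clustering of (id, vector) items.

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than threshold.
    """
    clusters = [[item] for item in items]

    def linkage(c1, c2):
        total = sum(cosine_distance(a[1], b[1]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(mention_id for mention_id, _ in c) for c in clusters]

def disambiguate(mentions, threshold=0.2):
    """Group mentions into blocks, then cluster within each block."""
    blocks = defaultdict(list)
    for mention_id, surname, first_name, embedding in mentions:
        blocks[block_key(surname, first_name)].append((mention_id, embedding))
    result = []
    for key in sorted(blocks):
        result.extend(agglomerate(blocks[key], threshold))
    return result

print(disambiguate(mentions))  # [['m4'], ['m1', 'm2'], ['m3']]
```

Blocking keeps the quadratic clustering step confined to mentions that could plausibly refer to the same person, so the three "Smith, J" mentions are compared with each other but never with "Lee, A".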

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

A knowledge graph embeddings based approach for author name disambiguation using literals

Author: Alam Mehwish
Gangemi Aldo
Gesese Genet Asefa
Peroni Silvio
Sack Harald
Santini Cristian
Publication venue: Springer Verlag
Publication date: 01/01/2022

arXiv.org e-Print Archive

KITopen

Archivio istituzionale della ricerca - Università di Macerata