5,317 research outputs found

    Information retrieval and machine learning methods for academic expert finding

    In the context of academic expert finding, this paper investigates and compares the performance of information retrieval (IR) and machine learning (ML) methods, including deep learning, to approach the problem of identifying academic figures who are experts in different domains when a potential user requests their expertise. IR-based methods construct multifaceted textual profiles for each expert by clustering information from their scientific publications. Several methods fully tailored for this problem are presented in this paper. In contrast, ML-based methods treat expert finding as a classification task, training automatic text classifiers using publications authored by experts. By comparing these approaches, we contribute to a deeper understanding of academic-expert-finding techniques and their applicability in knowledge discovery. These methods are tested with two large datasets from the biomedical field: PMSC-UGR and CORD-19. The results show how IR techniques were, in general, more robust with both datasets and more suitable than the ML-based ones, with some exceptions showing good performance.
    Funding: Agencia Estatal de Investigación | Ref. PID2019-106758GB-C31; Agencia Estatal de Investigación | Ref. PID2020-113230RB-C22; FEDER/Junta de Andalucía | Ref. A-TIC-146-UGR2
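
The IR-based idea in this abstract can be illustrated with a minimal sketch: each expert's publications are merged into one textual profile, and experts are ranked by TF-IDF cosine similarity to a query. This is a simplification under stated assumptions, not the paper's method: it builds a single profile per expert rather than multifaceted, clustered profiles, and all function names and toy data are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_profiles(publications):
    """Merge each expert's publication texts into one term-count profile."""
    profiles = {}
    for expert, texts in publications.items():
        counts = Counter()
        for text in texts:
            counts.update(tokenize(text))
        profiles[expert] = counts
    return profiles


def idf_weights(profiles):
    """Smoothed inverse document frequency, treating each profile as a document."""
    n = len(profiles)
    df = Counter()
    for counts in profiles.values():
        df.update(counts.keys())
    return {term: math.log((1 + n) / (1 + d)) + 1.0 for term, d in df.items()}


def tfidf(counts, idf):
    """Weight raw term counts by IDF; terms absent from every profile are dropped."""
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_experts(query, publications):
    """Return (expert, score) pairs sorted from most to least similar."""
    profiles = build_profiles(publications)
    idf = idf_weights(profiles)
    q_vec = tfidf(Counter(tokenize(query)), idf)
    scores = {e: cosine(q_vec, tfidf(c, idf)) for e, c in profiles.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data, not drawn from PMSC-UGR or CORD-19.
publications = {
    "expert_a": ["coronavirus vaccine trial results", "vaccine immune response"],
    "expert_b": ["protein folding simulation", "molecular dynamics of protein"],
}
ranking = rank_experts("vaccine response", publications)
```

Here `ranking` lists `expert_a` first, because only that profile shares terms with the query. An ML-based counterpart would instead train a text classifier with experts as the target classes.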

    Tracing the Pace of COVID-19 research: topic modeling and evolution

    COVID-19 has been spreading rapidly around the world. With growing attention on the deadly pandemic, discussions and research on COVID-19 are rapidly increasing to exchange the latest findings in the hope of accelerating the search for a cure. As a branch of information technology, artificial intelligence (AI) has greatly expedited the development of human society. In this paper, we investigate and visualize the ongoing advancements of early scientific research on COVID-19 from the perspective of AI. By adopting the Latent Dirichlet Allocation (LDA) model, this paper allocates the research articles into 50 key research topics pertinent to COVID-19 according to their abstracts. We present an overview of early studies of the COVID-19 crisis at different scales, including referencing/citation behavior, topic variation, and their inner interactions. We also identify innovative papers that are regarded as the cornerstones in the development of COVID-19 research. The results unveil the focus of scientific research, thereby giving deep insights into how the academic society contributes to combating the COVID-19 pandemic. © 2021 Elsevier Inc. Note: this article has multiple authors; this record provides only the names of the first five, including Federation University Australia affiliates Jing Ren and Feng Xia.

    Hybrid of K-Means and partitioning around medoids for predicting COVID-19 cases: Iraq case study

    COVID-19 was discovered near the end of 2019 in Wuhan, China. In a short period, the virus spread throughout the entire world. One of the primary concerns of managers and decision-makers in all types of hospitals is to implement plans for detecting patient status (Negative or Positive) in order to provide adequate care at the proper moment. Improving health care quality could help contain the COVID-19 pandemic. Clustering patients with similar features and symptoms provides an overview of the quality of care given to similar patients. In medical machine learning, the K-Means and Partitioning Around Medoids (PAM) clustering algorithms are commonly used to produce clusters based on similarity and to detect useful patterns in the data. In this paper, we propose a hybrid algorithm of K-Means and PAM, called K-MP, that takes advantage of both to construct an efficient model for predicting patient status. The model was applied to a real dataset collected by questionnaire from 400 patients in many Iraqi clinics. We evaluated the proposed K-MP using true negative rate, balanced accuracy, precision, accuracy, recall, mean absolute error, F1 score, and root mean square error. On these performance measures, K-MP is more efficient at discovering patient status than K-Means and PAM.
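
The evaluation measures named in this abstract can all be derived from a binary confusion matrix. The sketch below shows one way to compute them, assuming patient status is encoded as 1 (Positive) and 0 (Negative). The function name and the toy labels are hypothetical and do not reproduce the paper's data or results.

```python
import math


def binary_metrics(y_true, y_pred):
    """Compute common binary classification metrics from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    n = len(pairs)

    def ratio(num, den):
        # Guard against empty classes: return 0.0 instead of dividing by zero.
        return num / den if den else 0.0

    recall = ratio(tp, tp + fn)      # true positive rate
    tnr = ratio(tn, tn + fp)         # true negative rate (specificity)
    precision = ratio(tp, tp + fp)
    return {
        "accuracy": ratio(tp + tn, n),
        "precision": precision,
        "recall": recall,
        "true_negative_rate": tnr,
        "balanced_accuracy": (recall + tnr) / 2,
        "f1": ratio(2 * precision * recall, precision + recall),
        "mae": ratio(sum(abs(t - p) for t, p in pairs), n),
        "rmse": math.sqrt(ratio(sum((t - p) ** 2 for t, p in pairs), n)),
    }


# Hypothetical labels for eight patients (1 = Positive, 0 = Negative).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

With 0/1 labels, the mean absolute error equals the misclassification rate (1 minus accuracy), and the root mean square error is its square root, so those two measures carry no information beyond accuracy in the binary case.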

    ASA 2021 Statistics and Information Systems for Policy Evaluation

    This book includes 40 peer-reviewed short papers submitted to the Scientific Conference titled Statistics and Information Systems for Policy Evaluation, aimed at promoting new statistical methods and applications for the evaluation of policies and organized by the Association for Applied Statistics (ASA) and the Dept. of Statistics, Computer Science, Applications DiSIA “G. Parenti” of the University of Florence, jointly with the partners AICQ (Italian Association for Quality Culture), AICQ-CN (Italian Association for Quality Culture North and Centre of Italy), AISS (Italian Academy for Six Sigma), ASSIRM (Italian Association for Marketing, Social and Opinion Research), Comune di Firenze, the SIS – Italian Statistical Society, Regione Toscana and Valmon – Evaluation & Monitoring

    The ORKG R Package and Its Use in Data Science

    Research infrastructures and services provide access to (meta)data via user interfaces and APIs. The more advanced services also support access through (Python, R, etc.) packages that users can use in computational environments. For scientific information as a particular kind of research data, the Open Research Knowledge Graph (ORKG) is an example of an advanced service that also supports accessing data from Python scripts. Since many research communities use R as the statistical language of choice, we have developed the ORKG R package to support accessing and processing ORKG data directly from R scripts. Inspired by the Python library, the ORKG R package supports a comparable set of features through a similar programmatic interface. Having developed the ORKG R package, we demonstrate its use in various applications grounded in life science and soil science research fields. As an additional key contribution of this work, we show how the ORKG R package can be used in combination with ORKG templates to support the pre-publication production and publication of machine-readable scientific information, during the data analysis phase of the research life cycle and directly in the scripts that produce scientific information. This new mode of machine-readable scientific information production complements the post-publication Crowdsourcing-based manual and NLP-based automated approaches with the major advantages of unmatched high accuracy and fine granularity

    Text mining of biomedical literature: discovering new knowledge

    Biomedical literature is increasing day by day. The volume of literature on “coronavirus” in particular has expanded at a high rate. In this study, text mining techniques are employed to discover new knowledge from the published literature. The main objectives are to show the growth of the literature (Jan-Jun 2020), extract document sections, identify latent topics, find the most frequent word, represent the bag of words, and perform hierarchical clustering. We collected 16500 documents from PubMed. The study finds that most documents (11499) belong to May and June. We identify “betacoronavirus” as the leading document section (3837) and “covid” (29890) as the most frequent word in the abstracts, and we report the positive and negative weights of topics. Further, we measure the term frequency (TF) of document titles in the bag-of-words model. Then we compute a hierarchical clustering of document titles, which reveals that the lowest distance of the selected cluster (C133) is 0.30. We also discuss future prospects and note that this paper can be useful to researchers and library professionals for knowledge management.
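
The title-level steps mentioned in this abstract (term frequency in a bag-of-words model, followed by hierarchical clustering) can be sketched as follows. The example computes relative term frequencies for each title and finds the closest pair of titles, which is the first merge an agglomerative clustering would make. The choice of cosine distance, the function names, and the toy titles are assumptions for illustration; the abstract does not state which distance or linkage the study used.

```python
import math
from collections import Counter
from itertools import combinations


def term_frequencies(title):
    """Relative term frequency: each word's count divided by the title length."""
    tokens = title.lower().split()
    counts = Counter(tokens)
    return {term: c / len(tokens) for term, c in counts.items()}


def cosine_distance(a, b):
    """1 minus cosine similarity between two sparse TF vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if not norm_a or not norm_b:
        return 1.0  # treat an empty title as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def closest_pair(vectors):
    """Return (distance, i, j) for the two closest vectors.

    In agglomerative clustering this pair forms the first merged cluster.
    """
    return min(
        (cosine_distance(vectors[i], vectors[j]), i, j)
        for i, j in combinations(range(len(vectors)), 2)
    )


# Hypothetical titles, not taken from the PubMed collection in the study.
titles = [
    "coronavirus outbreak in hospitals",
    "coronavirus outbreak in schools",
    "machine learning for text mining",
]
vectors = [term_frequencies(t) for t in titles]
distance, i, j = closest_pair(vectors)
```

The first two titles share three of their four words, so they form the closest pair. A full hierarchical clustering would repeat this merge step, recomputing distances between clusters according to a linkage rule.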

    The Climate Emergency Demands a New Kind of History

    How shall we judge the element of practicality or urgency for scholars working in the era of the 2030 deadline for action on climate change? This essay surveys the reaction to climate change by scholars who work with data, using the philosopher Stephen Gardiner’s conceit of “corrupt institutions” to organize the approaches according to an index of pragmatic orientation. This survey will lead to the identification of some challenges for those seeking to engage the climate deadline with data, especially work making climate data more transparent, text mining to identify aspects of corruption and reform within contemporary institutions, and building infrastructure for citizen participation