INVESTIGATING IMPROVEMENTS TO MESH INDEXING

Abstract

The MEDLINE database currently comprises over 35 million citations, with more than 1 million records added each year [28]. This volume makes it difficult to identify and locate the research articles relevant to a given search topic, which has prompted the development of various techniques for making information retrieval from MEDLINE more efficient and effective. PubMed is the search engine for the research publications in MEDLINE. PubMed categorizes each article using MeSH (Medical Subject Headings), a controlled vocabulary. For decades this indexing has been done by human annotators, which is both time-consuming and expensive. Because MeSH has an enormously complex, hierarchically ordered structure, automatic MeSH indexing is a difficult machine learning problem. We propose an end-to-end model that addresses these challenges: it takes MeSH descriptions into account and combines them with a Knowledge Enhanced Mask attention model to index new research papers. We also calculate the journal correlation of each MeSH term in the MeSH hierarchy.

Similar works