The MeSH-gram Neural Network Model: Extending Word Embedding Vectors
with MeSH Concepts for UMLS Semantic Similarity and Relatedness in the
Biomedical Domain
Eliciting semantic similarity between concepts in the biomedical domain
remains a challenging task. Recent approaches based on embedding vectors have
gained popularity as they have proven to capture semantic relationships
efficiently. The underlying idea is that two words with close meanings occur
in similar contexts. In this study, we propose a new neural network model
named MeSH-gram, which relies on a straightforward approach that extends the
skip-gram neural network model by considering MeSH (Medical Subject Headings)
descriptors instead of words. Trained on the publicly available PubMed MEDLINE corpus,
MeSH-gram is evaluated on reference standards manually annotated for semantic
similarity. MeSH-gram is first compared to skip-gram with vectors of size 300
and at several context window sizes. A deeper comparison is then performed with
twenty existing models. Across all experiments, Spearman's rank correlations
between human scores and computed similarities show that MeSH-gram outperforms
the skip-gram model and is comparable to the best methods, which however
require more computation and external resources.
Comment: 6 pages, 2 tables
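
The evaluation described above (Spearman's rank correlation between human scores and computed similarities) can be illustrated with a minimal sketch. It assumes cosine similarity as the similarity measure, which the abstract does not state, and uses only the Python standard library. The function names and the toy vectors and scores in the usage example are hypothetical, not taken from the paper or its reference standards.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(vectors, gold_pairs):
    """Correlate human scores with cosine similarities of embedding vectors.

    vectors: dict mapping a term to its embedding (list of floats).
    gold_pairs: list of (term_a, term_b, human_score) tuples.
    """
    human = [score for _, _, score in gold_pairs]
    model = [cosine(vectors[a], vectors[b]) for a, b, _ in gold_pairs]
    return spearman(human, model)


if __name__ == "__main__":
    # Hypothetical 3-dimensional vectors and scores, for illustration only.
    toy_vectors = {
        "term_a": [0.9, 0.1, 0.0],
        "term_b": [0.8, 0.2, 0.1],
        "term_c": [0.1, 0.9, 0.3],
        "term_d": [0.0, 0.2, 0.9],
    }
    toy_gold = [
        ("term_a", "term_b", 3.8),
        ("term_a", "term_c", 1.5),
        ("term_a", "term_d", 1.0),
    ]
    print(evaluate(toy_vectors, toy_gold))
```

Because the correlation is computed on ranks, only the ordering of the model's similarities relative to the human ordering matters, not their absolute scale.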