Protocol for a reproducible experimental survey on biomedical sentence similarity.
Measuring semantic similarity between sentences is a significant task in Natural Language Processing (NLP), Information Retrieval (IR), and biomedical text mining. For this reason, the proposal of sentence similarity methods for the biomedical domain has attracted a lot of attention in recent years. However, most sentence similarity methods and experimental results reported in the biomedical domain cannot be reproduced, for several reasons: the copying of previous results without confirmation, the lack of source code and data to replicate both methods and experiments, and the lack of a detailed definition of the experimental setup, among others. As a consequence of this reproducibility gap, neither can the state of the problem be elucidated nor can new lines of research be soundly set. On the other hand, there are other significant gaps in the literature on biomedical sentence similarity: (1) the evaluation of several unexplored sentence similarity methods which deserve to be studied; (2) the evaluation of an unexplored benchmark on biomedical sentence similarity, called Corpus-Transcriptional-Regulation (CTR); (3) a study of the impact of the pre-processing stage and Named Entity Recognition (NER) tools on the performance of the sentence similarity methods; and finally, (4) the lack of software and data resources for the reproducibility of methods and experiments in this line of research. Having identified these open problems, this registered report introduces a detailed experimental setup, together with a categorization of the literature, to develop the largest, most up-to-date, and, for the first time, reproducible experimental survey on biomedical sentence similarity. Our experimental survey will be based on our own software replication and on the evaluation of all methods being studied on the same software platform, which will be specially developed for this work and will become the first publicly available software library for biomedical sentence similarity. Finally, we will provide a very detailed reproducibility protocol and dataset as supplementary material to allow the exact replication of all our experiments and results.
Pearson (r) and Spearman (<i>ρ</i>) correlation values, harmonic score (<i>h</i>), and harmonic average (AVG) score obtained by the LiBlock method in combination with each NER tool using the best pre-processing configuration detailed in Table 7.
In addition, the last column (p-val) shows the p-values for the comparison of the LiBlock method with cTAKES and the remaining NER combinations.
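For readers reproducing these scores, the snippet below sketches how such metrics are commonly computed, assuming that the harmonic score h denotes the harmonic mean of the Pearson and Spearman correlations and that AVG is the mean of h across the benchmarks; the function and variable names are illustrative and not taken from the paper's code.

    from scipy.stats import pearsonr, spearmanr

    def correlation_scores(human_scores, method_scores):
        # Pearson (r) and Spearman (rho) correlations between human and method scores
        r, _ = pearsonr(human_scores, method_scores)
        rho, _ = spearmanr(human_scores, method_scores)
        # Harmonic score (h): harmonic mean of r and rho (assumed definition)
        h = 2 * r * rho / (r + rho)
        return r, rho, h

    # AVG (assumed): average of the harmonic scores obtained on each benchmark
    # avg = sum(h_values) / len(h_values)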
Raw and pre-processed sentence pairs with the lowest (L) and highest (H) similarity error <i>E</i><sub><i>sim</i></sub>, together with their corresponding normalized human similarity score (Human) and the normalized similarity value (Method) estimated by the BioWordVec<sub><i>int</i></sub> (M26) method.
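The similarity error E_sim is not defined in this caption; a plausible reading, stated here only as an assumption, is the absolute difference between the normalized human score and the normalized similarity value produced by the method for each sentence pair.

    def similarity_error(human_norm, method_norm):
        # Assumed definition: E_sim = |normalized human score - normalized method score|
        return abs(human_norm - method_norm)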
Pearson (r), Spearman (<i>ρ</i>), harmonic (<i>h</i>), and harmonic average (AVG) scores obtained by each sentence similarity method evaluated herein on the three biomedical sentence similarity benchmarks, arranged by families.
All reported values were obtained using the best pre-processing configurations detailed in Table 7. Results in bold show the best scores, whilst the remaining highlighted results show the best average harmonic score for each family.
The statistical significance results.
We provide a series of tables reporting the p-values for each pair of methods evaluated in this work as supplementary material. (PDF)
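Purely as an illustration of how such pairwise p-values can be obtained, the sketch below applies a paired Wilcoxon signed-rank test to the per-sentence-pair errors of two methods evaluated on the same benchmark; the choice of test is an assumption for this example and is not necessarily the one used in this work.

    from scipy.stats import wilcoxon

    def pairwise_p_value(errors_method_a, errors_method_b):
        # Paired comparison of two methods on the same sentence pairs (illustrative)
        statistic, p_value = wilcoxon(errors_method_a, errors_method_b)
        return p_value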
Detailed setup for the ontology-based sentence similarity measures evaluated in this work.
The evaluation of the methods using the Rada [69], coswJ&C [46], and Cai [68] word similarity measures uses a reformulation of the original path-based measures based on the new Ancestors-based Shortest-Path Length (AncSPL) algorithm [42].
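For readers unfamiliar with path-based word similarity measures such as Rada [69], the toy sketch below illustrates the general idea: similarity decreases with the length of the shortest path between two concepts in an is-a taxonomy. It is a didactic example over made-up data, not the AncSPL reformulation evaluated in this work.

    from collections import deque

    def shortest_path_length(graph, source, target):
        # Breadth-first search over an undirected is-a taxonomy (toy data)
        queue, visited = deque([(source, 0)]), {source}
        while queue:
            node, dist = queue.popleft()
            if node == target:
                return dist
            for neighbour in graph.get(node, []):
                if neighbour not in visited:
                    visited.add(neighbour)
                    queue.append((neighbour, dist + 1))
        return None

    def path_based_similarity(graph, a, b):
        # Rada-style intuition: the longer the path, the lower the similarity
        dist = shortest_path_length(graph, a, b)
        return None if dist is None else 1.0 / (1.0 + dist)

    # Hypothetical taxonomy fragment (undirected adjacency, for illustration only)
    taxonomy = {
        "disease": ["heart disease", "infection"],
        "heart disease": ["disease", "myocardial infarction"],
        "myocardial infarction": ["heart disease"],
        "infection": ["disease"],
    }
    print(path_based_similarity(taxonomy, "myocardial infarction", "infection"))  # 0.25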
Pearson (r), Spearman (<i>ρ</i>), and harmonic (<i>h</i>) values obtained in our experiments from the evaluation of the ontology-based similarity methods detailed below on the MedSTS<sub><i>full</i></sub> [52] dataset for each NER tool.
Details of the pre-processing configurations evaluated in this work.
(*) WordPieceTokenizer [91] is used only for BERT-based methods [30, 31, 34, 62, 91–94, 99].
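To make the footnote concrete: WordPiece tokenization splits out-of-vocabulary biomedical terms into sub-word units, whereas the other tokenizers keep whole tokens. The example below uses the Hugging Face transformers tokenizer with a generic checkpoint chosen purely for illustration; the BERT variants evaluated in this work may use different vocabularies.

    from transformers import BertTokenizer

    # Generic checkpoint selected only for illustration
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    # Out-of-vocabulary biomedical terms are split into '##'-prefixed sub-word pieces
    print(tokenizer.tokenize("transcriptional regulation of oncogenes"))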
Detailed setup for the sentence similarity methods based on pre-trained character, word (WE), and sentence (SE) embedding models evaluated herein.
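As background for this table, a common baseline in the word-embedding family represents a sentence as the average of its pre-trained word vectors and scores a sentence pair with cosine similarity. The sketch below assumes a hypothetical in-memory embedding dictionary; loading BioWordVec or any other specific pre-trained model covered by the table is outside its scope.

    import numpy as np

    def sentence_vector(tokens, embeddings, dim):
        # Average of the available pre-trained word vectors (zero vector if none found)
        vectors = [embeddings[t] for t in tokens if t in embeddings]
        return np.mean(vectors, axis=0) if vectors else np.zeros(dim)

    def cosine_similarity(u, v):
        denom = np.linalg.norm(u) * np.linalg.norm(v)
        return float(np.dot(u, v) / denom) if denom else 0.0

    # 'embeddings' stands in for a hypothetical {token: np.ndarray} dictionary built
    # from a pre-trained word embedding model such as those covered by this table.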