28,960 research outputs found
The MeSH-gram Neural Network Model: Extending Word Embedding Vectors with MeSH Concepts for UMLS Semantic Similarity and Relatedness in the Biomedical Domain
Eliciting semantic similarity between concepts in the biomedical domain
remains a challenging task. Recent approaches founded on embedding vectors have
gained in popularity as they risen to efficiently capture semantic
relationships The underlying idea is that two words that have close meaning
gather similar contexts. In this study, we propose a new neural network model
named MeSH-gram which relies on a straighforward approach that extends the
skip-gram neural network model by considering MeSH (Medical Subject Headings)
descriptors instead words. Trained on publicly available corpus PubMed MEDLINE,
MeSH-gram is evaluated on reference standards manually annotated for semantic
similarity. MeSH-gram is first compared to skip-gram with vectors of size 300
and at several windows contexts. A deeper comparison is performed with tewenty
existing models. All the obtained results of Spearman's rank correlations
between human scores and computed similarities show that MeSH-gram outperforms
the skip-gram model, and is comparable to the best methods but that need more
computation and external resources.Comment: 6 pages, 2 table
Neurocognitive Informatics Manifesto.
Informatics studies all aspects of the structure of natural and artificial information systems. Theoretical and abstract approaches to information have made great advances, but human information processing is still unmatched in many areas, including information management, representation and understanding. Neurocognitive informatics is a new, emerging field that should help to improve the matching of artificial and natural systems, and inspire better computational algorithms to solve problems that are still beyond the reach of machines. In this position paper examples of neurocognitive inspirations and promising directions in this area are given
Ranking Significant Discrepancies in Clinical Reports
Medical errors are a major public health concern and a leading cause of death
worldwide. Many healthcare centers and hospitals use reporting systems where
medical practitioners write a preliminary medical report and the report is
later reviewed, revised, and finalized by a more experienced physician. The
revisions range from stylistic to corrections of critical errors or
misinterpretations of the case. Due to the large quantity of reports written
daily, it is often difficult to manually and thoroughly review all the
finalized reports to find such errors and learn from them. To address this
challenge, we propose a novel ranking approach, consisting of textual and
ontological overlaps between the preliminary and final versions of reports. The
approach learns to rank the reports based on the degree of discrepancy between
the versions. This allows medical practitioners to easily identify and learn
from the reports in which their interpretation most substantially differed from
that of the attending physician (who finalized the report). This is a crucial
step towards uncovering potential errors and helping medical practitioners to
learn from such errors, thus improving patient-care in the long run. We
evaluate our model on a dataset of radiology reports and show that our approach
outperforms both previously-proposed approaches and more recent language models
by 4.5% to 15.4%.Comment: ECIR 2020 (short
A Deep Network Model for Paraphrase Detection in Short Text Messages
This paper is concerned with paraphrase detection. The ability to detect
similar sentences written in natural language is crucial for several
applications, such as text mining, text summarization, plagiarism detection,
authorship authentication and question answering. Given two sentences, the
objective is to detect whether they are semantically identical. An important
insight from this work is that existing paraphrase systems perform well when
applied on clean texts, but they do not necessarily deliver good performance
against noisy texts. Challenges with paraphrase detection on user generated
short texts, such as Twitter, include language irregularity and noise. To cope
with these challenges, we propose a novel deep neural network-based approach
that relies on coarse-grained sentence modeling using a convolutional neural
network and a long short-term memory model, combined with a specific
fine-grained word-level similarity matching model. Our experimental results
show that the proposed approach outperforms existing state-of-the-art
approaches on user-generated noisy social media data, such as Twitter texts,
and achieves highly competitive performance on a cleaner corpus
- …