Comparative Analysis of Contextual Relation Extraction based on Deep Learning Models
Contextual Relation Extraction (CRE) is mainly used for constructing a
knowledge graph with the help of an ontology, supporting tasks such as
semantic search, query answering, and textual entailment. Relation extraction
identifies the entities in raw text and the relations among them. An
efficient and accurate CRE system is essential for building domain knowledge in
the biomedical industry. Existing machine learning and Natural Language
Processing (NLP) techniques cannot efficiently predict complex relations from
sentences that contain more than two relations or unspecified entities.
In this work, deep learning techniques are used to identify
the appropriate semantic relation based on context drawn from multiple sentences.
Although various machine learning models have been used for relation
extraction, they perform well only on binary relations, i.e.,
relations that occur between exactly two entities in a sentence, and they are
ill-suited to complex sentences containing words with multiple meanings.
To address these issues, hybrid deep learning
models have been used to extract relations from complex sentences
effectively. This paper presents an analysis of the various deep learning models
used for relation extraction.
Comment: This paper was presented at the International Conference on FOSS
Approaches towards Computational Intelligence and Language Technology,
February 2023, Thiruvananthapuram
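To make the binary-versus-complex distinction concrete, here is a minimal, purely illustrative sketch of the (head, relation, tail) triple format a relation extraction system emits. The patterns, relation labels, and example sentence are hypothetical and not from the paper; the point is that a single sentence can yield several triples, which binary-only models handle poorly.

```python
import re

# Illustrative surface patterns mapping text to a relation label.
# A real CRE system would use a learned model, not regexes.
PATTERNS = [
    (re.compile(r"(?P<head>\w+) inhibits (?P<tail>\w+)"), "inhibits"),
    (re.compile(r"(?P<head>\w+) treats (?P<tail>\w+)"), "treats"),
]

def extract_triples(sentence: str):
    """Return every (head, relation, tail) triple matched in the sentence."""
    triples = []
    for pattern, label in PATTERNS:
        for m in pattern.finditer(sentence):
            triples.append((m.group("head"), label, m.group("tail")))
    return triples

sentence = "Aspirin inhibits COX1 and Aspirin treats fever"
print(extract_triples(sentence))
# → [('Aspirin', 'inhibits', 'COX1'), ('Aspirin', 'treats', 'fever')]
```

One sentence, two relations: a model restricted to predicting a single relation between a single entity pair would recover at most one of these triples.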
BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition
In recent years, with the growing volume of biomedical documents and
advances in natural language processing algorithms, research on
biomedical named entity recognition (BioNER) has increased exponentially.
However, BioNER is challenging because NER in the biomedical domain is
(i) often restricted by the limited amount of training data, (ii) complicated
by entities that can refer to multiple types and concepts depending on their
context, and (iii) heavily reliant on acronyms that are sub-domain specific.
Existing BioNER approaches often neglect these issues and directly adopt
state-of-the-art (SOTA) models trained on general corpora, which often
yields unsatisfactory results.
We propose biomedical ALBERT (A Lite Bidirectional Encoder Representations
from Transformers for Biomedical Text Mining), BioALBERT, an effective
domain-specific language model trained on large-scale biomedical corpora and
designed to capture context-dependent biomedical NER. We adopted the
self-supervised loss used in ALBERT, which focuses on modelling inter-sentence
coherence, to better learn context-dependent representations, and incorporated
parameter-reduction techniques to lower memory consumption and increase
training speed for BioNER. In our experiments, BioALBERT outperformed
comparative SOTA BioNER models on eight biomedical NER benchmark datasets with
four different entity types. We trained four different variants of BioALBERT,
which are available to the research community for use in future research.
Comment: 7 pages
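A token-level NER model such as BioALBERT emits one tag per token, typically in the BIO scheme, which is then decoded into entity spans. The sketch below shows that decoding step; the tokens, tags, and entity types are illustrative assumptions, not taken from the paper's datasets.

```python
def bio_to_spans(tokens, tags):
    """Group B-/I- tagged tokens into (entity_text, entity_type) spans."""
    spans, current, etype = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag starts a new entity, closing any open one.
            if current:
                spans.append((" ".join(current), etype))
            current, etype = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == etype:
            # An I- tag of the same type continues the open entity.
            current.append(token)
        else:
            # An O tag (or inconsistent I-) closes the open entity.
            if current:
                spans.append((" ".join(current), etype))
            current, etype = [], None
    if current:
        spans.append((" ".join(current), etype))
    return spans

tokens = ["Mutations", "in", "BRCA1", "cause", "breast", "cancer"]
tags = ["O", "O", "B-Gene", "O", "B-Disease", "I-Disease"]
print(bio_to_spans(tokens, tags))
# → [('BRCA1', 'Gene'), ('breast cancer', 'Disease')]
```

The challenges the abstract lists surface precisely here: with little training data, or with an acronym like BRCA1 that means different things in different sub-domains, the per-token tags are where a general-corpus model goes wrong.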