BERT Based Clinical Knowledge Extraction for Biomedical Knowledge Graph Construction and Analysis
Background: Knowledge evolves over time, often as a result of new
discoveries or changes in the adopted methods of reasoning. New facts or
evidence may also become available, leading to new understandings of complex
phenomena. This is particularly true in the biomedical field, where scientists
and physicians are constantly striving to find new methods of diagnosis,
treatment, and eventually cure. Knowledge Graphs (KGs) offer a practical way of
organizing and retrieving the massive and growing body of biomedical
knowledge.
Objective: We propose an end-to-end approach for knowledge extraction and
analysis from biomedical clinical notes using the Bidirectional Encoder
Representations from Transformers (BERT) model with a Conditional Random Field
(CRF) layer.
Methods: The approach is based on knowledge graphs, which can effectively
represent abstract biomedical concepts such as relationships and interactions
between medical entities. Besides offering an intuitive way to visualize these
concepts, KGs can solve complex knowledge retrieval problems by
decomposing them into simpler representations or by transforming the problems
into representations from different perspectives. We created a biomedical
KG using Natural Language Processing models for named entity
recognition and relation extraction. The generated biomedical KG is then
used for question answering.
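The abstract does not include implementation details, but the CRF decoding step it describes — choosing the best tag sequence over BERT's per-token scores — can be sketched with a plain Viterbi decode. The tag set, scores, and function names below are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch: Viterbi decoding for a linear-chain CRF over
# per-token emission scores (as a BERT token classifier would produce).
# Tags, scores, and the interface are invented for this example.
def viterbi_decode(emissions, transitions):
    """emissions: list of {tag: score} dicts, one per token.
    transitions: {(prev_tag, tag): score}.
    Returns the highest-scoring tag sequence."""
    tags = list(emissions[0].keys())
    # best[t] = (score, path) of the best sequence ending in tag t
    best = {t: (emissions[0][t], [t]) for t in tags}
    for emit in emissions[1:]:
        new_best = {}
        for t in tags:
            # pick the predecessor tag maximizing score + transition
            prev, (score, path) = max(
                ((p, best[p]) for p in tags),
                key=lambda kv: kv[1][0] + transitions[(kv[0], t)],
            )
            new_best[t] = (score + transitions[(prev, t)] + emit[t], path + [t])
        best = new_best
    return max(best.values(), key=lambda sp: sp[0])[1]
```

In a full system the emission scores would come from the fine-tuned BERT model and the transition scores would be learned jointly with it; the decode step itself is unchanged.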
Results: The proposed framework can successfully extract relevant structured
information with high accuracy (90.7% for named-entity recognition (NER), 88%
for relation extraction (RE)), according to experimental findings based on
505 real-world unstructured patient clinical notes.
Conclusions: In this paper, we propose a novel end-to-end system for the
construction of a biomedical knowledge graph from clinical text using a
variation of BERT models.
Text Classification
There is an abundance of text data in this world, but most of it is raw. We need to extract information from this data to make use of it. One way to extract information from raw text is to apply informative labels drawn from a pre-defined fixed set, i.e., text classification. In this thesis, we focus on the general problem of text classification, and work towards solving challenges associated with binary/multi-class/multi-label classification. More specifically, we deal with the problems of (i) zero-shot labels during testing; (ii) active learning for text screening; (iii) multi-label classification under low supervision; (iv) structured label spaces; (v) classifying pairs of words in raw text, i.e., relation extraction. For (i), we use a zero-shot classification model that utilizes independently learned semantic embeddings. Regarding (ii), we propose a novel active learning algorithm that reduces the problem of bias in naive active learning algorithms. For (iii), we propose a neural candidate-selector architecture that starts from a set of high-recall candidate labels to obtain high-precision predictions. In the case of (iv), we propose an attention-based neural tree decoder that recursively decodes an abstract into the ontology tree. For (v), we propose using second-order relations that are derived by explicitly connecting pairs of words via context token(s) for improved relation extraction. We use a wide variety of both traditional and deep machine learning tools. More specifically, we use traditional machine learning models like multi-valued linear regression and logistic regression for (i, ii), deep convolutional neural networks for (iii), recurrent neural networks for (iv), and transformer networks for (v).
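The zero-shot setup in (i) — assigning a label never seen in training by comparing embeddings — can be sketched as nearest-label matching in a shared embedding space. The vectors and names below are toy assumptions standing in for independently learned semantic embeddings, not the thesis's actual model.

```python
# Hedged sketch of zero-shot classification via semantic embeddings:
# a document is assigned the label whose embedding is closest (by cosine
# similarity), so an unseen label only needs an embedding, not training data.
# The embeddings here are hand-made toy vectors.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def zero_shot_classify(doc_vec, label_vecs):
    """label_vecs: {label: embedding}. Returns the best-matching label."""
    return max(label_vecs, key=lambda lbl: cosine(doc_vec, label_vecs[lbl]))
```

In practice both document and label embeddings would come from a learned encoder; the matching step shown here is the part that makes unseen labels usable at test time.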