4 research outputs found
Deep Bidirectional Transformers for Relation Extraction without Supervision
We present a novel framework to deal with relation extraction tasks in cases
where there is a complete lack of supervision, either in the form of gold
annotations or of relations from a knowledge base. Our approach leverages
syntactic parsing and pre-trained word embeddings to extract few but precise
relations, which are then used to annotate a larger corpus, in a manner
identical to distant supervision. The resulting data set is employed to
fine-tune a pre-trained BERT model in order to perform relation extraction.
Empirical evaluation on four data sets from the biomedical domain shows that
our method significantly outperforms two simple baselines for unsupervised
relation extraction and, even without using any supervision at all, achieves
slightly worse results than the state of the art in three out of four data
sets. Importantly, we show that it is possible to successfully fine-tune a
large pre-trained language model with noisy data, as opposed to previous works
that rely on gold data for fine-tuning.
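The annotation step described above can be sketched as a small distant-supervision-style loop: a few precise seed relations label every sentence of a larger corpus that mentions both entities. All entity names, relation labels, and sentences below are hypothetical illustrations, not the paper's actual data or extraction logic.

```python
# A minimal sketch of distant-supervision-style annotation: a few precise
# seed relations are used to label a larger corpus with noisy ("silver")
# labels, which can then be used to fine-tune a classifier.

def annotate_corpus(seed_relations, corpus):
    """Label every sentence that mentions both entities of a seed pair."""
    annotated = []
    for sentence in corpus:
        lowered = sentence.lower()
        for (e1, e2, rel) in seed_relations:
            if e1 in lowered and e2 in lowered:
                annotated.append((sentence, e1, e2, rel))
    return annotated

# Hypothetical seed relation and corpus for illustration only.
seeds = [("aspirin", "headache", "TREATS")]
corpus = [
    "Aspirin is often prescribed for headache relief.",
    "The trial compared aspirin with placebo.",
]
silver = annotate_corpus(seeds, corpus)
# silver holds the noisy labels later used for fine-tuning
```

In the paper's pipeline, matching is guided by syntactic parsing rather than plain substring containment; the sketch only shows how a small precise seed set can project labels onto a larger corpus.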
DARE: Data Augmented Relation Extraction with GPT-2
Real-world Relation Extraction (RE) tasks are challenging to deal with,
either due to limited training data or class imbalance issues. In this work, we
present Data Augmented Relation Extraction (DARE), a simple method to augment
training data by properly fine-tuning GPT-2 to generate examples for specific
relation types. The generated training data is then used in combination with
the gold dataset to train a BERT-based RE classifier. In a series of
experiments we show the advantages of our method, which leads to improvements
of up to 11 F1 points over a strong baseline. Also, DARE achieves a new
state of the art in three widely used biomedical RE datasets, surpassing the
previous best results by 4.7 F1 points on average.
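The augmentation loop DARE describes can be sketched as follows: each relation type is topped up with generated examples and the mix is handed to the downstream classifier. The `fake_gpt2` stub below stands in for a fine-tuned GPT-2 generator; all names and data are illustrative assumptions, not the paper's implementation.

```python
# A hedged sketch of a DARE-style augmentation loop: synthetic examples are
# generated per relation type until each class reaches a target size, then
# mixed with the gold data to train the RE classifier.
from collections import Counter

def augment(gold, generate, target_per_class):
    """Top up each relation type to `target_per_class` with generated text."""
    counts = Counter(label for _, label in gold)
    augmented = list(gold)
    for label, n in counts.items():
        for _ in range(max(0, target_per_class - n)):
            augmented.append((generate(label), label))
    return augmented

# Hypothetical gold set; `fake_gpt2` stands in for a fine-tuned generator.
gold = [("drug A inhibits gene B", "inhibits"),
        ("drug C binds protein D", "binds"),
        ("drug E binds protein F", "binds")]
fake_gpt2 = lambda label: f"<synthetic example for {label}>"
train_set = augment(gold, fake_gpt2, target_per_class=3)
# each relation type now has 3 examples for training the classifier
```

Topping up per class rather than generating uniformly is one plausible way the method could counter the class-imbalance problem the abstract mentions.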
RDSGAN: Rank-based Distant Supervision Relation Extraction with Generative Adversarial Framework
Distant supervision has been widely used for relation extraction but suffers
from the noisy labeling problem. Neural network models with attention
mechanisms have been proposed to denoise the data, but they cannot eliminate
noisy instances because attention weights are non-zero. Hard-decision methods
remove wrongly labeled instances from the positive set, but at the cost of the
useful information those instances contain. In this paper, we propose a novel
generative neural framework named
RDSGAN (Rank-based Distant Supervision GAN) which automatically generates valid
instances for distant supervision relation extraction. Our framework combines
soft attention and hard decision to learn the distribution of true positive
instances via adversarial training and selects valid instances conforming to
the distribution via rank-based distant supervision, which addresses the false
positive problem. Experimental results show the superiority of our framework
over strong baselines.
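The rank-based selection step described above can be illustrated with a toy sketch: candidate instances are scored and only the top-ranked fraction is kept as valid training data. The stand-in scorer below is purely illustrative; in RDSGAN the ranking comes from the adversarially trained model, not a fixed score table.

```python
# An illustrative sketch of rank-based instance selection: score each
# candidate, sort in descending order, and keep only the top fraction as
# valid training instances (discarding likely false positives).

def rank_select(instances, score, keep_ratio=0.5):
    """Keep the highest-scoring fraction of candidate instances."""
    ranked = sorted(instances, key=score, reverse=True)
    cutoff = max(1, int(len(ranked) * keep_ratio))
    return ranked[:cutoff]

# Hypothetical candidates and scores for illustration only.
candidates = ["inst_a", "inst_b", "inst_c", "inst_d"]
scores = {"inst_a": 0.9, "inst_b": 0.2, "inst_c": 0.7, "inst_d": 0.4}
valid = rank_select(candidates, scores.get, keep_ratio=0.5)
# valid == ["inst_a", "inst_c"]
```

Selecting by rank rather than by a hard threshold is what lets the approach adapt the cutoff to the score distribution rather than committing to a fixed decision boundary.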
What Can Knowledge Bring to Machine Learning? -- A Survey of Low-shot Learning for Structured Data
Supervised machine learning has several drawbacks that make it difficult to
use in many situations. Drawbacks include: heavy reliance on massive training
data, limited generalizability and poor expressiveness of high-level semantics.
Low-shot learning attempts to address these drawbacks: it allows a model to
obtain good predictive power with very little or no training data, with
structured knowledge playing a key role as a high-level semantic
representation of human knowledge. This article reviews the fundamental factors of
low-shot learning technologies, with a focus on the operation of structured
knowledge under different low-shot conditions. We also introduce other
techniques relevant to low-shot learning. Finally, we point out the limitations
of low-shot learning, the prospects and gaps of industrial applications, and
future research directions.
Comment: 41 pages, 280 references