Learning Relation Prototype from Unlabeled Texts for Long-tail Relation Extraction
Relation Extraction (RE) is a vital step to complete Knowledge Graphs (KGs) by
extracting entity relations from texts. However, it usually suffers from the
long-tail issue: the training data mainly concentrates on a few types of
relations, leading to a lack of sufficient annotations for the remaining types
of relations. In this paper, we propose a general approach to learn relation
prototypes from unlabeled texts, to facilitate long-tail relation extraction
by transferring knowledge from the relation types with sufficient training data.
We learn relation prototypes as an implicit factor between entities, which
reflects the meanings of relations as well as their proximities for transfer
learning. Specifically, we construct a co-occurrence graph from texts, and
capture both first-order and second-order entity proximities for embedding
learning. Based on this, we further optimize the distance from entity pairs
to corresponding prototypes, which can be easily adapted to almost arbitrary RE
frameworks. Thus, the learning of infrequent or even unseen relation types will
benefit from semantically proximate relations through pairs of entities and
large-scale textual information. We have conducted extensive experiments on two
publicly available datasets: New York Times and Google Distant Supervision.
Compared with eight state-of-the-art baselines, our proposed model
achieves significant improvements (4.1% F1 on average). Further results on
long-tail relations demonstrate the effectiveness of the learned relation
prototypes. We further conduct an ablation study to investigate the impacts of
varying components, and apply it to four basic relation extraction models to
verify its generalization ability. Finally, we analyze several example cases to
give intuitive impressions as qualitative analysis. Our code will be released
later.
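The prototype-distance objective described in the abstract can be sketched as minimizing the distance from each entity-pair embedding to the prototype of its relation type. This is an illustrative sketch only; the function and variable names are hypothetical and the paper's actual model involves graph-based embedding learning not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

def prototype_distance_loss(pair_embeddings, labels, prototypes):
    """Mean squared Euclidean distance from each entity-pair embedding
    to the prototype vector of its labeled relation type."""
    diffs = pair_embeddings - prototypes[labels]   # shape (n, d)
    return float(np.mean(np.sum(diffs ** 2, axis=1)))

# Toy data: 4 entity-pair embeddings over 2 relation types in 3-d space,
# generated near their prototypes with small noise.
prototypes = rng.normal(size=(2, 3))
pairs = prototypes[[0, 0, 1, 1]] + 0.1 * rng.normal(size=(4, 3))
labels = np.array([0, 0, 1, 1])

loss = prototype_distance_loss(pairs, labels, prototypes)
```

Minimizing such a loss pulls entity pairs of the same relation toward a shared prototype, which is what lets rare relation types borrow structure from semantically proximate, well-annotated ones.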
A Review on Human-Computer Interaction and Intelligent Robots
In the field of artificial intelligence, human–computer interaction (HCI) technology and the related intelligent robot technologies are essential and active areas of research. From the perspectives of software algorithms and hardware systems, these technologies aim to build a natural HCI environment. The purpose of this research is to provide an overview of HCI and intelligent robots. It highlights the existing technologies of listening, speaking, reading, writing, and other senses, which are widely used in human interaction. Based on these same technologies, this research introduces some intelligent robot systems and platforms. This paper also forecasts some vital challenges in researching HCI and intelligent robots. The authors hope that this work will help researchers in the field acquire the necessary information and technologies to conduct more advanced research.
Neural approaches to discourse coherence: modeling, evaluation and application
Discourse coherence is an important aspect of text quality that refers to the way different textual units relate to each other. In this thesis, I investigate neural approaches to modeling discourse coherence. I present a multi-task neural network where the main task is to predict a document-level coherence score and the secondary task is to learn word-level syntactic features. Additionally, I examine the effect of using contextualised word representations in single-task and multi-task setups. I evaluate my models on a synthetic dataset where incoherent documents are created by shuffling the sentence order in coherent original documents. The results show the efficacy of my multi-task learning approach, particularly when enhanced with contextualised embeddings, achieving new state-of-the-art results in ranking the coherent documents higher than the incoherent ones (96.9%). Furthermore, I apply my approach to the realistic domain of people's everyday writing, such as emails and online posts, and further demonstrate its ability to capture various degrees of coherence. In order to further investigate the linguistic properties captured by coherence models, I create two datasets that exhibit syntactic and semantic alterations. Evaluating different models on these datasets reveals their ability to capture syntactic perturbations but their inadequacy to detect semantic changes. I find that semantic alterations are instead captured by models that first build sentence representations from averaged word embeddings, then apply a set of linear transformations over input sentence pairs. Finally, I present an application for coherence models in the pedagogical domain. I first demonstrate that state-of-the-art neural approaches to automated essay scoring (AES) are not robust to adversarially created, grammatical, but incoherent sequences of sentences.
Accordingly, I propose a framework for integrating and jointly training a coherence model with a state-of-the-art neural AES system in order to enhance its ability to detect such adversarial input. I show that this joint framework maintains a performance comparable to the state-of-the-art AES system in predicting a holistic essay score while significantly outperforming it in adversarial detection.
HAMNER: Headword Amplified Multi-span Distantly Supervised Method for Domain Specific Named Entity Recognition
To tackle Named Entity Recognition (NER) tasks, supervised methods need to
obtain sufficient cleanly annotated data, which is labor- and time-consuming.
In contrast, distantly supervised methods acquire automatically annotated data
using dictionaries to alleviate this requirement. Unfortunately, dictionaries
hinder the effectiveness of distantly supervised methods for NER due to their
limited coverage, especially in specific domains. In this paper, we address the
limitations of dictionary usage and mention boundary detection. We
generalize distant supervision by extending the dictionary with headword-based
non-exact matching. We apply a function to better weight the matched
entity mentions. We propose a span-level model, which classifies all
possible spans and then infers the selected spans with a proposed dynamic
programming algorithm. Experiments on three benchmark datasets demonstrate
that our method outperforms previous state-of-the-art distantly supervised
methods.
Comment: 9 pages, 2 figures
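The headword-based non-exact matching described in the abstract can be sketched as follows: a candidate span matches the dictionary if its headword appears among the headwords of dictionary entries. This is a minimal sketch under the common heuristic that an English noun phrase's headword is its last token; the paper's actual matching and weighting functions are not specified here, and all names are illustrative.

```python
def headword(mention):
    """Heuristic headword: the last token of the phrase, lowercased.
    (Real systems may use a syntactic parser instead.)"""
    return mention.split()[-1].lower()

def dictionary_match(span, dictionary_headwords):
    """Non-exact match: accept a candidate span if its headword appears
    among the headwords of dictionary entries."""
    return headword(span) in dictionary_headwords

# Toy dictionary of disease names: "lung cancer" is absent from the
# dictionary itself, but its headword "cancer" is covered, so the
# non-exact match still fires, extending the dictionary's coverage.
dictionary = {"breast cancer", "diabetes"}
head_set = {headword(entry) for entry in dictionary}

assert dictionary_match("lung cancer", head_set)
assert not dictionary_match("heart rate", head_set)
```

The point of the relaxation is exactly the first assertion: mentions outside the literal dictionary can still be recovered through a shared headword, mitigating limited coverage in specific domains.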
Unifying Token and Span Level Supervisions for Few-Shot Sequence Labeling
Few-shot sequence labeling aims to identify novel classes based on only a few
labeled samples. Existing methods solve the data scarcity problem mainly by
designing token-level or span-level labeling models based on metric learning.
However, these methods are only trained at a single granularity (i.e., either
token level or span level) and inherit the weaknesses of the corresponding
granularity. In this paper, we first unify token and span level supervisions
and propose a Consistent Dual Adaptive Prototypical (CDAP) network for few-shot
sequence labeling. CDAP contains the token-level and span-level networks,
jointly trained at different granularities. To align the outputs of two
networks, we further propose a consistent loss to enable them to learn from
each other. During the inference phase, we propose a consistent greedy
inference algorithm that first adjusts the predicted probability and then
greedily selects non-overlapping spans with maximum probability. Extensive
experiments show that our model achieves new state-of-the-art results on three
benchmark datasets.
Comment: Accepted by ACM Transactions on Information Systems
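The greedy step of the consistent greedy inference described above can be sketched as keeping the highest-probability spans that do not overlap previously kept ones. This sketch omits the probability-adjustment step and uses illustrative probabilities; it is not the authors' implementation.

```python
def greedy_select(spans):
    """Greedily keep the highest-probability non-overlapping spans.
    Each span is (start, end, prob), with `end` exclusive."""
    selected = []
    for start, end, prob in sorted(spans, key=lambda s: -s[2]):
        # keep the span only if it overlaps none of the kept spans
        if all(end <= s or start >= e for s, e, _ in selected):
            selected.append((start, end, prob))
    return sorted(selected)

# Candidate spans with (illustrative, already adjusted) probabilities:
# (0,2) and (4,6) survive; (1,3) and (3,5) overlap higher-scoring spans.
spans = [(0, 2, 0.9), (1, 3, 0.8), (3, 5, 0.7), (4, 6, 0.95)]
result = greedy_select(spans)   # [(0, 2, 0.9), (4, 6, 0.95)]
```

Greedy selection by adjusted probability is what turns the two networks' per-span scores into a single consistent, non-overlapping labeling at inference time.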