SNU_IDS at SemEval-2018 Task 12: Sentence Encoder with Contextualized Vectors for Argument Reasoning Comprehension
We present a novel neural architecture for the Argument Reasoning
Comprehension task of SemEval 2018. It is a simple neural network consisting of
three parts, collectively judging whether the logic built on a set of given
sentences (a claim, reason, and warrant) is plausible or not. The model
utilizes contextualized word vectors pre-trained on large machine translation
(MT) datasets as a form of transfer learning, which can help to mitigate the
lack of training data. Quantitative analysis shows that simply leveraging LSTMs
trained on MT datasets outperforms several baselines and non-transferred
models, achieving accuracies of about 70% on the development set and about 60%
on the test set.
Comment: SemEval 201
A Syllable-based Technique for Word Embeddings of Korean Words
Word embedding has become a fundamental component of many NLP tasks such as
named entity recognition and machine translation. However, popular models that
learn such embeddings are unaware of the morphology of words, so they are not
directly applicable to highly agglutinative languages such as Korean. We
propose a syllable-based learning model for Korean using a convolutional neural
network, in which word representation is composed of trained syllable vectors.
Our model produces representations of Korean words that are more
morphologically meaningful than the original Skip-gram embeddings. The results
also show that it is quite robust to the Out-of-Vocabulary problem.
Comment: 5 pages, 3 figures, 1 table. Accepted for EMNLP 2017 Workshop - The
1st Workshop on Subword and Character level models in NLP (SCLeM
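The idea of composing a word representation from syllable vectors can be sketched as follows. This is only an illustration under stated assumptions, not the model from the paper: the vector size, window width, filter count, random (untrained) weights, and all function names are hypothetical, and a single convolution with max-over-time pooling stands in for whatever network the authors actually trained.

```python
import random

DIM = 4        # syllable vector size (hypothetical)
WIDTH = 2      # convolution window, in syllables (hypothetical)
N_FILTERS = 3  # size of the resulting word vector (hypothetical)

rng = random.Random(0)
syllable_vectors = {}  # stands in for trained syllable embeddings

# Random, untrained filters; each one spans WIDTH consecutive syllable vectors.
filters = [[rng.uniform(-1, 1) for _ in range(WIDTH * DIM)]
           for _ in range(N_FILTERS)]


def syllables(word):
    """Keep precomposed Hangul syllables (U+AC00..U+D7A3), one per code point."""
    return [ch for ch in word if "\uac00" <= ch <= "\ud7a3"]


def lookup(syl):
    """Return the vector for a syllable, creating a random one on first use."""
    if syl not in syllable_vectors:
        syllable_vectors[syl] = [rng.uniform(-1, 1) for _ in range(DIM)]
    return syllable_vectors[syl]


def word_vector(word):
    """Compose a word vector: convolve over syllable vectors, then max-pool."""
    vecs = [lookup(s) for s in syllables(word)]
    while len(vecs) < WIDTH:  # zero-pad short words so one window exists
        vecs.append([0.0] * DIM)
    # Each window is the concatenation of WIDTH neighbouring syllable vectors.
    windows = [sum(vecs[i:i + WIDTH], [])
               for i in range(len(vecs) - WIDTH + 1)]
    # Max-over-time pooling gives one value per filter.
    return [max(sum(w * x for w, x in zip(f, win)) for win in windows)
            for f in filters]
```

Because the vocabulary is made of syllables rather than whole words, a word never seen as a unit (for example an inflected form) still receives a vector as long as its syllables are known, which is the intuition behind the robustness to out-of-vocabulary words mentioned in the abstract.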
Cache-based Query Processing for the Boolean Retrieval Model
We propose a new method of processing general Boolean queries utilizing previous query results stored in a result cache in a mediator architecture. A simple but novel normalization form is developed to describe keyword-based Boolean queries and the content of the result cache. We propose Boolean query processing algorithms, based on this form, that utilize the result cache. We show that the proposed method theoretically guarantees improved performance over the conventional query processing method without using a cache.
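To make the role of a result cache concrete, here is a minimal sketch restricted to purely conjunctive (AND-only) keyword queries. It does not reproduce the paper's normalization form or algorithms, which the abstract does not spell out; the toy collection `docs` and the functions `scan` and `answer` are hypothetical. The sketch relies on one simple fact: if a cached query's keywords are a subset of the new query's keywords, every match for the new query is already among the cached results.

```python
# Toy collection: document id -> set of keywords (hypothetical data).
docs = {
    1: {"cache", "query"},
    2: {"query", "boolean"},
    3: {"cache", "query", "boolean"},
}

# Result cache: a query, normalized as a frozenset of keywords, -> matching ids.
cache = {}


def scan(keywords, candidates):
    """Check each candidate document against all keywords of an AND query."""
    return frozenset(d for d in candidates if keywords <= docs[d])


def answer(keywords):
    """Answer an AND query, narrowing the search with cached results."""
    query = frozenset(keywords)
    if query in cache:  # exact hit: no scan needed
        return cache[query]
    candidates = set(docs)
    for cached_query, result in cache.items():
        # A cached query that uses a subset of our keywords bounds our result.
        if cached_query <= query:
            candidates &= result
    result = scan(query, candidates)
    cache[query] = result
    return result
```

For example, after `answer({"boolean"})` has been cached, `answer({"boolean", "cache"})` only needs to examine documents 2 and 3 instead of the whole collection.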