
    Weakly Supervised Cross-Lingual Named Entity Recognition via Effective Annotation and Representation Projection

    The state-of-the-art named entity recognition (NER) systems are supervised machine learning models that require large amounts of manually annotated data to achieve high accuracy. However, annotating NER data by humans is expensive and time-consuming, and can be quite difficult for a new language. In this paper, we present two weakly supervised approaches for cross-lingual NER with no human annotation in a target language. The first approach is to create automatically labeled NER data for a target language via annotation projection on comparable corpora, where we develop a heuristic scheme that effectively selects good-quality projection-labeled data from noisy data. The second approach is to project distributed representations of words (word embeddings) from a target language to a source language, so that the source-language NER system can be applied to the target language without re-training. We also design two co-decoding schemes that effectively combine the outputs of the two projection-based approaches. We evaluate the performance of the proposed approaches on both in-house and open NER data for several target languages. The results show that the combined systems outperform three other weakly supervised approaches on the CoNLL data. Comment: 11 pages, The 55th Annual Meeting of the Association for Computational Linguistics (ACL), 2017
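    The abstract does not specify how the embedding projection is learned; a common minimal realization of this kind of cross-lingual mapping (in the spirit of Mikolov-style linear maps) is a least-squares linear transform fitted on bilingual dictionary pairs. The sketch below assumes NumPy, and the random vectors stand in for real dictionary entries:

```python
import numpy as np

def learn_projection(tgt_vecs, src_vecs):
    """Fit a linear map W that projects target-language embeddings
    into the source-language space, so that W @ t approximates s
    for each bilingual dictionary pair (t, s).

    tgt_vecs, src_vecs: (n_pairs, dim) arrays of paired embeddings.
    """
    # Solve min_X ||tgt_vecs @ X - src_vecs||_F^2, then W = X.T
    X, *_ = np.linalg.lstsq(tgt_vecs, src_vecs, rcond=None)
    return X.T

# Toy usage: random vectors stand in for a real bilingual dictionary.
rng = np.random.default_rng(0)
tgt = rng.normal(size=(1000, 300))
src = rng.normal(size=(1000, 300))
W = learn_projection(tgt, src)
projected = W @ tgt[0]  # now lives in the source space, so the
                        # source-language NER model applies without re-training
```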

    Learning Character-level Compositionality with Visual Features

    Previous work has modeled the compositionality of words by creating character-level models of meaning, reducing problems of sparsity for rare words. However, in many writing systems compositionality has an effect even at the character level: the meaning of a character is derived from the sum of its parts. In this paper, we model this effect by creating embeddings for characters based on their visual characteristics: we create an image for each character and run it through a convolutional neural network to produce a visual character embedding. Experiments on a text classification task demonstrate that such a model allows for better processing of instances with rare characters in languages such as Chinese, Japanese, and Korean. Additionally, qualitative analyses demonstrate that our proposed model learns to focus on the parts of characters that carry semantic content, resulting in embeddings that are coherent in visual space. Comment: Accepted to ACL 2017
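    The pipeline is straightforward to prototype. The sketch below (PyTorch and Pillow; the 36x36 glyph size, the font path, and the layer shapes are illustrative assumptions, not the paper's configuration) rasterizes a character and encodes the image into an embedding:

```python
import torch
import torch.nn as nn
from PIL import Image, ImageDraw, ImageFont

def render_char(ch, size=36, font_path="NotoSansCJK-Regular.ttc"):
    """Rasterize one character to a (1, 1, size, size) grayscale tensor.
    The font path is a placeholder; any CJK-capable font works."""
    img = Image.new("L", (size, size), color=255)
    ImageDraw.Draw(img).text((2, 2), ch, fill=0,
                             font=ImageFont.truetype(font_path, size - 4))
    pixels = torch.tensor(list(img.getdata()), dtype=torch.float32)
    return pixels.view(1, 1, size, size) / 255.0

class VisualCharEncoder(nn.Module):
    """Small CNN that maps a character image to a visual embedding."""
    def __init__(self, embed_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.fc = nn.Linear(64 * 9 * 9, embed_dim)  # 36 -> 18 -> 9 after pooling

    def forward(self, img):
        return self.fc(self.conv(img).flatten(1))

encoder = VisualCharEncoder()
emb = encoder(render_char("語"))  # (1, 128) visual character embedding
```

    Because the embedding is computed from the glyph rather than looked up in a table, visually similar rare characters land near their common relatives, which is the effect the paper exploits.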

    A Syllable-based Technique for Word Embeddings of Korean Words

    Word embedding has become a fundamental component of many NLP tasks such as named entity recognition and machine translation. However, popular models that learn such embeddings are unaware of the morphology of words, so they are not directly applicable to highly agglutinative languages such as Korean. We propose a syllable-based learning model for Korean using a convolutional neural network, in which a word representation is composed of trained syllable vectors. Our model successfully produces morphologically meaningful representations of Korean words compared to the original Skip-gram embeddings. The results also show that it is quite robust to the out-of-vocabulary problem. Comment: 5 pages, 3 figures, 1 table. Accepted for the EMNLP 2017 Workshop: The 1st Workshop on Subword and Character Level Models in NLP (SCLeM)
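    A minimal sketch of the composition step, assuming PyTorch (the hyperparameters are illustrative, not the paper's): precomposed Hangul syllables occupy the contiguous Unicode block U+AC00..U+D7A3, so a syllable index is just a code-point offset, and a word vector can be composed from trained syllable vectors with a 1-D convolution:

```python
import torch
import torch.nn as nn

class SyllableCNN(nn.Module):
    """Compose a Korean word vector from trained syllable vectors
    with a 1-D convolution and max-over-time pooling."""
    def __init__(self, n_syllables=11172, syl_dim=64, word_dim=128):
        super().__init__()
        self.syl_emb = nn.Embedding(n_syllables, syl_dim)
        self.conv = nn.Conv1d(syl_dim, word_dim, kernel_size=3, padding=1)

    def forward(self, syllable_ids):                       # (batch, n_syl)
        x = self.syl_emb(syllable_ids).transpose(1, 2)     # (batch, syl_dim, n_syl)
        return torch.relu(self.conv(x)).max(dim=2).values  # (batch, word_dim)

def syllable_ids(word):
    """Map each precomposed Hangul syllable to its offset from U+AC00."""
    return torch.tensor([[ord(ch) - 0xAC00 for ch in word]])

model = SyllableCNN()
vec = model(syllable_ids("먹었다"))  # (1, 128); an unseen word still gets a
                                     # vector, which is what helps with OOV
```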

    A Sub-Character Architecture for Korean Language Processing

    We introduce a novel sub-character architecture that exploits a unique compositional structure of the Korean language. Our method decomposes each character into a small set of primitive phonetic units called jamo letters, from which character- and word-level representations are induced. The jamo letters divulge syntactic and semantic information that is difficult to access with conventional character-level units. They greatly alleviate the data sparsity problem, reducing the observation space to 1.6% of the original while increasing accuracy in our experiments. We apply our architecture to dependency parsing and achieve dramatic improvement over strong lexical baselines. Comment: EMNLP 2017
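    The decomposition itself is mechanical: Unicode encodes each precomposed syllable in U+AC00..U+D7A3 as 0xAC00 + (initial * 21 + medial) * 28 + final. A minimal, self-contained sketch (independent of the paper's code):

```python
# Jamo alphabets in Unicode order: 19 initials, 21 medials,
# and 28 finals (index 0 means "no final consonant").
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
MEDIALS = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

def to_jamo(char):
    """Decompose one precomposed Hangul syllable into its jamo letters."""
    code = ord(char) - 0xAC00
    if not 0 <= code < 11172:      # 19 * 21 * 28 syllables in the block
        return char                # pass non-Hangul input through
    initial, rest = divmod(code, 21 * 28)
    medial, final = divmod(rest, 28)
    return INITIALS[initial] + MEDIALS[medial] + FINALS[final]

print(to_jamo("한"))                          # ㅎㅏㄴ
print("".join(to_jamo(c) for c in "한국어"))  # ㅎㅏㄴㄱㅜㄱㅇㅓ
```

    Collapsing the 11,172 possible syllables onto a few dozen jamo types is the source of the sparsity reduction the abstract reports.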

    ๊ฐœ์ฒด๋ช… ์ธ์‹์„ ์œ„ํ•œ ์กฐ์ •ํ•˜๋Š” ํ‘œ์‹œ๋ฒ•์„ ๊ณ ๋ คํ•˜๋Š” ๋‰ด๋Ÿด ๋ชจ๋ธ

    ํ•™์œ„๋…ผ๋ฌธ (์„์‚ฌ)-- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€, 2019. 2. ๊น€ํƒœํ™˜.๊ฐœ์ฒด๋ช… ์ธ์‹ (NER) ์€ ์ž์—ฐ์–ธ์–ด์ฒ˜๋ฆฌ ์ž„๋ฌด๋“ค ์ค‘ ์ค‘์š”ํ•œ ์ž„๋ฌด์ž…๋‹ˆ๋‹ค. ์ด ๋ฌธ์ œ์— ๋Œ€ํ•ด ๊ธฐ์กด ๊ธฐ์ˆ ์€ ์–‘๋ฐฉํ–ฅ ์ˆœํ™˜์‹ ๊ฒฝ๋ง (BiRNN) ๊ณผ ์กฐ๊ฑด๋ถ€ ๋ฌด์ž‘ ์œ„์žฅ (CRF) ๋ฅผ ํ™œ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์€ ๊ธฐ๊ณ„๋ฒˆ์—ญ ๋ถ„์•ผ์—์„œ ๋‚˜์˜จ attention์ด๋ž€ ์ปจ์…‰ํŠธ์—๊ฒŒ์„œ ์˜๊ฐ์„ ๋ฐ›์œผ๋ฉฐ ๋ชจ๋ธ์„ ์ด๋ฃจ์—ˆ์Šต๋‹ˆ๋‹ค. ์ด ๋ชจ๋ธ์€ ํŠธ๋ ˆ์ด๋‹ ํ•  ๋•Œ ๋™์ ์œผ๋กœ ํ•œ ๋‹จ์–ด์˜ character-level ํ‘œ์‹œ๋ฒ•๊ณผ ๋‹จ์–ด ์ž„๋ฒ ๋”ฉ์˜ ์›จ์ดํŠธ๋“ค์„ ๊ฒฐ์ •ํ•˜๋ฏ€๋กœ ๋ชจ๋ธ์˜ ํšจ๊ณผ๋ฅผ ์ฆ๊ฐ€์‹œํ‚ต๋‹ˆ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์€ ๋‹ค์–ธ์–ด ๋ฐ์ดํ„ฐ์…‹ (์˜์–ด, ์ŠคํŽ˜์ธ์–ด, ๋„ค๋œ๋ž€๋“œ์–ด) ์—์„œ ์‹คํ—˜์„ ์ง„ํ–‰ํ•˜๊ณ  F1 ์ ์ˆ˜์˜ ๋น„๊ต๋ฅผ ํ†ตํ•ด์„œ ๋‹ค๋ฅธ ์ตœ์‹  ์—ฐ๊ตฌ๋ณด๋‹ค ์ •ํ™•๋„๊ฐ€ ๋†’์•„์กŒ์Šต๋‹ˆ๋‹ค. ๋˜ํ•œ, ๋…ผ๋ฌธ์€ ๋‹ค์–‘ํ•œ ๋ชจ๋ธ ๋ฐฐ์น˜ ๋ฐฉ์•ˆ์„ ๋ถ„์„ํ•ด์„œ hidden layer์ˆ˜์™€ ๋‹จ์–ด ์ž„๋ฒ ๋”ฉ์ด ์ด ๋ชจ๋ธ์—๊ฒŒ ์ฃผ๋Š” ์˜ํ–ฅ, ๋ชจ๋ธ์˜ ์‹คํ–‰ ์‹œ๊ฐ„๊ณผ ํšจ์œจ๋„ ํ† ๋ก ํ–ˆ์Šต๋‹ˆ๋‹ค.Sequence tagging is an important task in Natural Language Processing (NLP), in which the Named Entity Recognition (NER) is the key issue. So far the most widely adopted model for NER in NLP is that of combining the neural network of bidirectional long short-term memory (BiLSTM) and the statistical sequence prediction method of Conditional Random Field (CRF). In this work, we improve the prediction accuracy of the BiLSTM model by supporting an aligned character and word-level representation mechanism. We have performed experiments on multilingual (English, Spanish and Dutch) datasets and confirmed that our proposed model out-performed the existing state-of-the-art models.1 Introduction 1.1 Study Background 1.2 Purpose of Research 2 The Proposed Model 2.1 Character-level BiLSTM 2.2 Attention Mechanism 2. 2.1 The concept of attention 2.2.2 Word embedding 2.2.3 Our application 2.3 Word-level BiLSTM-CRF 2.3.1 LSTM with Conditional Random Field 2.3.2 Highway layer 3 Experiment 3.1 datasets 3.2 Training 3.3 Performance 3.3.1 Evaluation criterion 3.3.2 NER results 3.3.3 Other results 4 ConclusionMaste
    • โ€ฆ
    corecore