63 research outputs found

    Statistical parsing of morphologically rich languages (SPMRL): what, how and whither

    The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at the word level. There is ample evidence that applying readily available statistical parsing models to such languages leads to serious performance degradation. The first workshop on statistical parsing of MRLs hosted a variety of contributions showing that, despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state of affairs in parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests directions for future investigation.

    ๊ตฌ๋ฌธ๋ก ์„ ํ™œ์šฉํ•œ ์‹ ๊ฒฝ๋ง ๊ธฐ๋ฐ˜ ๋ฌธ์žฅ ํ‘œํ˜„์˜ ํ•™์Šต ๋ฐ ๋ถ„์„

    Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School, College of Engineering, Department of Computer Science and Engineering, August 2021. Taeuk Kim.
    Syntax is a theory in linguistics that deals with the principles underlying the composition of sentences. As this theoretical framework provides formal instructions regarding the procedure of constructing a sentence from its constituents, it has been considered a valuable reference in sentence representation learning, whose objective is to discover an approach for transforming a sentence into a vector that captures its meaning in a computationally tractable manner. This dissertation provides two particular perspectives on harmonizing syntax with neural sentence representation models, especially focusing on constituency grammar. We first propose two methods for enriching the quality of sentence embeddings by exploiting syntactic knowledge either represented as explicit parse trees or implicitly stored in neural models. Second, we regard syntactic formalism as a lens through which we reveal the inner workings of pre-trained language models, which are the state of the art in sentence representation learning. With a series of demonstrations in practical scenarios, we show that syntax is useful even in the neural era, where models trained on huge corpora in an end-to-end manner are prevalent, functioning as either (i) a source of inductive biases that facilitate fast and effective learning of such models or (ii) an analytic tool that increases the interpretability of black-box models.
    Table of contents:
    Chapter 1 Introduction: Dissertation Outline; Related Publications
    Chapter 2 Background: Introduction to Syntax; Neural Networks for Sentence Representations (Recursive Neural Network, Transformer, Pre-trained Language Models); Related Literature (Sentence Representation Learning, Probing Methods for Neural NLP Models, Grammar Induction and Unsupervised Parsing)
    Chapter 3 Sentence Representation Learning with Explicit Syntactic Structure: Introduction; Related Work; Method (Tree-LSTM, Structure-aware Tag Representation, Leaf-LSTM, SATA Tree-LSTM); Experiments (General Configurations, Sentence Classification Tasks, Natural Language Inference); Analysis (Ablation Study, Representation Visualization); Limitations and Future Work; Summary
    Chapter 4 Sentence Representation Learning with Implicit Syntactic Knowledge: Introduction; Related Work; Method (Contrastive Learning with Self-Guidance, Learning Objective Optimization); Experiments (General Configurations, Semantic Textual Similarity Tasks, Multilingual STS Tasks, SentEval Benchmark); Analysis (Ablation Study, Robustness to Domain Shifts, Computational Efficiency, Representation Visualization); Limitations and Future Work; Summary
    Chapter 5 Syntactic Analysis of Sentence Representation Models: Introduction; Related Work; Motivation; Method (CPE-PLM, Top-down CPE-PLM, Pre-trained Language Models, Distance Measure Functions, Injecting Bias into Syntactic Distances); Experiments (General Configurations, Experimental Results on PTB, Experimental Results on MNLI); Analysis (Performance Comparison by Layer, Estimating the Upper Limit of Distance Measure Functions, Constituency Tree Examples); Summary
    Chapter 6 Multilingual Syntactic Analysis with Enhanced Techniques: Introduction; Related Work; Method (Chart-based CPE-PLM, Top-K Ensemble for CPE-PLM); Experiments (General Configurations, Experiments on Monolingual Settings, Experiments on Multilingual Settings); Analysis (Factor Correlation Analysis, Visualization of Attention Heads, Recall Scores on Noun and Verb Phrases); Limitations and Future Work; Summary
    Chapter 7 Conclusion
    Bibliography; Abstract (in Korean)
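    The contents above mention top-down CPE-PLM (constituency parse extraction from pre-trained language models) and syntactic distance measures. The sketch below is a minimal, hypothetical illustration of the general top-down idea, not the dissertation's actual implementation: given dissimilarity scores between adjacent words (in CPE-PLM-style work these would be derived from a pre-trained model's attention maps or hidden states), an unlabeled binary constituency tree can be induced by recursively splitting each span at its largest gap. The distance values here are hand-picked purely for illustration.

```python
# Minimal sketch of top-down tree induction from "syntactic distances":
# dists[i] is an assumed dissimilarity score between adjacent words i and
# i+1. We recursively split each span at the largest distance, yielding an
# unlabeled binary constituency tree as nested tuples.

def build_tree(words, dists):
    """Recursively split `words` at the largest adjacent distance."""
    if len(words) == 1:
        return words[0]
    # Index of the largest gap between adjacent words in this span.
    split = max(range(len(dists)), key=dists.__getitem__)
    left = build_tree(words[: split + 1], dists[:split])
    right = build_tree(words[split + 1 :], dists[split + 1 :])
    return (left, right)

# Toy example with hand-picked distances (purely illustrative):
words = ["the", "cat", "sat", "on", "the", "mat"]
dists = [0.2, 0.9, 0.6, 0.3, 0.1]  # biggest gap after "cat" -> split there
print(build_tree(words, dists))
# (('the', 'cat'), ('sat', ('on', ('the', 'mat'))))
```

    The appeal of this family of methods is that the tree is read off an already-trained model with no parsing supervision, which is what makes it usable as an analysis tool for otherwise black-box sentence representation models.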

    Representation and parsing of multiword expressions

    This book consists of contributions related to the definition, representation and parsing of MWEs. These reflect current trends in the representation and processing of MWEs. They cover various categories of MWEs such as verbal, adverbial and nominal MWEs; various linguistic frameworks (e.g. tree-based and unification-based grammars); various languages (including English, French, Modern Greek, Hebrew and Norwegian); and various applications (namely MWE detection, parsing and automatic translation), using both symbolic and statistical approaches.

    Current trends

    Deep parsing is the fundamental process aiming at the representation of the syntactic structure of phrases and sentences. In the traditional methodology this process is based on lexicons and grammars that represent, roughly, properties of words and interactions of words and structures in sentences. Several linguistic frameworks, such as Head-driven Phrase Structure Grammar (HPSG), Lexical Functional Grammar (LFG), Tree Adjoining Grammar (TAG) and Combinatory Categorial Grammar (CCG), offer different structures and combining operations for building grammar rules. These already contain mechanisms for expressing properties of Multiword Expressions (MWEs), which, however, need improvement in how they account for the idiosyncrasies of MWEs on the one hand and their similarities to regular structures on the other. This collaborative book constitutes a survey of various attempts at representing and parsing MWEs in the context of linguistic theories and applications.
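    As a concrete illustration of the idiosyncrasy-vs-regularity tension described above, the toy grammar below encodes the idiom "kick the bucket" twice: once as a frozen, words-with-spaces verb-phrase rule, and once via ordinary compositional rules. This is only a hedged sketch using NLTK's generic CFG utilities, not a mechanism from HPSG, LFG, TAG or CCG; the grammar and sentence are invented for the example.

```python
# Toy CFG contrasting an idiomatic (fixed) MWE rule with ordinary
# compositional rules. Requires NLTK: pip install nltk
import nltk

grammar = nltk.CFG.fromstring("""
S   -> NP VP
NP  -> Det N | 'Sam'
VP  -> V NP
VP  -> 'kicked' 'the' 'bucket'
Det -> 'the'
N   -> 'bucket' | 'ball'
V   -> 'kicked'
""")

parser = nltk.ChartParser(grammar)
sentence = "Sam kicked the bucket".split()

# Two parses come out: a flat idiomatic VP (the frozen MWE rule) and a
# compositional one (V + NP, the literal reading). A realistic grammar
# must license the first without losing the regularities of the second.
for tree in parser.parse(sentence):
    tree.pretty_print()
```

    The words-with-spaces rule captures the idiom's fixedness but blocks regular variation (passivization, modification), which is precisely the representational trade-off the surveyed frameworks try to resolve.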

    Empirical studies on word representations

    One of the most fundamental tasks in natural language processing is representing words with mathematical objects such as vectors. These word representations, which are most often estimated from data, capture the meaning of words: they enable comparing words according to their semantic similarity, and have been shown to work extremely well when included in complex real-world applications. A large part of our work deals with ways of estimating word representations directly from large quantities of text. Our methods exploit the idea that words which occur in similar contexts have similar meanings. How we define the context is an important focus of our thesis: the context can consist of a number of words to the left and to the right of the word in question, but, as we show, obtaining context words via syntactic links (such as the link between a verb and its subject) often works better. We furthermore investigate word representations that accurately capture multiple meanings of a single word, and show that the translation of a word in context contains information that can be used to disambiguate the meaning of that word.
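    To make the distributional idea above concrete, here is a minimal, self-contained sketch (the corpus and window size are invented for illustration): it builds count-based context vectors from a fixed-size window and compares words by cosine similarity. A syntax-based variant, of the kind the abstract favors, would simply replace the window with syntactically linked words (e.g., a verb and its subject).

```python
# Minimal distributional-semantics sketch: count-based context vectors from
# a +/-2-word window, compared with cosine similarity. The toy corpus is
# invented; real systems use large corpora plus dimensionality reduction
# or neural estimation.
import math
from collections import Counter, defaultdict

corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
    "a car needs fuel".split(),
]

WINDOW = 2
vectors = defaultdict(Counter)  # word -> Counter of co-occurring context words

for sentence in corpus:
    for i, word in enumerate(sentence):
        lo, hi = max(0, i - WINDOW), min(len(sentence), i + WINDOW + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][sentence[j]] += 1

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# "cat" and "dog" share contexts ("the", "drinks", ...), so they come out
# more similar to each other than either is to "car".
print(cosine(vectors["cat"], vectors["dog"]))   # high (~0.87)
print(cosine(vectors["cat"], vectors["car"]))   # 0.0, no shared contexts
```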

    Variation and learnability in constraints on A-bar movement

    A classic problem in linguistics is explaining how learners come to know so much about their native languages despite receiving limited and noisy input. This learning problem becomes especially acute when the linguistic properties in question are obscure and show subtle variation across languages. Cross-linguistic variation means that learners must identify the appropriate points of variation for their language, even though the direct evidence they need is often hard to detect or even non-existent. This dissertation presents two case studies of constraints on A-bar movement. Because constraints are by nature abstract and difficult to observe directly, a classic solution to the learning problem they pose claims that knowledge of these abstract or negative linguistic properties is innate. However, a number of these constraints show cross-linguistic variation, raising questions about how they are represented and how linguistic experience might (or might not) shape linguistic knowledge. The first case study, discussed in chapters 2 and 3, involves cross-linguistic variation in the constraint that governs A-bar movement from relative clauses: some, but not all, languages allow A-bar movement from relative clauses under exceptional circumstances. I argue that these "porous" relative clauses that permit A-bar movement can be distinguished by a property that I call "tense dependence," and discuss how this tense property might be formally related to A-bar movement. I show that this particular kind of variation presents a learning problem: in languages like English and Mandarin Chinese, learners have little direct positive evidence that such A-bar movement is possible. Using tense dependence, I propose that learners might circumvent this absence of direct evidence by relying on A-bar movement from a superficially unrelated structure: non-finite purposive clauses. The second case study, discussed in chapter 4, involves bridge verbs: within a given language, some verbs allow A-bar movement and others do not; in addition, the set of verbs that allow A-bar movement varies across languages. I present an acceptability judgment experiment aimed at clarifying existing generalizations about bridge verbs in English. With more secure generalizations in hand, I discuss the extent to which bridge effects have a pragmatic origin, drawing on data from an informal survey of English and Dutch native speakers that looks at the effect of context on long-distance A-bar movement. Echoing existing work, the survey shows what appears to be cross-linguistic variation between English and some Dutch varieties for cognitive factive verbs. To account for this instance of cross-linguistic variation, I suggest that English learners might have limited access to direct evidence, and discuss what learning mechanisms a learner would need to draw the language-appropriate conclusions from sparse evidence. Chapter 5 discusses the consequences these case studies have for formal accounts of these constraints: I evaluate existing proposals and argue that the range of variation observed requires more flexibility than many existing proposals can offer. Chapter 6 concludes.

    Unsupervised Induction of Frame-Based Linguistic Forms

    This thesis studies the use of bulk, structured linguistic annotations to perform unsupervised induction of meaning for three kinds of linguistic forms: words, sentences, and documents. The primary linguistic annotations I consider throughout this thesis are frames, which encode core linguistic, background, or societal knowledge necessary to understand abstract concepts and real-world situations. I begin with an overview of linguistically based structured meaning representation; I then analyze available large-scale natural language processing (NLP) and linguistic resources and corpora for their ability to accommodate bulk, automatically obtained frame annotations. I then proceed to induce meanings of the different forms, progressing from the word level, to the sentence level, and finally to the document level. I first show how to use these bulk annotations to better encode semantic expectations, backed by linguistics and cognitive science, within word forms. I then demonstrate a straightforward approach for learning large lexicalized and refined syntactic fragments, which encode and memoize commonly used phrases and linguistic constructions. Next, I consider two unsupervised models for document and discourse understanding: one is a purely generative approach that naturally accommodates layered annotations and is the first to capture and unify a complete frame hierarchy; the other conditions on limited amounts of external annotations, imputing missing values when necessary, and can more readily scale to large corpora. These discourse models help improve both document-level and type-level understanding.