
    Proceedings of the COLING 2004 Post-Conference Workshop on Multilingual Linguistic Resources (MLR2004)

    In an ever-expanding information society, most information systems now face the "multilingual challenge". Multilingual language resources play an essential role in modern information systems. Such resources need to provide information on many languages in a common framework and should be (re)usable in many applications (for automatic or human use). Many centres have been involved in national and international projects dedicated to building harmonised language resources and creating expertise in the maintenance and further development of standardised linguistic data. These resources include dictionaries, lexicons, thesauri, wordnets, and annotated corpora developed along the lines of best practices and recommendations. However, since the late 1990s, most efforts in scaling up these resources have remained the responsibility of local authorities, usually with very little funding (if any) and few opportunities for academic recognition of this work. It is therefore not surprising that many resource holders and developers have become reluctant to give free access to the latest versions of their resources, and the actual status of these resources is currently rather unclear. The goal of this workshop is to study the problems involved in the development, management and reuse of lexical resources in a multilingual context. The workshop also provides a forum for reviewing the present state of language resources, and is meant to bring to the international community qualitative and quantitative information about the most recent developments in the area of linguistic resources and their use in applications. The impressive number of submissions to this workshop (38), and to other workshops and conferences dedicated to similar topics, shows that multilingual linguistic resources have become a central concern in the Natural Language Processing community.
To cope with the number of submissions, the workshop organising committee decided to accept 16 papers from 10 countries, based on the reviewers' recommendations. Six of these papers will be presented in a poster session. The papers constitute a representative selection of current trends in research on Multilingual Language Resources, such as multilingual aligned corpora, bilingual and multilingual lexicons, and multilingual speech resources. They also represent a characteristic set of approaches to the development of multilingual language resources, such as automatic extraction of information from corpora, combination and reuse of existing resources, online collaborative development of multilingual lexicons, and use of the Web as a multilingual language resource. The development and management of multilingual language resources is a long-term activity in which collaboration among researchers is essential. We hope that this workshop will gather many researchers involved in such developments and give them the opportunity to discuss, exchange and compare their approaches and to strengthen their collaborations in the field. The organisation of this workshop would have been impossible without the hard work of the programme committee, who managed to provide accurate reviews on time, on a rather tight schedule. We would also like to thank the COLING 2004 organising committee for making this workshop possible. Finally, we hope that this workshop will yield fruitful results for all participants.

    CLiFF Notes: Research In Natural Language Processing at the University of Pennsylvania

    The Computational Linguistics Feedback Forum (CLiFF) is a group of students and faculty who gather once a week to discuss the members' current research. As the word "feedback" suggests, the group's purpose is the sharing of ideas. The group also promotes interdisciplinary contacts between researchers who share an interest in Cognitive Science. There is no single theme describing the research in Natural Language Processing at Penn: there is work on CCG, Tree-Adjoining Grammars, intonation, statistical methods, plan inference, instruction understanding, incremental interpretation, language acquisition, syntactic parsing, causal reasoning, free word order languages, and many other areas. With this in mind, rather than trying to summarize the varied work currently underway here at Penn, we suggest reading the following abstracts to see how the students and faculty themselves describe their work. Their abstracts illustrate the diversity of interests among the researchers, explain the areas of common interest, and describe some very interesting work in Cognitive Science. This report is a collection of abstracts from both faculty and graduate students in Computer Science, Psychology and Linguistics. We pride ourselves on the close working relations between these groups, as we believe that the communication among the different departments and the ongoing inter-departmental research not only improve the quality of our work but make much of that work possible.

    Graph-based broad-coverage semantic parsing

    Many broad-coverage meaning representations can be characterized as directed graphs, where nodes represent semantic concepts and directed edges represent semantic relations among the concepts. The task of semantic parsing is to generate such a meaning representation from a sentence. It is quite natural to adopt a graph-based approach to parsing, where nodes are identified conditioning on the individual words and edges are labeled conditioning on pairs of nodes. However, there are two issues with applying this simple and interpretable graph-based approach to semantic parsing. First, the anchoring of nodes to words can be implicit and non-injective in several formalisms (Oepen et al., 2019, 2020): we do not know which nodes should be generated from which individual words, or how many of them, which makes a probabilistic formulation of the training objective problematic. Second, graph-based parsers typically predict edge labels independently of each other. Such an independence assumption, while sensible from an algorithmic point of view, limits the expressiveness of the statistical model and might fail to capture the true distribution of semantic graphs. In this thesis, instead of a pipeline approach to obtaining the anchoring, we propose to model the implicit anchoring as a latent variable in a probabilistic model. We induce this latent variable jointly with the graph-based parser in end-to-end differentiable training. In particular, we test our method on Abstract Meaning Representation (AMR) parsing (Banarescu et al., 2013). AMR represents sentence meaning with a directed acyclic graph, where the anchoring of nodes to words is implicit and can be many-to-one. Initially, we propose a rule-based system that circumvents the many-to-one anchoring by combining nodes in certain pre-specified subgraphs of AMR and treats the alignment as a latent variable.
Next, we remove the need for such a rule-based system by treating both graph segmentation and alignment as latent variables. Our graph-based parsers are parameterized by neural modules that require gradient-based optimization, so training them with discrete latent variables is challenging. By combining deep variational inference with differentiable sampling, our models can be trained end-to-end. To overcome the limitation of graph-based parsing and capture interdependencies in the output, we further adopt iterative refinement: starting from an output whose parts are predicted independently, we iteratively refine it conditioning on the previous prediction. We test this method on semantic role labeling (Gildea and Jurafsky, 2000), the task of predicting predicate-argument structure, in which the semantic roles between a predicate and its arguments need to be labeled and are interdependent. Overall, our refinement strategy results in an effective model that outperforms strong factorized baseline models.
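The iterative refinement idea can be illustrated with a toy sketch. The scorer below is hypothetical (in the thesis the scores come from trained neural modules), and it simply penalises assigning the same role twice, mimicking inter-role dependency: each role is first predicted independently, then repeatedly re-predicted conditioning on the current assignment of the other roles until the output stops changing.

```python
# Toy sketch of iterative refinement for role labeling.
# The scorer is a hypothetical stand-in for a neural scoring module.
ROLES = ["ARG0", "ARG1", "ARG2"]

def score(arg_index, role, context):
    # Hypothetical score: penalise roles already used by other arguments,
    # a crude proxy for the interdependence of semantic roles.
    penalty = sum(1 for j, r in enumerate(context)
                  if j != arg_index and r == role)
    return -penalty

def refine(n_args, n_iters=10):
    # Step 0: independent prediction (no context yet, so every argument
    # gets the first role); then refine each role given the others.
    labels = [ROLES[0]] * n_args
    for _ in range(n_iters):
        prev = list(labels)
        for i in range(n_args):
            labels[i] = max(ROLES, key=lambda r: score(i, r, labels))
        if labels == prev:  # converged: output no longer changes
            break
    return labels
```

With three arguments, the independently predicted output assigns the same role everywhere; one refinement pass already produces a consistent, all-distinct assignment, which the next pass confirms as a fixed point.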

    Multi-task Learning for Japanese Predicate Argument Structure Analysis

    Predicate argument structure analysis (PASA) is the task of analysing the semantic structure between a predicate in a text and its arguments. Organising a sentence into "who" did "what" "to whom" is useful for applications that require the analysis of complex text, such as machine translation and recognising textual entailment. PASA targets predicates such as verbs and adjectives and analyses their arguments, but many nouns in a sentence also take arguments: for example, verbal (sahen) nouns such as 報告 ("report") and nouns derived from verbs such as 救い ("rescue"). Such nouns are called event-nouns. In the NAIST Text Corpus, both predicates and event-nouns take arguments in the obligatory cases (nominative ga, accusative wo, dative ni). Moreover, according to Iida et al. (2006), predicates almost never take an argument within the same bunsetsu (base phrase), whereas for event-nouns about half of the accusative and dative arguments appear within the same bunsetsu. Event-noun argument structure analysis (ENASA) and PASA are therefore closely related but distinct tasks. Prior work has studied machine-learning approaches to PASA extensively, but almost all of it analyses the arguments of predicates only, and few studies target event-nouns. Argument structure analysis restricted to predicates is insufficient for organising the semantic structure of a text and analysing its discourse correctly. This thesis therefore proposes a model that learns PASA and ENASA jointly through multi-task learning. Multi-task learning has been applied in many areas of natural language processing, such as recognising textual entailment and semantic role labelling, with reported accuracy gains. Solving several tasks simultaneously effectively increases the training data and makes learning robust to the noise contained in each task, and acquiring knowledge shared across tasks helps the model generalise. The NAIST Text Corpus is small compared with the large datasets used in machine translation, and the number of event-noun instances in it is about one third of the number of predicate instances, so multi-task learning can be expected to be especially effective for ENASA. The proposed model is based on an end-to-end multi-layer bidirectional recurrent neural network (RNN) and distinguishes task-shared from task-specific knowledge in the input, RNN and output layers. In the input layer, task-specific word embeddings distinguish predicates from event-nouns that share a surface form but occur in different contexts. In the RNN layer, task-specific RNNs are stacked on top of a task-shared RNN, so that the shared RNN learns task-shared representations while the task-specific RNNs adapt them to each task. In the output layer, separate shared and task-specific layers allow the model to account for the different positions in which predicate and event-noun arguments appear. Experiments on the NAIST Text Corpus, the standard benchmark for Japanese PASA, show that the proposed model improves the overall F1 score by 0.29 points over a single-task baseline and outperforms the previous state-of-the-art model for intra-sentential PASA by 0.67 points. The contributions of this thesis are threefold: 1. it is the first to apply multi-task learning to PASA and ENASA, showing accuracy improvements on both tasks; 2. by adding syntactic information as features, a simple model that does not consider interactions among multiple predicate-argument relations achieves state-of-the-art accuracy in intra-sentential PASA; 3. it is the first to apply deep learning to ENASA. The thesis is organised as follows: Chapter 1 gives an overview of the research; Chapter 2 reviews related work on Japanese PASA and semantic role labelling; Chapter 3 describes the task setting, the end-to-end model and its features; Chapter 4 presents the proposed multi-task learning model for PASA and ENASA; Chapter 5 describes the dataset, experimental settings and results; Chapter 6 compares the outputs of the baseline and proposed models and analyses the effect of each model; Chapter 7 concludes and discusses future work. An event-noun is a noun that has an argument structure similar to a predicate.
Recent works, including those considered state-of-the-art, ignore event-nouns or build a single model for solving both Japanese predicate argument structure analysis (PASA) and event-noun argument structure analysis (ENASA). However, because there are interactions between predicates and event-nouns, it is not sufficient to target only predicates. To address this problem, we present a multi-task learning method for PASA and ENASA. Our multi-task models improved the performance of both tasks compared to a single-task model by sharing knowledge between the tasks. Moreover, in PASA, our models achieved state-of-the-art results in overall F1 scores on the NAIST Text Corpus. In addition, this is the first work to employ neural networks in ENASA. (Tokyo Metropolitan University, 2019-03-25, Master of Engineering)
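The hard-parameter-sharing scheme described above can be sketched as a training loop in which the shared encoder is updated on every mini-batch while each task-specific head is updated only on batches of its own task. This is a structural sketch, not the thesis implementation: the parameters are scalar stand-ins for neural modules, the update is a stand-in for an optimiser step, and only the roughly 3:1 predicate-to-event-noun data ratio is taken from the abstract.

```python
import random

# Structural sketch of multi-task training with hard parameter sharing
# for PASA and ENASA. All components are stand-ins for neural modules.
random.seed(0)

params = {"shared": 0.0, "PASA": 0.0, "ENASA": 0.0}  # stand-in parameters
updates = {name: 0 for name in params}

# The NAIST Text Corpus has roughly three predicate instances per
# event-noun instance, mirrored here in the mini-batch mix.
batches = ["PASA"] * 30 + ["ENASA"] * 10
random.shuffle(batches)

for task in batches:
    # A gradient step touches the shared encoder and the head of the
    # current task only; the other task's head is left untouched.
    for name in ("shared", task):
        params[name] -= 0.01  # stand-in for an optimiser update
        updates[name] += 1
```

The shared parameters thus see all 40 batches, which is how the smaller ENASA task benefits from the larger PASA data, while each head sees only its own task's batches.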