545 research outputs found

    An empirical evaluation of AMR parsing for legal documents

    Full text link
    Many approaches have been proposed to tackle the problem of Abstract Meaning Representation (AMR) parsing, helps solving various natural language processing issues recently. In our paper, we provide an overview of different methods in AMR parsing and their performances when analyzing legal documents. We conduct experiments of different AMR parsers on our annotated dataset extracted from the English version of Japanese Civil Code. Our results show the limitations as well as open a room for improvements of current parsing techniques when applying in this complicated domain

    Selecting and Generating Computational Meaning Representations for Short Texts

    Full text link
    Language conveys meaning, so natural language processing (NLP) requires representations of meaning. This work addresses two broad questions: (1) What meaning representation should we use? and (2) How can we transform text to our chosen meaning representation? In the first part, we explore different meaning representations (MRs) of short texts, ranging from surface forms to deep-learning-based models. We show the advantages and disadvantages of a variety of MRs for summarization, paraphrase detection, and clustering. In the second part, we use SQL as a running example for an in-depth look at how we can parse text into our chosen MR. We examine the text-to-SQL problem from three perspectives—methodology, systems, and applications—and show how each contributes to a fuller understanding of the task.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/143967/1/cfdollak_1.pd

    Metrics of Graph-Based Meaning Representations with Applications from Parsing Evaluation to Explainable NLG Evaluation and Semantic Search

    Get PDF
    "Who does what to whom?" The goal of a graph-based meaning representation (in short: MR) is to represent the meaning of a text in a structured format. With an MR, we can explicate the meaning of a text, describe occurring events and entities, and their semantic relations. Thus, a metric of MRs would measure a distance (or similarity) between MRs. We believe that such a meaning-focused similarity measurement can be useful for several important AI tasks, for instance, testing the capability of systems to produce meaningful output (system evaluation), or when searching for similar texts (information retrieval). Moreover, due to the natural explicitness of MRs, we hypothesize that MR metrics could provide us with valuable explainability of their similarity measurement. Indeed, if texts reside in a space where their meaning has been isolated and structured, we might directly see in which aspects two texts are actually similar (or dissimilar). However, we find that there is not much previous work on MR metrics, and thus we lack fundamental knowledge about them and their potential applications. Therefore, we make first steps to explore MR metrics and MR spaces, focusing on two key goals: 1. Develop novel and generally applicable methods for conducting similarity measurements in the space of MRs; 2. Explore potential applications that can profit from similarity assessments in MR spaces, including, but (by far) not limited to, their "classic" purpose of evaluating the quality of a text-to-MR system against a reference (aka parsing evaluation). We start by analyzing contributions from previous works that have proposed MR metrics for parsing evaluation. Then, we move beyond this restricted setup and start to develop novel and more general MR metrics based on i) insights from our analysis of the previous parsing evaluation metrics and ii) our motivation to extend MR metrics to similarity assessment of natural language texts. To empirically evaluate and assess our generalized MR metrics, and to open the door for future improvements, we propose the first benchmark of MR metrics. With our benchmark, we can study MR metrics through the lens of multiple metric-objectives such as sentence similarity and robustness. Then, we investigate novel applications of MR metrics. First, we explore new ways of applying MR metrics to evaluate systems that produce i) text from MRs (MR-to-text evaluation) and ii) MRs from text (MR parsing). We call our new setting MR projection-based, since we presume that one MR (at least) is unobserved and needs to be approximated. An advantage of such projection-based MR metric methods is that we can ablate a costly human reference. Notably, when visiting the MR-to-text scenario, we touch on a much broader application scenario for MR metrics: explainable MR-grounded evaluation of text generation systems. Moving steadily towards the application of MR metrics to general text similarity, we study MR metrics for measuring the meaning similarity of natural language arguments, which is an important task in argument mining, a new and surging area of natural language processing (NLP). In particular, we show that MRs and MR metrics can support an explainable and unsupervised argument similarity analysis and inform us about the quality of argumentative conclusions. Ultimately, we seek even more generality and are also interested in practical aspects such as efficiency. To this aim, we distill our insights from our hitherto explorations into MR metric spaces into an explainable state-of-the-art machine learning model for semantic search, a task for which we would like to achieve high accuracy and great efficiency. To this aim, we develop a controllable metric distillation approach that can explain how the similarity decisions in the neural text embedding space are modulated through interpretable features, while maintaining all efficiency and accuracy (sometimes improving it) of a high-performance neural semantic search method. This is an important contribution, since it shows i) that we can alleviate the efficiency bottleneck of computationally costly MR graph metrics and, vice versa, ii) that MR metrics can help mitigate a crucial limitation of large "black box" neural methods by eliciting explanations for decisions

    Thirty Musts for Meaning Banking

    Full text link
    Meaning banking--creating a semantically annotated corpus for the purpose of semantic parsing or generation--is a challenging task. It is quite simple to come up with a complex meaning representation, but it is hard to design a simple meaning representation that captures many nuances of meaning. This paper lists some lessons learned in nearly ten years of meaning annotation during the development of the Groningen Meaning Bank (Bos et al., 2017) and the Parallel Meaning Bank (Abzianidze et al., 2017). The paper's format is rather unconventional: there is no explicit related work, no methodology section, no results, and no discussion (and the current snippet is not an abstract but actually an introductory preface). Instead, its structure is inspired by work of Traum (2000) and Bender (2013). The list starts with a brief overview of the existing meaning banks (Section 1) and the rest of the items are roughly divided into three groups: corpus collection (Section 2 and 3, annotation methods (Section 4-11), and design of meaning representations (Section 12-30). We hope this overview will give inspiration and guidance in creating improved meaning banks in the future.Comment: https://www.aclweb.org/anthology/W19-3302

    Abstractive news summarization based on event semantic link network

    Get PDF
    This paper studies the abstractive multi-document summarization for event-oriented news texts through event information extraction and abstract representation. Fine-grained event mentions and semantic relations between them are extracted to build a unified and connected event semantic link network, an abstract representation of source texts. A network reduction algorithm is proposed to summarize the most salient and coherent event information. New sentences with good linguistic quality are automatically generated and selected through sentences over-generation and greedy-selection processes. Experimental results on DUC2006 and DUC2007 datasets show that our system significantly outperforms the state-of-the-art extractive and abstractive baselines under both pyramid and ROUGE evaluation metrics

    Text Generation from Discourse Representation Structures

    Get PDF

    Abstract syntax as interlingua: Scaling up the grammatical framework from controlled languages to robust pipelines

    Get PDF
    Syntax is an interlingual representation used in compilers. Grammatical Framework (GF) applies the abstract syntax idea to natural languages. The development of GF started in 1998, first as a tool for controlled language implementations, where it has gained an established position in both academic and commercial projects. GF provides grammar resources for over 40 languages, enabling accurate generation and translation, as well as grammar engineering tools and components for mobile and Web applications. On the research side, the focus in the last ten years has been on scaling up GF to wide-coverage language processing. The concept of abstract syntax offers a unified view on many other approaches: Universal Dependencies, WordNets, FrameNets, Construction Grammars, and Abstract Meaning Representations. This makes it possible for GF to utilize data from the other approaches and to build robust pipelines. In return, GF can contribute to data-driven approaches by methods to transfer resources from one language to others, to augment data by rule-based generation, to check the consistency of hand-annotated corpora, and to pipe analyses into high-precision semantic back ends. This article gives an overview of the use of abstract syntax as interlingua through both established and emerging NLP applications involving GF
    • …
    corecore