7,388 research outputs found

Simple and Effective Text Simplification Using Semantic and Neural Methods

Authors: Abend Omri; Rappoport Ari; Sulem Elior
Publication date: 11/10/2018

Sentence splitting is a major simplification operator. Here we present a simple and efficient splitting algorithm based on an automatic semantic parser. After splitting, the text is amenable to further fine-tuned simplification operations. In particular, we show that neural Machine Translation can be effectively used in this situation. Previous applications of Machine Translation to simplification suffer from a considerable disadvantage in that they are over-conservative, often failing to modify the source in any way. Splitting based on semantic parsing, as proposed here, alleviates this issue. Extensive automatic and human evaluation shows that the proposed method compares favorably to the state of the art in combined lexical and structural simplification.

arXiv.org e-Print Archive

Semantic Structural Evaluation for Text Simplification

Authors: Abend Omri; Rappoport Ari; Sulem Elior
Publication date: 11/10/2018

Current measures for evaluating text simplification systems focus on the lexical aspects of the text and neglect its structural aspects. In this paper we propose the first measure to address structural aspects of text simplification, called SAMSA. It leverages recent advances in semantic parsing to assess simplification quality by decomposing the input based on its semantic structure and comparing it to the output. SAMSA provides a reference-less automatic evaluation procedure, avoiding the problems that reference-based methods face due to the vast space of valid simplifications for a given sentence. Our human evaluation experiments show both SAMSA's substantial correlation with human judgments and the deficiency of existing reference-based measures in evaluating structural simplification.

arXiv.org e-Print Archive

Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks

Authors: Bastings Jasmijn; Marcheggiani Diego; Titov Ivan
Publication date: 20/06/2020

Semantic representations have long been argued to be potentially useful for enforcing meaning preservation and improving generalization performance of machine translation methods. In this work, we are the first to incorporate information about predicate-argument structure of source sentences (namely, semantic-role representations) into neural machine translation. We use Graph Convolutional Networks (GCNs) to inject a semantic bias into sentence encoders and achieve improvements in BLEU scores over the linguistic-agnostic and syntax-aware versions on the English-German language pair.

arXiv.org e-Print Archive

BLEU is Not Suitable for the Evaluation of Text Simplification

Authors: Abend Omri; Rappoport Ari; Sulem Elior
Publication date: 01/01/2018

BLEU is widely considered to be an informative metric for text-to-text generation, including Text Simplification (TS). TS includes both lexical and structural aspects. In this paper we show that BLEU is not suitable for the evaluation of sentence splitting, the major structural simplification operation. We manually compiled a sentence splitting gold standard corpus containing multiple structural paraphrases, and performed a correlation analysis with human judgments. We find low or no correlation between BLEU and the grammaticality and meaning preservation parameters where sentence splitting is involved. Moreover, BLEU often negatively correlates with simplicity, essentially penalizing simpler sentences. Comment: Accepted to EMNLP 2018 (Short papers)
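To make the intuition behind this abstract concrete, here is a minimal Python sketch of clipped (modified) n-gram precision, which is one ingredient of BLEU and not the full metric (no brevity penalty, no geometric mean over n-gram orders). The example sentences and the function names `ngrams` and `modified_precision` are assumptions made for illustration; they are not taken from the paper or its corpus.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against a single reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs
    # in the reference.
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


# Invented example: the reference keeps one long sentence.
reference = "john wrote the book and mary read it".split()
# Candidate 1 copies the source without simplifying anything.
unchanged = "john wrote the book and mary read it".split()
# Candidate 2 splits the sentence in two.
split = "john wrote the book . mary read it .".split()

print(modified_precision(unchanged, reference, 2))  # 1.0
print(modified_precision(split, reference, 2))      # 0.625
```

In this toy setting the split output loses bigram matches at the new sentence boundary, so the overlap score drops even though the split version is structurally simpler. With references that themselves contain splits the numbers would differ, which is why this is only an illustration of the effect and not a reproduction of the paper's analysis.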

arXiv.org e-Print Archive

SemEval 2019 Shared Task: Cross-lingual Semantic Parsing with UCCA - Call for Participation

Authors: Abend Omri; Aizenbud Zohar; Choshen Leshem; Hershcovich Daniel; Rappoport Ari; Sulem Elior
Publication date: 07/10/2020

We announce a shared task on UCCA parsing in English, German and French, and call for participants to submit their systems. UCCA is a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. UCCA poses a challenge for existing parsing techniques, as it exhibits reentrancy (resulting in DAG structures), discontinuous structures and non-terminal nodes corresponding to complex semantic units. Given the success of recent semantic parsing shared tasks (on SDP and AMR), we expect the task to have a significant contribution to the advancement of UCCA parsing in particular, and semantic parsing in general. Furthermore, existing applications for semantic evaluation that are based on UCCA will greatly benefit from better automatic methods for UCCA parsing. The competition website is https://competitions.codalab.org/competitions/19160. Comment: Superseded by the actual shared task description paper at arXiv:1903.0295

arXiv.org e-Print Archive

Verified AIG Algorithms in ACL2

Authors: Davis Jared; Swords Sol
Publication venue: Open Publishing Association
Publication date: 01/04/2013

And-Inverter Graphs (AIGs) are a popular way to represent Boolean functions (like circuits). AIG simplification algorithms can dramatically reduce an AIG, and play an important role in modern hardware verification tools like equivalence checkers. In practice, these tricky algorithms are implemented with optimized C or C++ routines with no guarantee of correctness. Meanwhile, many interactive theorem provers can now employ SAT or SMT solvers to automatically solve finite goals, but no theorem prover makes use of these advanced, AIG-based approaches. We have developed two ways to represent AIGs within the ACL2 theorem prover. One representation, Hons-AIGs, is especially convenient to use and reason about. The other, Aignet, is the opposite; it is styled after modern AIG packages and allows for efficient algorithms. We have implemented functions for converting between these representations, random vector simulation, conversion to CNF, etc., and developed reasoning strategies for verifying these algorithms. Aside from these contributions towards verifying AIG algorithms, this work has an immediate, practical benefit for ACL2 users who are using GL to bit-blast finite ACL2 theorems: they can now optionally trust an off-the-shelf SAT solver to carry out the proof, instead of using the built-in BDD package. Looking to the future, it is a first step toward implementing verified AIG simplification algorithms that might further improve GL performance. Comment: In Proceedings ACL2 2013, arXiv:1304.712
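For readers unfamiliar with And-Inverter Graphs, the following Python sketch shows the general idea: every Boolean function is built from AND nodes and inverters only, and the constructors apply a few local simplifications. The tuple encoding and the names `aig_not`, `aig_and`, `aig_or` and `evaluate` are assumptions made for this illustration; they are not the Hons-AIG or Aignet representations described in the paper, and nothing here is verified.

```python
# Node encoding (an assumption for this sketch):
#   True / False      constants
#   "x"               input variable, named by a string
#   ("not", a)        inverter
#   ("and", a, b)     two-input AND gate


def aig_not(a):
    """Build NOT a, folding constants and double negation."""
    if a is True:
        return False
    if a is False:
        return True
    if isinstance(a, tuple) and a[0] == "not":
        return a[1]
    return ("not", a)


def aig_and(a, b):
    """Build a AND b with simple local simplifications."""
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    if a == b:
        return a
    if aig_not(a) == b:  # x AND (NOT x) is always false
        return False
    return ("and", a, b)


def aig_or(a, b):
    """OR expressed through AND and NOT (De Morgan)."""
    return aig_not(aig_and(aig_not(a), aig_not(b)))


def evaluate(node, env):
    """Evaluate a node under an assignment of variable names to booleans."""
    if isinstance(node, bool):
        return node
    if isinstance(node, str):
        return env[node]
    if node[0] == "not":
        return not evaluate(node[1], env)
    return evaluate(node[1], env) and evaluate(node[2], env)


# XOR built only from AND and NOT nodes.
xor = aig_or(aig_and("x", aig_not("y")), aig_and(aig_not("x"), "y"))
print(evaluate(xor, {"x": True, "y": False}))  # True
print(aig_and("x", aig_not("x")))              # False
```

The simplifications in `aig_and` are deliberately local. The AIG simplification algorithms the abstract refers to are far more aggressive, which is part of why verifying them is non-trivial.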

arXiv.org e-Print Archive

Directory of Open Access Journals

Formal Ontology Learning on Factual IS-A Corpus in English using Description Logics

Authors: Dasgupta Sourish; Majumder Prasenjit; Padia Ankur; Shah Kushal
Publication date: 08/03/2016

Ontology Learning (OL) is the computational task of generating a knowledge base in the form of an ontology given an unstructured corpus whose content is in natural language (NL). Several works can be found in this area, most of which are limited to techniques based on statistical and lexico-syntactic pattern matching (Light-Weight OL). These techniques do not lead to very accurate learning, mostly because of several linguistic nuances in NL. Formal OL is an alternative (less explored) methodology where deep linguistic analysis is made using theory and tools found in computational linguistics to generate formal axioms and definitions instead of simply inducing a taxonomy. In this paper we propose a "Description Logic (DL)" based formal OL framework for learning factual IS-A type sentences in English. We claim that the semantic construction of IS-A sentences is nontrivial. Hence, we also claim that such sentences require special study in the context of OL before any truly formal OL can be proposed. We introduce a learner tool, called DLOL_IS-A, that generates such ontologies in the OWL format. We have adopted "Gold Standard" based OL evaluation on the IS-A rich WCL v.1.1 dataset and our own Community representative IS-A dataset. We observed significant improvement of DLOL_IS-A when compared to the light-weight OL tool Text2Onto and the formal OL tool FRED. Comment: This paper has been withdrawn by the author due to requirement of re-evaluation of result

arXiv.org e-Print Archive

Revisiting Elementary Denotational Semantics

Author: Siek Jeremy G.
Publication date: 20/10/2017

Operational semantics have been enormously successful, in large part due to their flexibility and simplicity, but they are not compositional. Denotational semantics, on the other hand, are compositional, but the lattice-theoretic models are complex and difficult to scale to large languages. However, there are elementary models of the λ-calculus that are much less complex: by Coppo, Dezani-Ciancaglini, and Salle (1979), Engeler (1981), and Plotkin (1993). This paper takes first steps toward answering the question: can elementary models be good for the day-to-day work of language specification, mechanization, and compiler correctness? The elementary models in the literature are simple, but they are not as intuitive as they could be. To remedy this, we create a new model that represents functions literally as finite graphs. Regarding mechanization, we give the first machine-checked proof of soundness and completeness of an elementary model with respect to an operational semantics. Regarding compiler correctness, we define a polyvariant inliner for the call-by-value λ-calculus and prove that its output is contextually equivalent to its input. Toward scaling elementary models to larger languages, we formulate our semantics in a monadic style, give a semantics for System F with general recursion, and mechanize the proof of type soundness. Comment: 25 pages, revision of POPL 2018 submission, now under submission to ESOP 201

arXiv.org e-Print Archive

On Formal Reasoning on the Semantics of PLC using Coq

Authors: Biha Sidi Ould; Blech Jan Olaf
Publication date: 14/01/2013

Programmable Logic Controllers (PLC) and their programming standard IEC 61131-3 are widely used in embedded systems for the industrial automation domain. We propose a framework for the formal treatment of PLC based on the IEC 61131-3 standard. A PLC system description typically combines code written in different languages that are defined in IEC 61131-3. For the top-level specification we regard the Sequential Function Charts (SFC) language, a graphical high-level language that allows one to describe the main control flow of the system. In addition to this, we describe the Instruction List (IL) language, an assembly-like language, and two other graphical languages: Ladder Diagrams (LD) and Function Block Diagrams (FBD). IL, LD, and FBD are used to describe lower-level structures of a PLC. We formalize the semantics of these languages and describe and prove relations between them. Formalization and associated proofs are carried out using the proof assistant Coq. In addition to this, we present work on a tool for automatically generating SFC representations from a graphical description (the IL and LD languages can be handled in Coq directly) and its usage for verification purposes. We sketch possible usages of our formal framework, present an example application for a PLC in a project demonstrator, and prove safety properties. Comment: arXiv admin note: text overlap with arXiv:1102.352

arXiv.org e-Print Archive

Head Automata and Bilingual Tiling: Translation with Minimal Representations

Author: Alshawi Hiyan
Publication date: 01/01/1996

We present a language model consisting of a collection of costed bidirectional finite state automata associated with the head words of phrases. The model is suitable for incremental application of lexical associations in a dynamic programming search for optimal dependency tree derivations. We also present a model and algorithm for machine translation involving optimal "tiling" of a dependency tree with entries of a costed bilingual lexicon. Experimental results are reported comparing methods for assigning cost functions to these models. We conclude with a discussion of the adequacy of annotated linguistic strings as representations for machine translation.

arXiv.org e-Print Archive

CiteSeerX