Simple and Effective Text Simplification Using Semantic and Neural Methods
Sentence splitting is a major simplification operator. Here we present a
simple and efficient splitting algorithm based on an automatic semantic parser.
After splitting, the text is amenable to further fine-tuned simplification
operations. In particular, we show that neural Machine Translation can be
effectively used in this situation. Previous applications of Machine Translation
to simplification suffer from a considerable disadvantage in that they are
over-conservative, often failing to modify the source in any way. Splitting
based on semantic parsing, as proposed here, alleviates this issue. Extensive
automatic and human evaluation shows that the proposed method compares
favorably to the state-of-the-art in combined lexical and structural
simplification.
Semantic Structural Evaluation for Text Simplification
Current measures for evaluating text simplification systems focus on
evaluating lexical text aspects, neglecting its structural aspects. In this
paper we propose the first measure to address structural aspects of text
simplification, called SAMSA. It leverages recent advances in semantic parsing
to assess simplification quality by decomposing the input based on its semantic
structure and comparing it to the output. SAMSA provides a reference-less
automatic evaluation procedure, avoiding the problems that reference-based
methods face due to the vast space of valid simplifications for a given
sentence. Our human evaluation experiments show both SAMSA's substantial
correlation with human judgments and the deficiency of existing
reference-based measures in evaluating structural simplification.
Exploiting Semantics in Neural Machine Translation with Graph Convolutional Networks
Semantic representations have long been argued to be potentially useful for
enforcing meaning preservation and improving generalization performance of
machine translation methods. In this work, we are the first to incorporate
information about predicate-argument structure of source sentences (namely,
semantic-role representations) into neural machine translation. We use Graph
Convolutional Networks (GCNs) to inject a semantic bias into sentence encoders
and achieve improvements in BLEU scores over the linguistic-agnostic and
syntax-aware versions on the English-German language pair.
BLEU is Not Suitable for the Evaluation of Text Simplification
BLEU is widely considered to be an informative metric for text-to-text
generation, including Text Simplification (TS). TS includes both lexical and
structural aspects. In this paper we show that BLEU is not suitable for the
evaluation of sentence splitting, the major structural simplification
operation. We manually compiled a sentence splitting gold standard corpus
containing multiple structural paraphrases, and performed a correlation
analysis with human judgments. We find low or no correlation between BLEU and
the grammaticality and meaning preservation parameters where sentence splitting
is involved. Moreover, BLEU often negatively correlates with simplicity,
essentially penalizing simpler sentences.
Comment: Accepted to EMNLP 2018 (Short papers)
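To make the n-gram overlap behind this claim concrete, here is a minimal sketch of a simplified, single-reference BLEU (modified n-gram precision up to bigrams, with a brevity penalty). It is an illustration only: the function names, the whitespace tokenization, and the example sentences are assumptions, and it does not reproduce the paper's corpus, metric configuration, or correlation analysis.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified single-reference BLEU (hypothetical helper, not a library API)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: only candidates shorter than the reference are penalized.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * math.exp(sum(log_precisions) / max_n)


# Made-up example: the reference is unsplit, one candidate copies it,
# the other splits it into two shorter sentences.
reference = "the man who wrote the book left"
copied = "the man who wrote the book left"
split = "the man wrote the book . he left"

print(simple_bleu(copied, reference))
print(simple_bleu(split, reference))
```

In this toy case the split candidate scores lower than the verbatim copy simply because splitting changes the n-grams, even though the split version may be the simpler one.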
SemEval 2019 Shared Task: Cross-lingual Semantic Parsing with UCCA - Call for Participation
We announce a shared task on UCCA parsing in English, German and French, and
call for participants to submit their systems. UCCA is a cross-linguistically
applicable framework for semantic representation, which builds on extensive
typological work and supports rapid annotation. UCCA poses a challenge for
existing parsing techniques, as it exhibits reentrancy (resulting in DAG
structures), discontinuous structures and non-terminal nodes corresponding to
complex semantic units. Given the success of recent semantic parsing shared
tasks (on SDP and AMR), we expect the task to make a significant contribution
to the advancement of UCCA parsing in particular, and semantic parsing in
general. Furthermore, existing applications for semantic evaluation that are
based on UCCA will greatly benefit from better automatic methods for UCCA
parsing. The competition website is
https://competitions.codalab.org/competitions/19160
Comment: Superseded by the actual shared task description paper at
arXiv:1903.0295
Verified AIG Algorithms in ACL2
And-Inverter Graphs (AIGs) are a popular way to represent Boolean functions
(like circuits). AIG simplification algorithms can dramatically reduce an AIG,
and play an important role in modern hardware verification tools like
equivalence checkers. In practice, these tricky algorithms are implemented with
optimized C or C++ routines with no guarantee of correctness. Meanwhile, many
interactive theorem provers can now employ SAT or SMT solvers to automatically
solve finite goals, but no theorem prover makes use of these advanced,
AIG-based approaches.
We have developed two ways to represent AIGs within the ACL2 theorem prover.
One representation, Hons-AIGs, is especially convenient to use and reason
about. The other, Aignet, is the opposite; it is styled after modern AIG
packages and allows for efficient algorithms. We have implemented functions for
converting between these representations, random vector simulation, conversion
to CNF, etc., and developed reasoning strategies for verifying these
algorithms.
Aside from these contributions towards verifying AIG algorithms, this work
has an immediate, practical benefit for ACL2 users who are using GL to
bit-blast finite ACL2 theorems: they can now optionally trust an off-the-shelf
SAT solver to carry out the proof, instead of using the built-in BDD package.
Looking to the future, it is a first step toward implementing verified AIG
simplification algorithms that might further improve GL performance.
Comment: In Proceedings ACL2 2013, arXiv:1304.712
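For readers unfamiliar with And-Inverter Graphs, the following is a minimal sketch of the general idea: every Boolean function is built from constants, variables, AND, and NOT, and can be checked by simulation. The class and function names are hypothetical, and this is neither the Hons-AIG nor the Aignet representation from the paper.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class And:
    left: "Aig"
    right: "Aig"


@dataclass(frozen=True)
class Not:
    arg: "Aig"


# An AIG node is a Boolean constant, a variable, an AND, or an inverter.
Aig = Union[bool, Var, And, Not]


def evaluate(aig: Aig, env: Dict[str, bool]) -> bool:
    """Evaluate an AIG under an assignment of variable names to Booleans."""
    if isinstance(aig, bool):
        return aig
    if isinstance(aig, Var):
        return env[aig.name]
    if isinstance(aig, Not):
        return not evaluate(aig.arg, env)
    return evaluate(aig.left, env) and evaluate(aig.right, env)


def aig_or(a: Aig, b: Aig) -> Aig:
    """OR via De Morgan: a or b == not (not a and not b)."""
    return Not(And(Not(a), Not(b)))


def aig_xor(a: Aig, b: Aig) -> Aig:
    """XOR built from AND and NOT only."""
    return aig_or(And(a, Not(b)), And(Not(a), b))


# Exhaustively simulate the XOR graph over all input vectors.
xor_ab = aig_xor(Var("a"), Var("b"))
for a, b in product([False, True], repeat=2):
    print(a, b, evaluate(xor_ab, {"a": a, "b": b}))
```

Exhaustive simulation only works for tiny graphs; the abstract's point is that larger checks go through random vector simulation, CNF conversion, and a SAT solver instead.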
Formal Ontology Learning on Factual IS-A Corpus in English using Description Logics
Ontology Learning (OL) is the computational task of generating a knowledge
base in the form of an ontology given an unstructured corpus whose content is
in natural language (NL). Several works exist in this area, most of which
are limited to statistical and lexico-syntactic pattern-matching
techniques (Light-Weight OL). These techniques do not lead to very accurate
learning, mostly because of several linguistic nuances in NL. Formal OL is an
alternative (less explored) methodology in which deep linguistic analysis is performed
using theory and tools from computational linguistics to generate formal
axioms and definitions instead of simply inducing a taxonomy. In this paper we
propose a "Description Logic (DL)" based formal OL framework for learning factual
IS-A type sentences in English. We claim that the semantic construction of IS-A
sentences is nontrivial. Hence, we also claim that such sentences require
special study in the context of OL before any truly formal OL can be
proposed. We introduce a learner tool, called DLOL_IS-A, that generates such
ontologies in the OWL format. We have adopted "Gold Standard" based OL
evaluation on the IS-A rich WCL v.1.1 dataset and our own Community representative
IS-A dataset. We observed significant improvement of DLOL_IS-A when compared to
the light-weight OL tool Text2Onto and formal OL tool FRED.
Comment: This paper has been withdrawn by the author due to requirement of
re-evaluation of result
Revisiting Elementary Denotational Semantics
Operational semantics have been enormously successful, in large part due to
their flexibility and simplicity, but they are not compositional. Denotational
semantics, on the other hand, are compositional but the lattice-theoretic
models are complex and difficult to scale to large languages. However, there
are elementary models of the λ-calculus that are much less complex: by
Coppo, Dezani-Ciancaglini, and Salle (1979), Engeler (1981), and Plotkin
(1993).
This paper takes first steps toward answering the question: can elementary
models be good for the day-to-day work of language specification,
mechanization, and compiler correctness? The elementary models in the
literature are simple, but they are not as intuitive as they could be. To
remedy this, we create a new model that represents functions literally as
finite graphs. Regarding mechanization, we give the first machine-checked proof
of soundness and completeness of an elementary model with respect to an
operational semantics. Regarding compiler correctness, we define a polyvariant
inliner for the call-by-value λ-calculus and prove that its output is
contextually equivalent to its input. Toward scaling elementary models to
larger languages, we formulate our semantics in a monadic style, give a
semantics for System F with general recursion, and mechanize the proof of type
soundness.
Comment: 25 pages, revision of POPL 2018 submission, now under submission to
ESOP 201
On Formal Reasoning on the Semantics of PLC using Coq
Programmable Logic Controllers (PLC) and their programming standard IEC 61131-3
are widely used in embedded systems for the industrial automation domain. We
propose a framework for the formal treatment of PLC based on the IEC 61131-3
standard. A PLC system description typically combines code written in different
languages that are defined in IEC 61131-3. For the top-level specification we
consider the Sequential Function Charts (SFC) language, a graphical high-level
language for describing the main control flow of the system. In
addition to this, we describe the Instruction List (IL) language -- an
assembly-like language -- and two other graphical languages: Ladder Diagrams (LD) and
Function Block Diagrams (FBD). IL, LD, and FBD are used to describe
lower-level structures of a PLC. We formalize the semantics of these languages and
describe and prove relations between them. Formalization and associated proofs
are carried out using the proof assistant Coq. In addition to this, we present
work on a tool for automatically generating SFC representations from a
graphical description -- the IL and LD languages can be handled in Coq directly
-- and its usage for verification purposes. We sketch possible usages of our
formal framework, and present an example application for a PLC in a project
demonstrator and prove safety properties.
Comment: arXiv admin note: text overlap with arXiv:1102.352
Head Automata and Bilingual Tiling: Translation with Minimal Representations
We present a language model consisting of a collection of costed
bidirectional finite state automata associated with the head words of phrases.
The model is suitable for incremental application of lexical associations in a
dynamic programming search for optimal dependency tree derivations. We also
present a model and algorithm for machine translation involving optimal
"tiling" of a dependency tree with entries of a costed bilingual lexicon.
Experimental results are reported comparing methods for assigning cost
functions to these models. We conclude with a discussion of the adequacy of
annotated linguistic strings as representations for machine translation.