Classifier-Based Text Simplification for Improved Machine Translation
Machine Translation is one of the research fields of Computational
Linguistics. The objective of many MT researchers is to develop an MT system
that produces high-quality, accurate translations and
covers as many language pairs as possible. As internet use and globalization
increase day by day, we need a way to improve the quality of translation. For this
reason, we have developed a Classifier based Text Simplification Model for
English-Hindi Machine Translation Systems. We have used support vector machines
and Naïve Bayes Classifier to develop this model. We have also evaluated the
performance of these classifiers.
Comment: In Proceedings of International Conference on Advances in Computer
Engineering and Applications 201
Lexico-syntactic Text Simplification And Compression With Typed Dependencies
We describe two systems for text simplification using typed dependency structures: one that performs lexical and syntactic simplification, and another that performs sentence compression optimised to satisfy global text constraints such as lexical density, the ratio of difficult words, and text length. We report a substantial evaluation that demonstrates the superiority of our systems, individually and in combination, over the state of the art, and also report a comprehension-based evaluation of contemporary automatic text simplification systems with target non-native readers.
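The global text constraints named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function-word list, the easy-word lexicon, the tokenizer, and the thresholds are all hypothetical stand-ins, and lexical density is assumed here to mean the share of non-function-word tokens.

```python
import re

# Hypothetical, tiny function-word list; a real system would use a fuller lexicon.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "to", "in", "and", "or", "is", "are",
    "was", "were", "that", "on", "for", "with", "by", "it",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_density(text):
    """Share of tokens that are content words (assumed definition)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens)


def difficult_word_ratio(text, easy_words):
    """Share of tokens that are content words missing from an easy-word lexicon."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hard = [t for t in tokens
            if t not in FUNCTION_WORDS and t not in easy_words]
    return len(hard) / len(tokens)


def satisfies_constraints(text, easy_words, max_density, max_difficult, max_tokens):
    """Check the three global constraints against hypothetical thresholds."""
    return (
        lexical_density(text) <= max_density
        and difficult_word_ratio(text, easy_words) <= max_difficult
        and len(tokenize(text)) <= max_tokens
    )
```

A compression system could call a check like `satisfies_constraints` on each candidate output and keep only candidates that pass, though how the described system actually optimises these constraints is not stated in the abstract.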
Graph-to-Sequence Learning using Gated Graph Neural Networks
Many NLP applications can be framed as a graph-to-sequence learning problem.
Previous work proposing neural architectures for this setting obtained promising
results compared to grammar-based approaches but still relies on linearisation
heuristics and/or standard recurrent networks to achieve the best performance.
In this work, we propose a new model that encodes the full structural
information contained in the graph. Our architecture couples the recently
proposed Gated Graph Neural Networks with an input transformation that allows
nodes and edges to have their own hidden representations, while tackling the
parameter explosion problem present in previous work. Experimental results show
that our model outperforms strong baselines in generation from AMR graphs and
syntax-based neural machine translation.
Comment: ACL 201
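One way to read the input transformation mentioned in this abstract is as turning every labelled edge into a node of its own, so that edges get hidden representations just as nodes do. The sketch below shows only that graph rewrite, under the assumption that a graph is given as a list of node labels plus `(source, label, target)` index triples. The function name `edges_to_nodes` is hypothetical, and no neural network is involved.

```python
def edges_to_nodes(nodes, edges):
    """Rewrite a labelled graph so that each edge becomes a node.

    nodes: list of node labels, addressed by list index.
    edges: list of (source_index, edge_label, target_index) triples.
    Returns (new_nodes, new_edges), where new_edges are unlabelled
    (source_index, target_index) pairs.
    """
    new_nodes = list(nodes)
    new_edges = []
    for src, label, tgt in edges:
        edge_node = len(new_nodes)          # index of the node standing in for this edge
        new_nodes.append(label)             # the edge label becomes a node label
        new_edges.append((src, edge_node))  # source -> edge node
        new_edges.append((edge_node, tgt))  # edge node -> target
    return new_nodes, new_edges


# Toy AMR-like fragment: "want" --ARG0--> "boy"
toy_nodes, toy_edges = edges_to_nodes(["want", "boy"], [(0, "ARG0", 1)])
```

After the rewrite, edge labels live in the node vocabulary instead of requiring a separate set of parameters per label, which is one plausible reading of how such a transformation helps with parameter explosion.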
Monadic second order finite satisfiability and unbounded tree-width
The finite satisfiability problem of monadic second order logic is decidable
only on classes of structures of bounded tree-width by the classic result of
Seese (1991). We prove the following problem is decidable:
Input: (i) A monadic second order logic sentence $\alpha$, and (ii) a
sentence $\beta$ in the two-variable fragment of first order logic extended
with counting quantifiers. The vocabularies of $\alpha$ and $\beta$ may
intersect.
Output: Is there a finite structure which satisfies $\alpha \land \beta$ such
that the restriction of the structure to the vocabulary of $\alpha$ has bounded
tree-width? (The tree-width of the desired structure is not bounded.)
As a consequence, we prove the decidability of the satisfiability problem by
a finite structure of bounded tree-width of a logic extending monadic second
order logic with linear cardinality constraints of the form
$|X_{1}|+\cdots+|X_{r}|<|Y_{1}|+\cdots+|Y_{s}|$, where the $X_{i}$ and $Y_{j}$
are monadic second order variables. We prove the decidability of a similar
extension of WS1S.
Vicinity-driven paragraph and sentence alignment for comparable corpora
Parallel corpora have driven great progress in the field of Text Simplification. However, most sentence alignment algorithms either support only a limited range of alignment types or simply ignore valuable clues present in comparable documents. We address this problem by introducing a new set of flexible vicinity-driven paragraph and sentence alignment algorithms that support 1-N, N-1, N-N and long distance null alignments without the need for hard-to-replicate supervised models.
Content Differences in Syntactic and Semantic Representations
Syntactic analysis plays an important role in semantic parsing, but the
nature of this role remains a topic of ongoing debate. The debate has been
constrained by the scarcity of empirical comparative studies between syntactic
and semantic schemes, which hinders the development of parsing methods informed
by the details of target schemes and constructions. We target this gap, and
take Universal Dependencies (UD) and UCCA as a test case. After abstracting
away from differences of convention or formalism, we find that most content
divergences can be ascribed to: (1) UCCA's distinction between a Scene and a
non-Scene; (2) UCCA's distinction between primary relations, secondary ones and
participants; (3) different treatment of multi-word expressions, and (4)
different treatment of inter-clause linkage. We further discuss the long tail
of cases where the two schemes take markedly different approaches. Finally, we
show that the proposed comparison methodology can be used for fine-grained
evaluation of UCCA parsing, highlighting both challenges and potential sources
for improvement. The substantial differences between the schemes suggest that
semantic parsers are likely to benefit downstream text understanding
applications beyond their syntactic counterparts.
Comment: NAACL-HLT 2019 camera ready