818 research outputs found

    Classifier-Based Text Simplification for Improved Machine Translation

    Machine Translation (MT) is one of the research fields of Computational Linguistics. The objective of many MT researchers is to develop an MT system that produces high-quality, accurate translations and covers as many language pairs as possible. As internet use and globalization increase day by day, we need a way to improve translation quality. For this reason, we have developed a classifier-based text simplification model for English-Hindi machine translation systems. We used support vector machines and a Naïve Bayes classifier to develop this model, and we evaluated the performance of these classifiers. Comment: In Proceedings of International Conference on Advances in Computer Engineering and Applications 201

    Lexico-syntactic Text Simplification And Compression With Typed Dependencies

    We describe two systems for text simplification using typed dependency structures: one that performs lexical and syntactic simplification, and another that performs sentence compression optimised to satisfy global text constraints such as lexical density, the ratio of difficult words, and text length. We report a substantial evaluation that demonstrates the superiority of our systems, individually and in combination, over the state of the art, and also report a comprehension-based evaluation of contemporary automatic text simplification systems with target non-native readers.

    Graph-to-Sequence Learning using Gated Graph Neural Networks

    Many NLP applications can be framed as a graph-to-sequence learning problem. Previous work proposing neural architectures in this setting obtained promising results compared to grammar-based approaches but still relies on linearisation heuristics and/or standard recurrent networks to achieve the best performance. In this work, we propose a new model that encodes the full structural information contained in the graph. Our architecture couples the recently proposed Gated Graph Neural Networks with an input transformation that allows nodes and edges to have their own hidden representations, while tackling the parameter explosion problem present in previous work. Experimental results show that our model outperforms strong baselines in generation from AMR graphs and syntax-based neural machine translation. Comment: ACL 201

    Monadic second order finite satisfiability and unbounded tree-width

    The finite satisfiability problem of monadic second order logic is decidable only on classes of structures of bounded tree-width by the classic result of Seese (1991). We prove the following problem is decidable: Input: (i) A monadic second order logic sentence $\alpha$, and (ii) a sentence $\beta$ in the two-variable fragment of first order logic extended with counting quantifiers. The vocabularies of $\alpha$ and $\beta$ may intersect. Output: Is there a finite structure which satisfies $\alpha\land\beta$ such that the restriction of the structure to the vocabulary of $\alpha$ has bounded tree-width? (The tree-width of the desired structure is not bounded.) As a consequence, we prove the decidability of the satisfiability problem by a finite structure of bounded tree-width of a logic extending monadic second order logic with linear cardinality constraints of the form $|X_{1}|+\cdots+|X_{r}|<|Y_{1}|+\cdots+|Y_{s}|$, where the $X_{i}$ and $Y_{j}$ are monadic second order variables. We prove the decidability of a similar extension of WS1S.

    Vicinity-driven paragraph and sentence alignment for comparable corpora

    Parallel corpora have driven great progress in the field of Text Simplification. However, most sentence alignment algorithms either support only a limited range of alignment types or simply ignore valuable clues present in comparable documents. We address this problem by introducing a new set of flexible vicinity-driven paragraph and sentence alignment algorithms that support 1-N, N-1, N-N and long-distance null alignments without the need for hard-to-replicate supervised models.

    Content Differences in Syntactic and Semantic Representations

    Syntactic analysis plays an important role in semantic parsing, but the nature of this role remains a topic of ongoing debate. The debate has been constrained by the scarcity of empirical comparative studies between syntactic and semantic schemes, which hinders the development of parsing methods informed by the details of target schemes and constructions. We target this gap, and take Universal Dependencies (UD) and UCCA as a test case. After abstracting away from differences of convention or formalism, we find that most content divergences can be ascribed to: (1) UCCA's distinction between a Scene and a non-Scene; (2) UCCA's distinction between primary relations, secondary ones and participants; (3) different treatment of multi-word expressions; and (4) different treatment of inter-clause linkage. We further discuss the long tail of cases where the two schemes take markedly different approaches. Finally, we show that the proposed comparison methodology can be used for fine-grained evaluation of UCCA parsing, highlighting both challenges and potential sources for improvement. The substantial differences between the schemes suggest that semantic parsers are likely to benefit downstream text understanding applications beyond their syntactic counterparts. Comment: NAACL-HLT 2019 camera read