Identifying Computer-Translated Paragraphs using Coherence Features

Authors: Echizen Isao; H. Nguyen Huy; Nguyen-Son Hoang-Quoc; T. Tieu Ngoc-Dung; Yamagishi Junichi
Publication date: 03/12/2018

We have developed a method for extracting coherence features from a paragraph by matching similar words across its sentences. We conducted an experiment on a parallel German corpus containing 2000 human-created and 2000 machine-translated paragraphs. Our method achieved the best performance (accuracy = 72.3%, equal error rate = 29.8%) compared with previous methods for various kinds of computer-generated text, including translation and paper generation (best accuracy = 67.9%, equal error rate = 32.0%). Experiments on Dutch, another resource-rich language, and on a low-resource language (Japanese) attained similar performance. These results demonstrate the efficiency of coherence features at distinguishing computer-translated from human-created paragraphs across diverse languages.

Comment: 9 pages, PACLIC 201
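The idea of matching words across a paragraph's sentences can be illustrated with a minimal sketch. This is not the authors' feature extractor: it assumes exact lowercase token overlap (Jaccard similarity) as a stand-in for "similar words", a naive punctuation-based sentence splitter, and hypothetical function names chosen only for illustration.

```python
import re
from itertools import combinations
from statistics import mean


def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]


def tokens(sentence):
    """Lowercased set of word tokens in a sentence."""
    return set(re.findall(r"\w+", sentence.lower()))


def jaccard(a, b):
    """Share of tokens common to both sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def coherence_features(paragraph):
    """Summarise word overlap between every pair of sentences.

    Assumption: exact token overlap approximates 'similar words'.
    """
    sents = [tokens(s) for s in split_sentences(paragraph)]
    scores = [jaccard(a, b) for a, b in combinations(sents, 2)]
    if not scores:  # fewer than two sentences
        return {"mean": 0.0, "max": 0.0, "min": 0.0}
    return {"mean": mean(scores), "max": max(scores), "min": min(scores)}
```

A feature dictionary like this could be fed to any standard classifier; a more faithful version would presumably replace exact matching with a word-similarity measure.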

arXiv.org e-Print Archive

Edinburgh Research Explorer

Identifying Adversarial Sentences by Analyzing Text Complexity

Authors: Hidano Seira; Kiyomoto Shinsaku; Nguyen-Son Hoang-Quoc; Thao Tran Phuong
Publication venue: Waseda Institute for the Study of Language and Information
Publication date: 01/01/2019

Waseda University Repository

ReaderBench, an Environment for Analyzing Text Complexity and Reading Strategies

Authors: Bianco Maryse; Dascălu Mihai; Dessus Philippe; Nardy Aurélie; Trăușan-Matu Ștefan
Publication venue: Springer Science and Business Media LLC
Publication date: 09/07/2013

Session: Educational Data Mining

ReaderBench is a multi-purpose, multilingual, and flexible environment that enables the assessment of a wide range of learners' productions and their manipulation by the teacher. ReaderBench supports three main kinds of assessment: cohesion-based assessment, identification of reading strategies, and evaluation of textual complexity, all of which have been empirically validated. ReaderBench covers a complete cycle: the initial complexity assessment of reading materials, the assignment of texts to learners, the capture of metacognitions reflected in learners' textual verbalizations, and comprehension evaluation, thereby fostering learners' self-regulation.

Hal - Université Grenoble Alpes

Detecting Machine-Translated Text using Back Translation

Authors: Hidano Seira; Kiyomoto Shinsaku; Nguyen-Son Hoang-Quoc; Thao Tran Phuong
Publication date: 01/01/2019

Machine-translated text plays a crucial role in communication between people who use different languages. However, adversaries can use such text for malicious purposes such as plagiarism and fake reviews. Existing methods detect machine-translated text using only the text's intrinsic content, but they are unsuitable for distinguishing machine-translated from human-written texts with the same meaning. We propose a method that extracts features for distinguishing machine from human text based on the similarity between the text itself and its back-translation. In an evaluation on detecting translated sentences in French, our method achieves 75.0% in both accuracy and F-score, outperforming existing methods, whose best accuracy is 62.8% and best F-score is 62.7%. The proposed method detects back-translated text even more efficiently, with 83.4% accuracy, compared with the best previous accuracy of 66.7%. We achieve similar results in F-score and in similar experiments on Japanese. Moreover, we show that our detector can recognize both machine-translated and machine-back-translated texts without knowing the language used to generate them. This demonstrates the persistence of our method in various applications for both low- and rich-resource languages.

Comment: INLG 2019, 9 page
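The back-translation comparison described in this abstract can be sketched roughly as follows. This is an illustrative assumption rather than the authors' implementation: the `translate` callable is a hypothetical placeholder for whatever translation system is available, the language codes are arbitrary defaults, and token-set overlap stands in for the paper's actual similarity features.

```python
import re
from typing import Callable

# Hypothetical signature: translate(text, source_lang, target_lang) -> text
Translator = Callable[[str, str, str], str]


def token_set(text: str) -> set:
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def overlap_similarity(a: str, b: str) -> float:
    """Jaccard overlap between the token sets of two texts."""
    ta, tb = token_set(a), token_set(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def back_translation_feature(
    text: str,
    translate: Translator,
    src: str = "fr",    # assumed language of the input text
    pivot: str = "en",  # assumed intermediate language
) -> float:
    """Translate text to the pivot language and back, then compare.

    Returns a similarity in [0, 1] between the input and its
    back-translation; a classifier would consume this as one feature.
    """
    forward = translate(text, src, pivot)
    back = translate(forward, pivot, src)
    return overlap_similarity(text, back)
```

Passing the translator in as a parameter keeps the sketch independent of any particular translation service; in practice the resulting score would be one input among several to a trained classifier.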

arXiv.org e-Print Archive

Crossref

The process-genre approach in paragraph writing of fourth-grade EFL learner

Author: Arteaga Lara Héctor Mauricio
Publication venue: Canadian Center of Science and Education
Publication date: 02/11/2017

129 pages.

Writing plays a crucial role in language learning. Previous studies have found that various approaches are effective in developing writing skills and thereby improving students' writing competence; however, little attention has been paid to paragraph writing among primary school students. This action research study used artifacts, a teacher's journal, students' journals, and a group interview to collect data on how the process-genre approach helped fourth-grade learners of English as a foreign language write well-structured narrative paragraphs. The data were analyzed using grounded theory, revealing that most participants managed to write well-structured narrative paragraphs in which they developed only one main idea without straying from the topic. These participants also became aware of the role of the audience (the readers) and of the characteristics of the narrative genre. All of this supports the notion that the process-genre approach is an effective method for primary school students to achieve this goal, and it could therefore be adopted by primary schools in their writing courses.

Intellectum

On the Development and Evaluation of a Brazilian Portuguese Discourse Parser

Authors: Salgueiro Pardo Thiago Alexandre; Volpe Nunes Maria das Graças
Publication venue: Universidade Federal do Rio Grande do Sul
Publication date: 12/12/2008

In this paper we present the development process and the evaluation procedure of a Brazilian Portuguese discourse parser called DiZer. Based on Rhetorical Structure Theory, DiZer is a symbolic, cue-phrase-based analyzer that uses discourse templates learned from a corpus of scientific texts to identify and build the discourse structure of texts. The evaluation of DiZer shows satisfactory results for scientific and news texts, even though it was not designed for the latter, which demonstrates DiZer's portability.

Em Questao

Archives of the Faculty of Veterinary Medicine UFRGS