18,899 research outputs found

    Identifying Computer-Translated Paragraphs using Coherence Features

    We have developed a method for extracting coherence features from a paragraph by matching similar words across its sentences. We conducted an experiment on a parallel German corpus containing 2000 human-created and 2000 machine-translated paragraphs. Our method achieved the best performance (accuracy = 72.3%, equal error rate = 29.8%) compared with previous methods designed for various kinds of computer-generated text, including translation and paper generation (best accuracy = 67.9%, equal error rate = 32.0%). Experiments on Dutch, another resource-rich language, and on a low-resource language (Japanese) attained similar performance. This demonstrates the effectiveness of coherence features for distinguishing computer-translated from human-created paragraphs across diverse languages. Comment: 9 pages, PACLIC 201
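
    The word-matching idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which word-similarity measure or which aggregate statistics the paper uses, so the character-overlap similarity from Python's difflib and the mean/min/max summary below are stand-in assumptions, and all function names are hypothetical.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations
from statistics import mean
from typing import Dict, List


def split_sentences(paragraph: str) -> List[List[str]]:
    """Split a paragraph into sentences, each a list of lowercase word tokens."""
    sentences = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    tokenized = [re.findall(r"[^\W\d_]+", s.lower()) for s in sentences]
    return [tokens for tokens in tokenized if tokens]


def word_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] based on shared characters.

    Assumption: the paper's actual similarity measure is not given in the
    abstract; a semantic measure would replace this function.
    """
    return SequenceMatcher(None, a, b).ratio()


def sentence_match_score(s1: List[str], s2: List[str]) -> float:
    """Average, over the words of s1, of the best-matching word in s2."""
    return mean(max(word_similarity(w1, w2) for w2 in s2) for w1 in s1)


def coherence_features(paragraph: str) -> Dict[str, float]:
    """Summarize word-matching scores over all sentence pairs of a paragraph."""
    sentences = split_sentences(paragraph)
    scores = [sentence_match_score(a, b) for a, b in combinations(sentences, 2)]
    if not scores:  # fewer than two sentences: nothing to match
        return {"mean": 0.0, "min": 0.0, "max": 0.0}
    return {"mean": mean(scores), "min": min(scores), "max": max(scores)}
```

    Features of this kind would then be passed to an ordinary classifier trained on human-created and machine-translated paragraphs; the classifier itself is outside this sketch.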

    Identifying Adversarial Sentences by Analyzing Text Complexity

    ReaderBench, an Environment for Analyzing Text Complexity and Reading Strategies

    Session: Educational Data Mining. International audience. ReaderBench is a multi-purpose, multilingual and flexible environment that enables the assessment of a wide range of learners' productions and their manipulation by the teacher. ReaderBench supports three main kinds of textual assessment: cohesion-based assessment, reading strategy identification and textual complexity evaluation, all of which have been empirically validated. ReaderBench covers a complete cycle, from the initial complexity assessment of reading materials, through the assignment of texts to learners and the capture of metacognitions reflected in their textual verbalizations, to comprehension evaluation, thereby fostering learners' self-regulation process.

    Detecting Machine-Translated Text using Back Translation

    Machine-translated text plays a crucial role in communication between people who use different languages. However, adversaries can use such text for malicious purposes such as plagiarism and fake reviews. Existing methods detect machine-translated text using only the text's intrinsic content, which makes them unsuitable for separating machine-translated and human-written texts that have the same meaning. We propose a method that extracts features for distinguishing machine from human text based on the similarity between the original text and its back-translation. In an evaluation on detecting translated French sentences, our method achieves 75.0% in both accuracy and F-score, outperforming existing methods, whose best accuracy is 62.8% and best F-score is 62.7%. The proposed method detects back-translated text even more accurately, with 83.4% accuracy against 66.7% for the best previous method. We obtain similar results for F-score and in comparable experiments on Japanese. Moreover, we show that our detector can recognize both machine-translated and machine-back-translated texts without knowing which language was used to generate them. This demonstrates the robustness of our method across applications in both low- and rich-resource languages. Comment: INLG 2019, 9 page
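
    The similarity between a text and its back-translation can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the abstract names neither the translation system, the pivot language, nor the similarity measure, so the `translate` callable, the English pivot and the token-level difflib ratio are all hypothetical stand-ins.

```python
from difflib import SequenceMatcher
from typing import Callable

# Hypothetical translator interface: (text, source_lang, target_lang) -> text.
# Any machine translation system could be wrapped to fit this signature.
Translator = Callable[[str, str, str], str]


def back_translate(text: str, translate: Translator,
                   lang: str, pivot: str = "en") -> str:
    """Translate text into the pivot language and back into its own language."""
    return translate(translate(text, lang, pivot), pivot, lang)


def back_translation_similarity(text: str, translate: Translator,
                                lang: str, pivot: str = "en") -> float:
    """Token-level similarity in [0, 1] between a text and its round trip.

    Assumption: difflib's ratio stands in for whatever similarity
    features the paper actually extracts.
    """
    round_trip = back_translate(text, translate, lang, pivot)
    original_tokens = text.lower().split()
    round_trip_tokens = round_trip.lower().split()
    return SequenceMatcher(None, original_tokens, round_trip_tokens).ratio()


# Toy word-by-word translator, for demonstration only.
FR_EN = {"le": "the", "chat": "cat", "dort": "sleeps", "roupille": "sleeps"}
EN_FR = {"the": "le", "cat": "chat", "sleeps": "dort"}


def toy_translate(text: str, source: str, target: str) -> str:
    """Dictionary lookup per word; unknown words pass through unchanged."""
    table = FR_EN if (source, target) == ("fr", "en") else EN_FR
    return " ".join(table.get(word, word) for word in text.lower().split())


stable = back_translation_similarity("le chat dort", toy_translate, "fr")
shifted = back_translation_similarity("le chat roupille", toy_translate, "fr")
```

    With the toy translator, a sentence that survives the round trip unchanged gets a similarity of 1.0, while a sentence whose wording the translator normalizes gets a lower value. A detector would feed such values to a classifier rather than threshold them directly.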

    The process-genre approach in paragraph writing of fourth-grade EFL learner

    129 pages. Writing plays a crucial role in language learning. Previous studies have established the effectiveness of various approaches for developing writing skills in order to improve students' writing competence; however, little attention has been paid to paragraph writing among primary school students. This action research study used artifacts, a teacher's journal, students' journals and a group interview to collect data on how the process-genre approach helped fourth-grade learners of English as a foreign language write well-structured narrative paragraphs. The data were analyzed in light of grounded theory, revealing that most participants managed to write well-structured narrative paragraphs in which they developed only one main idea without straying from the topic. These participants also became aware of the role of the audience (the readers) and of the characteristics of the narrative genre. All of this supports the notion that the process-genre approach is an effective method for primary school students to reach this goal, and it could therefore be adopted by primary schools in their writing courses.

    On the Development and Evaluation of a Brazilian Portuguese Discourse Parser

    In this paper we present the development process and evaluation procedure of a Brazilian Portuguese discourse parser called DiZer. Based on Rhetorical Structure Theory, DiZer is a symbolic, cue-phrase-based analyzer that uses discourse templates learned from a corpus of scientific texts to identify and build the discourse structure of texts. The evaluation of DiZer shows satisfactory results for scientific and news texts, even though it was not designed for the latter, which demonstrates DiZer's portability.