Identifying Computer-Translated Paragraphs using Coherence Features
We have developed a method for extracting coherence features from a
paragraph by matching similar words across its sentences. We conducted an
experiment with a parallel German corpus containing 2000 human-created and 2000
machine-translated paragraphs. Our method achieved the best performance
(accuracy = 72.3%, equal error rate = 29.8%) compared with previous methods
for various kinds of computer-generated text, including translation and paper
generation (best accuracy = 67.9%, equal error rate = 32.0%). Experiments on
Dutch, another rich-resource language, and on a low-resource language
(Japanese) attained similar performance. These results demonstrate the
effectiveness of coherence features at distinguishing computer-translated from
human-created paragraphs across diverse languages.
Comment: 9 pages, PACLIC 201
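The idea of scoring a paragraph by matching words across its sentences can be illustrated with a minimal sketch. This is not the authors' feature set: exact token overlap (Jaccard) stands in for their word-similarity matching, and the names `tokenize`, `sentence_overlap`, and `coherence_features` are hypothetical.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"\w+", sentence.lower())


def sentence_overlap(a, b):
    """Jaccard overlap between the word sets of two sentences.

    Assumption: exact token matching is a stand-in for the
    similarity-based word matching described in the abstract.
    """
    words_a, words_b = set(tokenize(a)), set(tokenize(b))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def coherence_features(sentences):
    """Return (mean, max, min) overlap over all sentence pairs."""
    scores = [sentence_overlap(a, b) for a, b in combinations(sentences, 2)]
    if not scores:
        # A paragraph with fewer than two sentences has no pairs to compare.
        return (0.0, 0.0, 0.0)
    return (sum(scores) / len(scores), max(scores), min(scores))
```

Features of this kind would then be passed to an ordinary classifier trained on human-created and machine-translated paragraphs; the choice of classifier is outside this sketch.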
ReaderBench, an Environment for Analyzing Text Complexity and Reading Strategies
Session: Educational Data Mining. International audience.
ReaderBench is a multi-purpose, multilingual, and flexible environment that enables the assessment of a wide range of learners' productions and their manipulation by the teacher. ReaderBench supports three main kinds of textual analysis, each of which has been empirically validated: cohesion-based assessment, identification of reading strategies, and evaluation of textual complexity. ReaderBench covers a complete cycle, from the initial complexity assessment of reading materials, through the assignment of texts to learners and the capture of metacognition reflected in their textual verbalizations, to comprehension evaluation, thereby fostering the learner's self-regulation process.
Detecting Machine-Translated Text using Back Translation
Machine-translated text plays a crucial role in communication between people
who use different languages. However, adversaries can use such text for
malicious purposes such as plagiarism and fake reviews. Existing methods detect
machine-translated text using only the text's intrinsic content, so they are
unsuitable for distinguishing machine-translated from human-written texts that
have the same meaning. We propose a method that extracts features for
distinguishing machine from human text based on the similarity between the
original text and its back-translation. In an evaluation on detecting
translated sentences in French, our method achieves 75.0% in both accuracy and
F-score, outperforming existing methods, whose best accuracy is 62.8% and best
F-score is 62.7%. The proposed method detects back-translated text even more
effectively, with an accuracy of 83.4%, compared with the best previous
accuracy of 66.7%. We obtain similar results for F-score and in corresponding
experiments on Japanese. Moreover, we show that our detector can recognize both
machine-translated and machine-back-translated texts without knowing the
language that was used to generate them. This demonstrates the persistence of
our method in various applications for both low- and rich-resource languages.
Comment: INLG 2019, 9 pages
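The core feature described above, the similarity between a text and its back-translation, can be sketched as follows. This is only an illustration under stated assumptions: the `translate` callable is a hypothetical placeholder for whatever machine translation system is available, the language codes are illustrative defaults, and `difflib.SequenceMatcher` is a simple stand-in for the similarity measures used in the paper.

```python
from difflib import SequenceMatcher


def back_translation_similarity(text, translate, source="fr", pivot="en"):
    """Similarity in [0, 1] between a text and its round-trip translation.

    `translate(text, src, dst)` is a hypothetical callable supplied by
    the caller; no particular translation API is assumed here.
    """
    pivot_text = translate(text, source, pivot)
    round_trip = translate(pivot_text, pivot, source)
    # Compare word sequences rather than characters.
    matcher = SequenceMatcher(None, text.split(), round_trip.split())
    return matcher.ratio()
```

The intuition from the abstract is that machine-translated text tends to change less under a further round trip than human-written text, so a higher similarity score is evidence of machine origin. A real detector would combine several such scores in a trained classifier.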
The process-genre approach in paragraph writing of fourth-grade EFL learner
129 pages. Writing plays a crucial role in language learning. Previous studies have found that various approaches are effective in developing writing skills in order to optimize students' writing competence; however, little attention has been paid to paragraph writing among primary school students. This action research study used artifacts, a teacher's journal, students' journals, and a group interview to collect data on how the process-genre approach helped fourth-grade learners of English as a foreign language write well-structured narrative paragraphs. The data were analyzed in light of grounded theory, revealing that most participants managed to write well-structured narrative paragraphs in which they developed a single main idea without straying from the topic. These participants also became aware of the role of the audience (the readers) and of the characteristics of the narrative genre. All of this supports the notion that the process-genre approach is an effective method for primary school students to reach this goal, and it could therefore be adopted by primary schools in their writing courses.
On the Development and Evaluation of a Brazilian Portuguese Discourse Parser
We present in this paper the development process and the evaluation procedure of a Brazilian Portuguese discourse parser called DiZer. Based on Rhetorical Structure Theory, DiZer is a symbolic, cue-phrase-based analyzer that uses discourse templates learned from a corpus of scientific texts to identify and build the discourse structure of texts. The evaluation of DiZer shows satisfactory results for scientific and news texts, even though it was not designed for the latter, which demonstrates DiZer's portability.