Acquisition et évaluation sur corpus de propriétés de sous-catégorisation syntaxique
D. Bourigault, C. Frérot
We carry out an experiment aimed at integrating subcategorization information into a syntactic parser for PP attachment disambiguation. The subcategorization lexicon consists of probabilities between a word (verb, noun, adjective) and a preposition. The lexicon is acquired automatically from a 200-million-word corpus that is partially tagged and parsed. To assess the lexicon, we use four corpora that differ in genre and domain. We assess various methods for PP attachment disambiguation: an exogenous method relies on the subcategorization lexicon, an endogenous method relies on the corpus-specific resource only, and a hybrid method makes use of both. The hybrid method proves to be the best, with results ranging from 79.4% to 87.2%.
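A rough sketch of how such a lexicon could drive an attachment decision is given below. It is an illustration only, not the authors' implementation. The abstract does not say how the hybrid method combines the two resources, so the fallback order used here (corpus-specific resource first, general lexicon second) is an assumption. The function names `attach` and `attach_hybrid` and all probability values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (governor lemma, preposition) -> probability that the governor takes a PP
# headed by that preposition. Every value used below is invented.
Lexicon = Dict[Tuple[str, str], float]


def attach(candidates: List[str], prep: str, lexicon: Lexicon) -> Optional[str]:
    """Return the candidate governor with the highest probability for `prep`,
    or None when the lexicon has no entry for any candidate."""
    scored = [(lexicon[(c, prep)], c) for c in candidates if (c, prep) in lexicon]
    if not scored:
        return None
    return max(scored)[1]


def attach_hybrid(
    candidates: List[str], prep: str, endogenous: Lexicon, exogenous: Lexicon
) -> Optional[str]:
    """Assumed combination: consult the corpus-specific (endogenous) resource
    first and fall back to the general (exogenous) lexicon."""
    choice = attach(candidates, prep, endogenous)
    if choice is not None:
        return choice
    return attach(candidates, prep, exogenous)


# Toy resources: the exogenous lexicon stands in for the one acquired from the
# large corpus, the endogenous one for counts drawn from the corpus being parsed.
exogenous: Lexicon = {("rely", "on"): 0.9, ("method", "on"): 0.1}
endogenous: Lexicon = {("lexicon", "from"): 0.7}

print(attach_hybrid(["rely", "method"], "on", endogenous, exogenous))
print(attach_hybrid(["acquire", "lexicon"], "from", endogenous, exogenous))
```

In this sketch the purely exogenous and purely endogenous methods correspond to calling `attach` with a single resource, which makes it easy to compare the three strategies on the same candidate lists.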
Integrating controlled corpus data in the classroom: A case-study of English NPs for French students in specialised translation
This paper looks at the alternation of two complex English noun phrases in scientific English, which poses a challenge to French students in the specialised translation classroom because no such alternation is observed in French. Starting from a preliminary study of a first series of constructions, we seek confirmation for generalisations about the constructions' preferred context of occurrence in a new sample of highly frequent constructions. We then discuss how the results of those analyses can be integrated in the translation classroom through a new online tool aimed at raising students' awareness of this contrastive problem and helping them choose one or the other construction according to a set of corpus-based clues.
Syntex, analyseur syntaxique de corpus
This article presents the corpus syntactic parser Syntex. We describe the principles underlying the parser's development and its software architecture. A bibliography of the SYNTEX project is given at the end of the document.
Corpora and Corpus Technology for Translation Purposes in Professional and Academic Environments. Major Achievements and New Perspectives
Parallel Corpora for Translation Teaching and Translator Training Purposes. Lodz Studies in Language
Corpora and corpus technology for translation purposes in professional and academic environments. Major achievements and new perspectives
The “use” of corpora and concordancers in translation teaching has grown increasingly attractive since the mid-1990s, with an abundant literature advocating their use and promoting their benefits in the translation classroom. In translator training, efforts are being made to incorporate corpora and concordancers into master's programmes and to offer specific modules on corpora for translation, although the use of translation memory (TM) systems within Computer-Aided Translation (CAT) courses still dominates. In the translation profession, TM systems are part of the everyday working environment, but the same cannot be said of corpora and concordancers, even though the most recent surveys show that professional translators would like to learn more about the potential of corpora for translation. Overall, the “usefulness” of corpora and corpus technology at the different stages of the translation process remains poorly documented, but a growing number of empirical studies have begun to address the question, since it has become of paramount importance to assess the extent to which corpora add value to translation quality in both professional and academic environments.