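Les Rectifications de l'orthographe en Belgique francophone : de la politique linguistique aux pratiques des écoliers et de la presse [The spelling Rectifications in French-speaking Belgium: from language policy to the practices of schoolchildren and the press]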
This article illustrates various actions related to the Rectifications of French spelling carried out in the context of the language policy of the Wallonia-Brussels Federation. Since its creation, the Council for the French Language and Language Policy of French-speaking Belgium, with the support of the administration, has been reflecting on how to make the language more accessible and closer to the citizen. Access to the written world, for everyone, has always been one of its major concerns, and this access requires, in particular, a more rational spelling. The 1990 Rectifications have therefore been a focus of its attention, and the Council has initiated or supported various awareness-raising activities and research on the topic. In this article, we mention three of them: an information leaflet on the content of the reform, distributed in all schools; an awareness-raising campaign aimed at the French-speaking Belgian press, together with the creation of software that converts texts into the new spelling; and a study of the practices of pupils at the end of primary school. We also discuss, through the analysis of a corpus of one million words, the use the Belgian press makes of rectified spellings today.
Phraseological sophistication as a multidimensional construct: Exploring the relationship between association, register specificity and frequency
Since Paquot (2019), several unresolved issues have persisted regarding the operationalization of phraseological sophistication in L2 complexity research. One of the most crucial concerns relates to the extent to which the commonly used measures of phraseological sophistication (MI scores) fully represent the intended construct. In this study, we draw upon insights from L2 phraseological research to reexamine the conceptualization and operationalization of phraseological sophistication. We conduct new analyses on the learner corpus used in Paquot (2019), using alternative operationalizations of phraseological sophistication that represent different dimensions of sophistication (based on the register specificity of word combinations and their frequency). Results show that measures representing the dimensions of association (MI scores) and register specificity (ratios of academic collocations) correlate with each other. Frequency-based measures, however, pattern very differently, which we attribute to some issues in the way we operationalized frequency of co-occurrence.
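Identification d'erreurs de traduction dans un dictionnaire de recherche d'informations translingue et traduction de mots composés à l'aide du World Wide Web [Identifying translation errors in a cross-language information retrieval dictionary and translating compound words using the World Wide Web]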
ABSTRACT. Cross-language information retrieval over non-parallel text requires a translation phase between a source language query and a target language document. In order to achieve the same performance as a monolingual query on a document in the same language, good translations must be found for all terms of the source language query. Unfortunately, the available translation dictionaries do not contain exact translations for a large number of compound words that may appear in a query. Cross-language retrieval systems use translation dictionaries that are built statistically or manually. To translate a compound word, many of these systems generate all possible word-for-word translations and check whether these translations occur in the target database. Retrieval quality improves when previously validated translations of compound words can be used. Two problems nevertheless remain unsolved with this method of generating and validating all translations: (1) if the exact translation of one element of a compound word is missing from the translation dictionary, the translation validated by this method will not be the best translation; (2) if the correct translation does not contain the same number of elements as the source compound word, the best translation will not be generated either. In this article, we propose two methods for identifying these situations.
CORIA 05, France, Grenoble, 9-11 March 200
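The generate-and-validate approach described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' system: the function names, the toy French-English dictionary and the two-sentence target collection are all hypothetical, and validation is reduced to simple substring counting.

```python
from itertools import product

# Hypothetical toy bilingual dictionary and target collection (assumptions).
DICTIONARY = {
    "court": ["short", "brief"],
    "terme": ["term", "end"],
    "pomme": ["apple"],
    "de": ["of"],
    "terre": ["earth", "ground"],
}
TARGET_TEXTS = [
    "in the short term prices rise",
    "a short term loan is cheaper than a potato farm",
]


def candidate_translations(compound, dictionary):
    """Generate every word-for-word translation of a compound."""
    options = []
    for word in compound.split():
        translations = dictionary.get(word)
        if not translations:
            return []  # an element is missing from the dictionary: problem (1)
        options.append(translations)
    return [" ".join(choice) for choice in product(*options)]


def validate(candidates, target_texts):
    """Keep candidates attested in the target texts, most frequent first."""
    counts = {c: sum(text.count(c) for text in target_texts) for c in candidates}
    attested = [c for c in candidates if counts[c] > 0]
    return sorted(attested, key=lambda c: -counts[c])


short_term = validate(candidate_translations("court terme", DICTIONARY), TARGET_TEXTS)
potato = validate(candidate_translations("pomme de terre", DICTIONARY), TARGET_TEXTS)
```

In this toy setup "court terme" is validated as "short term", whereas "pomme de terre" yields no attested candidate, because the correct translation ("potato") has a different number of elements from the source compound. This is the situation described under (2).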
The role of the reference corpus in studies of EFL learners' use of statistical collocations
In learner corpus research (LCR), there has been a recent boom in the number of studies that have investigated English as a Foreign Language (EFL) learners' use of statistical collocations (e.g. Bestgen & Granger, 2014; Granger & Bestgen, 2014; Paquot & Naets, 2015; Paquot, forthcoming a & b). These studies have adopted an approach first put forward by Schmitt and colleagues (e.g. Durrant & Schmitt, 2009) to assess whether and to what extent the word combinations used by learners are "native-like" by assigning to each pair of words in a learner text an association score (typically a pointwise mutual information and/or a t-score) computed on the basis of a large reference corpus. The reference corpus differs across studies. Thus, Granger & Bestgen (2014) made use of the British National Corpus (BNC) to evaluate EFL learners' use of bigrams in the International Corpus of Learner English (Granger et al., 2009); Paquot (forthcoming a & b) extracted statistical collocations from the L2 Research Corpus (L2RC), i.e. a large specialized corpus of research articles in applied linguistics, to assess learners' use of adjective + noun and verb + object combinations in term papers in linguistics written by French EFL learners sampled from the Varieties of English for Specific Purposes Database (VESPA); and Paquot & Naets (2015) used the web corpus ENCOW14 (http://corporafromtheweb.org/encow14/) to analyze statistical collocations in the Longitudinal Database of Learner English (LONGDALE, Meunier, 2015). The main objective of this study is to investigate the role of the reference corpus in LCR studies of statistical collocations in learner writing. It is driven by the following research questions:
- To what extent are results replicable if another reference corpus is used to calculate association scores?
- Depending on the learner corpus data investigated, should we use a general reference corpus or a specialized corpus to compute association measures?
To answer our research questions, we replicate the method used in Paquot & Naets (2015) and Paquot (forthcoming a & b): we extract relational co-occurrences (i.e. adjective + noun, adverb + adjective, adverb + verb and verb + direct object relations) from dependency-parsed versions of the BNC, ENCOW14 and L2RC and compute their mutual information (MI) scores with the Ngram Statistics Package (NSP). We then use MI scores computed on the basis of the three reference corpora to analyze the same relational co-occurrences in learner texts rated at different CEFR levels (i.e. B2, C1, C2) sampled from ICLE and VESPA. We compute mean MI scores for each dependency relation in each learner text (cf. Bestgen & Granger, 2014) and compare their distribution across proficiency levels. Distributions in the CEFR-based learner data sets are tested for normality and accordingly compared with ANOVAs followed by Tukey contrasts or Kruskal-Wallis rank sum tests followed by pairwise comparisons using Wilcoxon rank sum tests. Preliminary results confirm previous research by demonstrating that the more advanced learners use more native-like collocations irrespective of the reference corpus. However, MI scores computed on the basis of the three different reference corpora seem to reveal different aspects of phraseological proficiency in learner writing, most notably the use of general vs. genre-specific collocations.
References
Bestgen, Y. & Granger, S. (2014). Quantifying the development of phraseological competence in L2 English writing: An automated approach. Journal of Second Language Writing, 26, 28–41.
Durrant, P. & Schmitt, N. (2009). To what extent do native and non-native writers make use of collocations? IRAL - International Review of Applied Linguistics in Language Teaching, 47(2), 157–177. doi:10.1515/iral.2009.007
Granger, S. & Bestgen, Y. (2014). The use of collocations by intermediate vs. advanced non-native writers: A bigram-based study. International Review of Applied Linguistics in Language Teaching, 52(3), 229–252.
Granger, S., Dagneaux, E., Meunier, F. & Paquot, M. (2009). The International Corpus of Learner English. Version 2. Handbook & CD-ROM. Louvain-la-Neuve: Presses universitaires de Louvain.
Meunier, F. (2015). Introduction to the LONGDALE project. In E. Castello, K. Ackerley & F. Coccetta (Eds.), Studies in Learner Corpus Linguistics: Research and Applications for Foreign Language Teaching and Assessment. Bern: Peter Lang.
Paquot, M. (forthcoming a). Phraseological competence: a missing component in university entrance language tests? Insights from a study of EFL learners' use of statistical collocations. Language Assessment Quarterly.
Paquot, M. (forthcoming b). The phraseological dimension in interlanguage complexity research. Second Language Research.
Paquot, M., Hasselgård, H. & Oksefjell Ebeling, S. (2013). Writer/reader visibility in learner writing across genres: A comparison of the French and Norwegian components of the ICLE and VESPA learner corpora. In S. Granger, G. Gilquin & F. Meunier (Eds.), Twenty Years of Learner Corpus Research: Looking back, Moving ahead. Corpora and Language in Use – Proceedings 1. Louvain-la-Neuve: Presses universitaires de Louvain.
Paquot, M. & Naets, H. (2015). Adopting a relational model of co-occurrences to trace phraseological development. Paper presented at the 3rd Learner Corpus Research Conference, 11-13 September 2015, The Netherlands.
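The association score at the heart of this abstract can be sketched as follows. This is a minimal illustration of pointwise mutual information, log2(p(x, y) / (p(x) p(y))), computed over a list of word pairs. It is not the Ngram Statistics Package used in the study, and the function name `mi_scores` and the eight toy "reference corpus" pairs are assumptions made for the example.

```python
import math
from collections import Counter

# Hypothetical pairs extracted from a reference corpus (toy data, assumption).
REFERENCE_PAIRS = (
    [("strong", "tea")] * 3
    + [("strong", "wind")]
    + [("powerful", "engine")] * 3
    + [("powerful", "tea")]
)


def mi_scores(pairs):
    """Return pointwise MI, log2(p(x, y) / (p(x) * p(y))), for each distinct pair."""
    pair_freq = Counter(pairs)
    left_freq = Counter(x for x, _ in pairs)
    right_freq = Counter(y for _, y in pairs)
    n = len(pairs)
    return {
        (x, y): math.log2(f * n / (left_freq[x] * right_freq[y]))
        for (x, y), f in pair_freq.items()
    }


scores = mi_scores(REFERENCE_PAIRS)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

Because the scores depend entirely on the counts in `REFERENCE_PAIRS`, swapping in pairs from a different reference corpus changes the score assigned to the same learner word combination, which is the effect the study sets out to measure.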
Using relational co-occurrences to trace phraseological development in a longitudinal corpus
L2 research has witnessed a boom in the number of studies that investigate learners' use of multi-word combinations with the help of measures of association strength such as the mutual information (MI) score (e.g. Durrant & Schmitt, 2009; Li & Schmitt, 2010; Granger & Bestgen, 2014). Most studies so far, however, have investigated positional co-occurrences, where words are said to co-occur when they appear within a certain distance from each other (Evert, 2004), and focused more particularly on adjacent word combinations such as adjective + noun combinations. Paquot (2014) is to the best of our knowledge the first study that adopted a relational model of co-occurrences, where the co-occurring words appear in a specific structural relation, to compare three learner sub-corpora made up of texts rated at different CEFR levels (i.e. B2, C1 and C2). She made use of the Stanford CoreNLP suite of tools to parse learner data and extract dependency relations such as dobj(win,lottery), i.e. "the direct object of win is lottery", and then used MI scores computed on the basis of a large reference corpus to analyse pairs of words in specific grammatical relations. Findings showed that adjective + noun relations discriminated well between B2 and C2 levels; adverbial modifiers separated out B2 texts from the C1 and C2 texts; and verb + direct object relations set C2 texts apart from B2 and C1 texts. These results suggest that, used together, phraseological indices computed on the basis of relational dependencies are able to gauge language proficiency. The main objective of this study is to investigate whether relational co-occurrences also constitute valid indices of phraseological development. To do so, we replicate the method used in Paquot (2014) on data from the Longitudinal Database of Learner English (LONGDALE, Meunier 2013).
In the LONGDALE project, the same students are followed over a period of at least three years and data collections are typically organized once per year. The 78 argumentative essays selected for this study were written by 39 French learners of English in Year 1 and Year 3 of their studies at the University of Louvain. Unlike in Year 2, students were requested to write on the same topic in Year 1 and Year 3, which allows us to control for topic, a variable that has been shown to considerably influence learners' use of word combinations (e.g. Cortes, 2004). Relational co-occurrences are operationalized in the form of word combinations used in four grammatical relations, i.e. adjective + noun, adverb + adjective, adverb + verb and verb + direct object. We assign to each pair of words in the LONGDALE corpus its MI score computed on the basis of the British National Corpus, and compute mean MI scores for each dependency relation in each learner text. To explore the links between individual and group phraseological development trajectories, a detailed variability analysis using the method of individual profiling and visualization techniques will also be presented (cf. Verspoor & Smiskova, 2012).
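The scoring step described above (assigning each learner word pair its reference-corpus MI score, then averaging per dependency relation within a text) might look like the following minimal sketch. The function name, the `(relation, governor, dependent)` tuple layout, the MI values and the two toy "essays" are all made up for illustration. Skipping pairs that are absent from the reference corpus is also an assumption, not something the abstract specifies.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical reference MI scores keyed by (relation, governor, dependent).
REFERENCE_MI = {
    ("dobj", "win", "lottery"): 6.0,
    ("dobj", "make", "decision"): 4.0,
    ("amod", "tea", "strong"): 5.0,
}


def mean_mi_by_relation(text_pairs, reference_mi):
    """Average the reference MI scores of a text's pairs, per dependency relation."""
    scores = defaultdict(list)
    for relation, governor, dependent in text_pairs:
        mi = reference_mi.get((relation, governor, dependent))
        if mi is not None:  # assumption: pairs unattested in the reference are skipped
            scores[relation].append(mi)
    return {relation: mean(values) for relation, values in scores.items()}


# Toy essays by the same (hypothetical) learner in Year 1 and Year 3.
year1 = [("dobj", "make", "decision"), ("amod", "tea", "strong"), ("dobj", "do", "mistake")]
year3 = [("dobj", "win", "lottery"), ("dobj", "make", "decision"), ("amod", "tea", "strong")]

year1_means = mean_mi_by_relation(year1, REFERENCE_MI)
year3_means = mean_mi_by_relation(year3, REFERENCE_MI)
```

Comparing `year1_means` with `year3_means` relation by relation gives the kind of per-learner trajectory that the variability analysis would then profile.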