287 research outputs found

    Reflexive pronouns in Spanish Universal Dependencies

    In this paper, we argue that the annotation of Spanish reflexives in current Universal Dependencies treebanks is an unsolved problem that clearly affects the accuracy and consistency of current parsers. We evaluate different proposals for fine-tuning the various categories and discuss the remaining open issues. We believe that the solution could lie in a multi-layered annotation that combines the dependency relation with the so-called token features, rather than in expanding the number of categories on a single layer. We apply this proposal to the v2.5 Spanish UD AnCora treebank and provide a categorized conversion table that can be applied with a Python script.
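As a rough illustration of how a conversion table could be applied to a treebank, the sketch below rewrites the dependency relation and the feature column of one CoNLL-U token line. The `CONVERSION` mapping, the category names and the function names are hypothetical and are not taken from the paper's actual table or script; only the 10-column CoNLL-U layout and the UD labels `expl:pass`, `expl:impers`, `expl:pv` and `Reflex=Yes` are standard.

```python
# Hypothetical conversion table: manual category -> (new DEPREL, features to set).
# The real table from the paper is NOT reproduced here.
CONVERSION = {
    "passive": ("expl:pass", {"Reflex": "Yes"}),
    "impersonal": ("expl:impers", {"Reflex": "Yes"}),
    "inherent": ("expl:pv", {"Reflex": "Yes"}),
}


def parse_feats(feats: str) -> dict:
    """Turn a CoNLL-U FEATS string such as 'Case=Acc|Person=3' into a dict."""
    if feats == "_":
        return {}
    return dict(item.split("=", 1) for item in feats.split("|"))


def format_feats(feats: dict) -> str:
    """Serialise features alphabetically (case-insensitive), as CoNLL-U expects."""
    if not feats:
        return "_"
    ordered = sorted(feats.items(), key=lambda kv: kv[0].lower())
    return "|".join(f"{name}={value}" for name, value in ordered)


def convert_token(line: str, category: str) -> str:
    """Apply the table entry for `category` to one tab-separated token line."""
    cols = line.split("\t")
    if len(cols) != 10:
        raise ValueError("a CoNLL-U token line has exactly 10 columns")
    deprel, extra_feats = CONVERSION[category]
    feats = parse_feats(cols[5])   # column 6: FEATS
    feats.update(extra_feats)
    cols[5] = format_feats(feats)
    cols[7] = deprel               # column 8: DEPREL
    return "\t".join(cols)


# Toy token line, invented for the example.
token = "2\tse\tél\tPRON\t_\tCase=Acc|Person=3|PronType=Prs\t3\tobj\t_\t_"
converted = convert_token(token, "passive")
```

Keeping the relation and the features in separate columns mirrors the multi-layered idea in the abstract: the table changes both layers at once without adding new categories to either.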

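    Reflexive pronouns in Spanish Universal Dependencies: from annotation to automatic morphosyntactic analysis (Los pronombres reflexivos en las Universal Dependencies en español: desde la anotación hacia el análisis morfosintáctico automático)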

    In this follow-up article to Degraeuwe and Goethals (2020), we present the annotation scheme used to reannotate the 7298 potentially reflexive pronouns included in the Universal Dependencies Spanish AnCora v2.6 treebank, which resulted in significant modifications for the “Case” feature (100% changed) and the dependency relations (87% changed). Next, we evaluate the performance of spaCy v3.2.2 and Stanza v1.3.0 (both trained on AnCora v2.8, and thus based on our reannotations) on the AnCora v2.8 test set, which yielded weighted F1 scores of up to 0.88 and 0.98 for the “Case” and “Reflex” features, respectively, and of up to 0.71 for the dependency relations. Finally, the error analysis of the spaCy results underlines the (generalisation) potential of the model, but also reveals some of the remaining issues in the automatic morphosyntactic analysis of reflexive pronouns in Spanish, such as determining whether expletive relations denote an impersonal, passive or inherently reflexive use. This research has been carried out as part of a PhD fellowship on the IVESS project (file number 11D3921N), funded by the Research Foundation – Flanders (FWO).
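For readers unfamiliar with the metric, the weighted F1 score reported above is the mean of the per-label F1 scores, each weighted by how often that label occurs in the gold data. The sketch below is a minimal plain-Python version of that definition; the function name and the toy labels are assumptions made for the example, and this is not the evaluation code used in the article.

```python
from collections import Counter


def weighted_f1(gold: list, pred: list) -> float:
    """Support-weighted average of per-label F1 scores."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    support = Counter(gold)  # number of gold occurrences per label
    total = len(gold)
    score = 0.0
    for label, n_gold in support.items():
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = n_gold - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += f1 * n_gold / total  # weight by the label's share of the gold data
    return score


# Toy "Case" values for four pronouns (invented, not from AnCora).
gold_case = ["Acc", "Acc", "Dat", "Acc"]
pred_case = ["Acc", "Dat", "Dat", "Acc"]
score = weighted_f1(gold_case, pred_case)
```

Because frequent labels dominate the average, a high weighted F1 can coexist with weak performance on rare labels, which is one reason an error analysis per category remains informative.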

    Measuring the effect of the heat flow through a chimney with different altitudes of a rescaled model and validating the results with TRNSYS, a computer-simulated model

    The thesis falls under building physics and concerns the measurement of natural heat flow in an apartment building. The objective of the measurements is to verify that the taller a chimney in a building is, the higher the flow through that chimney becomes, by adjusting parameters inside and outside the building such as wind velocity, indoor temperature and the surface area of the window opening. The flow was measured in two ways. In the first, measurements were taken on an existing model with wooden walls 1 cm thick and dimensions of 100 x 79 x 150 centimeters. Four halogen lamps were placed in the construction, and the only element that could cause the heat flow was the chimney, made of PVC. The heat flow was then measured with 1, 2, 3 or 4 lamps turned on, by taking the temperature inside and outside the model and using an air velocity meter placed on top of the chimney. Because the measurements were taken in a laboratory, no wind was available, so a computer-simulated model was used to verify what happens when the outside wind velocity changes. Gauging the flow with this second method was more difficult because a new program, TRNFLOW, was used. This program is a function of TRNSYS17, and in it we created a model similar to our rescaled model. The results of the TRNSYS program and of the measurements were similar. As expected, the heat flow through the chimney increased as the chimney increased in height. The opening of the windows did not have a large effect on the results, but when the opening was very small the TRNSYS program could not produce a good simulation. What was very noticeable is that when the outside wind velocity rose above 3 m/s, the velocity through the longest chimney changed directly to the lowest flow, so the flow through the chimney is very sensitive to the outside wind velocity. Finally, to verify the results, a mass balance was set up and the theory of thermal buoyancy was applied. Degraeuwe, J.; Vandendorpe, B. (2013). Measuring the effect of the heat flow through a chimney with different altitudes of a rescaled model and validating the results with TRNSYS, a computer-simulated model. http://hdl.handle.net/10251/34198
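The expected link between chimney height and flow can be illustrated with a common textbook approximation of buoyancy-driven (stack) flow, Q = C_d · A · sqrt(2 · g · H · (T_i − T_o) / T_i), with temperatures in kelvin. The abstract does not say which formula the thesis applied, so the sketch below is only an assumption-laden illustration: the function name, the discharge coefficient of 0.65, the opening area and the temperatures are all hypothetical values chosen for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def stack_flow(height_m: float, t_inside_k: float, t_outside_k: float,
               area_m2: float, discharge_coeff: float = 0.65) -> float:
    """Volumetric flow (m^3/s) from the simplified stack-effect formula.

    Valid only when the inside air is warmer than the outside air;
    wind effects are ignored entirely.
    """
    if height_m <= 0 or area_m2 <= 0:
        raise ValueError("height and area must be positive")
    if t_inside_k <= t_outside_k:
        raise ValueError("formula assumes inside air warmer than outside air")
    driving_term = 2 * G * height_m * (t_inside_k - t_outside_k) / t_inside_k
    return discharge_coeff * area_m2 * math.sqrt(driving_term)


# Hypothetical values: 0.005 m^2 opening, 303 K inside, 293 K outside.
flows = {h: stack_flow(h, 303.0, 293.0, 0.005) for h in (0.25, 0.5, 1.0)}
```

Under this approximation the flow grows with the square root of the height, so quadrupling the chimney height doubles the flow. The formula contains no wind term, which is consistent with the need for a separate simulation to study the sensitivity to outside wind velocity.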

    Towards a more refined insight in the critical motivating features of choice : an experimental study among recreational rope skippers

    Objective: Whether choice is a motivation- and engagement-enhancing practice is a much debated question, both in theory and in practice. The present study therefore examined the impact of different types of choice on engagement and intended perseverance. Design and method: In a sample of Belgian rope skippers (n = 159; M-age = 17.17; SD-age = 8.43), an experimental field design was implemented in which three different choice conditions were compared to a no-choice comparison group. Results: The effects of being offered choice with regard to the type of exercises (i.e., option choice) were mixed: choice yielded a clear engagement- and perseverance-enhancing effect compared to a no-choice control group when the offered options differed clearly from one another (i.e., high-contrast option choice), while no benefits were observed when the options closely resembled one another (i.e., low-contrast option choice). Athletes' involvement in the order of exercises during a training session (i.e., action choice) tended to enhance their engagement, but not their intended perseverance, compared to a no-choice control group. Finally, all experimentally offered choices had a positive effect on two aspects of autonomy need satisfaction, namely perceived choice and felt volition. These two variables functioned as a chain of mechanisms through which the different types of choice related to athlete engagement and intended perseverance. These effects emerged irrespective of the rope skippers' dispositional indecisiveness. Conclusion: The discussion highlights the importance of a nuanced view on the topic of choice, contrasting the pros and cons associated with each type of choice.

    Thematic vocabulary selection for didactic purposes: evaluation of a quantitative approach

    The aim of this study is to evaluate the results of a quantitative approach to the thematic selection of vocabulary for didactic purposes. We describe in detail how three quantitative measures (absolute frequency, keyness and dispersion) are configured and combined to automate the selection of specific vocabulary from a specialized corpus. We then evaluate whether the automatic selection is confirmed by the judgements of teachers of Spanish as a foreign language (SFL). The results of this evaluation experiment show that in more than 85% of the cases the output of the quantitative selection method is accepted by at least half of the teachers. This observation is also supported statistically: an interrater reliability test indicates substantial agreement (Cohen’s kappa = 0.61) between the judgements of the teachers and the automatic selection. Degraeuwe, J.; Goethals, P. (2020). La selección temática del vocabulario para fines didácticos: evaluación de un acercamiento cuantitativo. Revista de Lingüística y Lenguas Aplicadas. 15(1):1-14. https://doi.org/10.4995/rlyla.2020.11969
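Cohen's kappa, the agreement statistic mentioned in the abstract, corrects the observed agreement between two raters for the agreement expected by chance: kappa = (p_o − p_e) / (1 − p_e). The sketch below is a minimal plain-Python version of that formula. Treating the automatic selection as one rater and a single aggregated teacher judgement as the other is an assumption made for the example (the abstract does not say how the teachers' judgements were combined), and the toy data are invented.

```python
from collections import Counter


def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Chance-corrected agreement between two raters over the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement: chance overlap given each rater's label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / n ** 2
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single identical label")
    return (observed - expected) / (1 - expected)


# 1 = word selected as thematic vocabulary, 0 = not selected (toy data).
automatic = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
teachers = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
kappa = cohens_kappa(automatic, teachers)
```

In the widely used Landis and Koch (1977) scale, values from 0.61 to 0.80 are labelled "substantial", which is the band the reported value of 0.61 falls into.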

    A novel approach to screen and compare emission inventories

    A methodology is proposed to support the evaluation and comparison of different types of emission inventories, and more specifically the comparison of bottom-up versus top-down approaches. The strengths and weaknesses of the methodology are presented and discussed on the basis of an example. The approach results in a “diamond” diagram that is useful for flagging anomalous behavior in the emission inventories and for gaining insight into possible explanations. In particular, the “diamond” diagram is shown to provide meaningful information on the discrepancies between the total emissions reported by macro-sector and pollutant, the contribution of each macro-sector to the total emissions released per pollutant, and the identification and quantification of the different factors causing the discrepancies between total emissions. Its main strength as an indicator is that it allows the relative contributions of activity and weighted emission factors to be investigated. A practical example in Barcelona is used to test the approach and to provide relevant information on the analyzed emission datasets. The tests show that the proposed methodology is able to flag inconsistencies in the existing inventories. The methodology may be useful to regional and urban inventory developers as an initial evaluation of the consistency of their inventories. JRC.H.2-Air and Climat