5 research outputs found

    Pseudo-relatives complement of perception predicates

    Pseudorelatives (PRs) are single constituents formed by a DP (the head) and an embedded clause headed by the complementizer que (1). The relation between the head and the embedded clause is one of predication. PRs do not display a restrictive reading but a situational one.
    (1) He visto a [PR Juan que corría]
        I.have seen a Juan that ran
        'I saw Juan running'
    Previous literature on pseudorelatives offers different explanations of their internal structure, the way PRs relate to the matrix predicate, the position PRs can occupy within the matrix clause, and the function the head of the PR has within the embedded clause. The goal of this thesis is to examine these four aspects in depth in the light of three new observations:
    i) Previous literature only considers subject-gap PRs (1), in which the head of the PR is the subject of the embedded predicate. However, I propose the Object-gap PR generalization: object-gap PRs (2), in which the head of the PR is either the direct or the indirect object of the embedded predicate, are available in languages that allow Object Clitic Doubling (Spanish, Greek). Languages lacking Object Clitic Doubling do not allow object-gap PRs (Italian, French or Portuguese).
    (2) a. He visto a María_i que *(la_i) traían en coche
           I.have seen a María that her-ACC brought.3.PL by car
           'I saw María who was being brought by car'
        b. He visto a Paco_i que *(le_i) pedían la hora unos chavales
           I.have seen a Paco that le-DAT asked.3PL the time some guys
           'I saw Paco who was being asked the time by some guys'
    ii) The head of the PR needs to be animate. Animacy becomes a crucial factor for object-gap PRs: if the object head of the PR is not animate, the situational reading is not obtained (3).
    (3) He visto el tren que lo ?? reparaban en cocheras / llegaba a cocheras
        I.have seen the train that lo-ACC fixed-3.PL in sheds / arrived to sheds
        'I have just seen the train being fixed up in the shed / arriving at the shed'
    iii) PRs can only appear in the complement position of the matrix predicate.
    Given the consequences of these new observations, the previous control and raising analyses are discarded. A control analysis cannot account for object-gap PRs because the controller can never control the direct object of the embedded predicate. The raising analysis is ruled out because it cannot explain the mandatory presence of object clitics within the embedded clause, the double case assignment of the head in subject-gap and indirect object-gap PRs, or the motivation for the movement of the head to its surface position. Thus, a dislocation analysis is proposed in which the head of the PR is base-generated in the left periphery of the embedded clause. This accounts for the availability of subject-gap and object-gap PRs, for the presence of clitics in object-gap PRs, and for pro in subject-gap PRs. Topics for further research include an explanation for languages that do not allow object-gap PRs (e.g. Italian) but do allow clitic left dislocation structures, the concrete properties that allow perception predicates to select for PRs, and the secondary predication character of PRs.

    The object-gap pseudorelative generalization

    Previous literature contains two different points of view regarding the subject-object asymmetry related to the DP head of pseudorelatives (PRs). Some authors claim that the DP head can only be interpreted as the subject of the embedded predicate (subject-gap PRs). Other authors point to the possibility of finding other constituents (e.g. the direct object) in head position as well (object-gap PRs). In this paper I claim that certain languages only allow the DP head to be the subject of the embedded predicate, that is, they only allow subject-gap PRs, whereas other languages allow both subject-gap and object-gap PRs. The aim of this paper is therefore to present the object-gap pseudorelative (PR) generalization, which accounts for the cross-linguistic availability of subject-gap and object-gap PRs: the availability of object-gap PRs is subject to object clitic doubling. The paper is structured as follows. Section 1 introduces PRs. Section 2 presents data about subject-gap and object-gap PRs. Section 3 gives some remarks on object clitic doubling. Section 4 presents the object-gap PR generalization. Conclusions and further research issues are presented in section 5.

    Desambiguación de construcciones con se con aprendizaje automático

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Filosofía y Letras, Departamento de Lingüística, Lenguas Modernas, Lógica y Fª de la Ciencia y Tª de la Literatura y Literatura Comparada. Defense date: 10-12-2021.
    Spanish se constructions constitute a linguistic phenomenon that challenges Natural Language Processing (NLP) tasks such as part-of-speech or dependency relation tagging. There are three main reasons why se is a difficult topic for NLP: first, the high frequency of se in Spanish; second, the nine different syntactic constructions in which se appears, adding information of diverse nature depending on the context; third, the fact that se displays no gender or number features that could help disambiguate its type. The main goal of this thesis is to improve the state-of-the-art results on automatic morphosyntactic se analysis on the basis of two hypotheses: the grouping hypothesis (GH) and the subcategorization frame hypothesis (SFH). The thesis proposes a new annotation scheme for se that connects the different constructions through a transitivity gradient (Moreno Cabrera, 2004). The new annotation scheme is applied to the SE-corpus, a European Spanish corpus of 3,100 sentences containing the word se. The SE-corpus belongs to the news, leisure and daily life domain of CORPES XXI (Real Academia Española, 2018) and has been manually annotated as part of this research work. The SE-corpus is used to train different models with UDPipe 1.2 to test whether the new annotation scheme can be learnt by the neural networks that underlie the dependency parser. The resulting models are evaluated on an additional gold standard test corpus of 100 sentences containing the form se, also obtained from CORPES XXI. The best model yields a LAS F-score of 86.97 points and a UAS F-score of 89.65 points. Regarding se analysis, the best model yields a LAS F-score of 82.55 points and a UAS F-score of 98.16 points.
    The main contributions of this thesis are: a new annotation scheme for se adapted to Universal Dependencies' guidelines, manual annotation guidelines for Spanish se disambiguation, the raw and annotated versions of the SE-corpus, and the best resulting model.
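    The LAS and UAS figures reported above can be illustrated with a minimal Python sketch. It assumes that gold and predicted sentences share the same tokenization, in which case the attachment F-score reduces to plain accuracy over tokens. The function name `attachment_scores`, the `only_form` parameter and the UD-style labels in the example are hypothetical; they are not taken from the thesis or from its annotation scheme, and the thesis's own evaluation script is not shown in the abstract.

```python
from typing import List, Optional, Tuple

# One token: (word form, index of its head, dependency label). Head 0 is the root.
Token = Tuple[str, int, str]


def attachment_scores(
    gold: List[Token],
    pred: List[Token],
    only_form: Optional[str] = None,
) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages.

    UAS counts tokens whose predicted head is correct; LAS additionally
    requires the dependency label to match. If only_form is given, only
    tokens with that (lowercased) form are scored, e.g. only_form="se".
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sentences must have the same tokens")
    pairs = [
        (g, p)
        for g, p in zip(gold, pred)
        if only_form is None or g[0].lower() == only_form
    ]
    if not pairs:
        return 0.0, 0.0
    uas_hits = sum(1 for g, p in pairs if g[1] == p[1])
    las_hits = sum(1 for g, p in pairs if g[1] == p[1] and g[2] == p[2])
    return 100.0 * uas_hits / len(pairs), 100.0 * las_hits / len(pairs)


# Illustrative sentence "Se venden pisos" with made-up gold and predicted analyses.
gold = [("Se", 2, "expl:pass"), ("venden", 0, "root"), ("pisos", 2, "nsubj")]
pred = [("Se", 2, "expl:impers"), ("venden", 0, "root"), ("pisos", 2, "obj")]

print(attachment_scores(gold, pred))                  # scores over all tokens
print(attachment_scores(gold, pred, only_form="se"))  # scores over se tokens only
```

    In this toy example every head is attached correctly but the label of se is wrong, so the se-only UAS is high while the se-only LAS is not. The same kind of gap appears between the se-specific UAS and LAS in the abstract, although the sketch makes no claim about why the reported figures differ.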

    Clasificación de construcciones con se en español: de modelos de bolsa de palabras a modelos de lenguaje

    Spanish se constructions are a complex linguistic phenomenon that challenges Natural Language Processing (NLP) tasks such as part-of-speech or dependency relation tagging. Se is a high-frequency word that appears in nine different types of syntactic constructions and adds information of diverse nature depending on the context. To address the classification problem that Spanish se constructions pose in an efficient way, this study proposes a tagging system for se applied to a corpus of 2,140 sentences. This corpus is used in a classification experiment in which 9 classifiers based on machine learning models and a dependency parser are tested. Results show that pre-trained language models based on the transformer architecture reach the highest accuracy (0.83) and F-score (0.70) values.
    The authors acknowledge financial support from PID2019-106827GB-I00 / AEI / 10.13039/501100011033, from the European Regional Development Fund, and from the Spanish Ministry of Economy, Industry, and Competitiveness - State Research Agency, project TIN2016-76406-P (AEI/FEDER, UE).
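    The abstract above reports an accuracy noticeably higher than the F-score. A minimal Python sketch shows one way such a gap can arise, under the assumption that the F-score is macro-averaged over classes; the abstract does not say which averaging was used. The function name `accuracy_and_macro_f1` and the class labels in the example are hypothetical and do not come from the study's tagging system.

```python
from collections import Counter
from typing import List, Tuple


def accuracy_and_macro_f1(gold: List[str], pred: List[str]) -> Tuple[float, float]:
    """Return (accuracy, macro-averaged F1) for two parallel label lists."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was missed
    labels = sorted(set(gold) | set(pred))
    f1_per_label = []
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1_per_label.append(2 * tp[label] / denom if denom else 0.0)
    accuracy = sum(tp.values()) / len(gold)
    macro_f1 = sum(f1_per_label) / len(f1_per_label)
    return accuracy, macro_f1


# Made-up, imbalanced data: the classifier always predicts the frequent class.
gold = ["reflexive"] * 8 + ["passive"] * 2
pred = ["reflexive"] * 10
acc, f1 = accuracy_and_macro_f1(gold, pred)
print(f"accuracy={acc:.2f} macro-F1={f1:.2f}")
```

    With nine construction types of unequal frequency, a classifier can score well on accuracy while doing poorly on rare classes, which pulls a macro-averaged F-score down. Whether this explains the reported 0.83 versus 0.70 is not something the abstract settles.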