2 research outputs found

Análise e representação de construções adjectivais para processamento automático de texto. Adjectivos intransitivos humanos

Author: Carvalho Paula Cristina
Publication venue: HAL CCSD
Publication date: 01/12/2007

This dissertation focuses on the analysis and formalization of the lexico-syntactic properties of intransitive adjectives in contemporary European Portuguese, that is, adjectives that occur with a human subject and take no complements. One of the underlying motivations for choosing this subject is the apparent lack of descriptive economy resulting from the double classification of many lexical entries as both nouns and adjectives. A substantial number of these adjectives have been classified as nouns in order to account for the cases in which they appear in typically nominal syntactic positions. This ambiguity is echoed in the lexical phenomenon traditionally known as improper derivation, or conversion. In this study, we argue that some human adjectives can superficially fill the syntactic slot of head of a noun phrase. This analysis is based on the fact that, in those syntactic constructions, the adjectives generally maintain some of the properties they would have in an adnominal context, and that it is always possible to reconstruct the human noun to which the adjective is related. Among the several constructions studied here, we focus on: (i) characterizing indefinite constructions, where the adjective appears after an indefinite article; (ii) cross-constructions, where the adjective occupies the typical position of head of a noun phrase; (iii) exclamatives expressing insult; and others, whose syntactic, semantic and discursive details we also try to clarify. The research is based on the analysis of 4,250 adjectival lemmas, which are organized into several syntactic-semantic subclasses according to the theoretical and methodological principles of Lexicon-Grammar, established in the Harrisian framework of transformational operator grammar. All linguistic information was formalized in lexicon-grammar matrices which, as we illustrate, can be exploited in several NLP tasks, namely disambiguation and automatic text analysis.

Thèses en Ligne

Expressões proverbiais do português - usos, variação formal e identificação automática

Author: Reis Sónia Margarida Moreira
Publication date: 12/10/2020

Proverbs are an expression of the culture of a society and are connected to the most diverse areas of human experience. Expressions of this type occur in a wide variety of text types and perform different rhetorical functions in discourse, into which they are integrated through several mechanisms that are not always easy to detect formally. Their interference in the processes of discourse coherence and cohesion (e.g. reference), together with their formal variation, poses serious challenges to Natural Language Processing (NLP) and requires that they be precisely identified and delimited. This project aims at the automatic identification of Portuguese proverbs (and their variants) in texts, in order to better characterize their use, both qualitatively and quantitatively. This will allow the definition of frequency indices and, from these as well as other criteria, the determination of the lexical availability of the paremiological units (the proverb and its variants). This information is relevant, for example, to the development of complementary diagnostic or therapeutic instruments for certain language pathologies, or to the construction of didactic games for the teaching and learning of Portuguese as a mother tongue or as a non-native language, possibly computer-assisted.

Thèses en Ligne

HAL Descartes

Sapientia