Search CORE

297 research outputs found

Tuning Document Based Hierarchies with Generative Principles

Author: Vossen P.J.T.M.
Publication venue: Universite de Geneve
Publication date: 01/01/2001

An exploration of the relatedness problem between arguments: combining the generative lexicon with inference

Author: Saint Dizier Patrick
Publication venue: HAL CCSD
Publication date: 01/01/2015

Given a controversial issue, argument mining from natural language texts is extremely challenging: domain knowledge is often required together with appropriate forms of inferences. This contribution explores the use of the Generative Lexicon viewed as both a lexicon and a domain knowledge representation

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Web 2.0, language resources and standards to automatically build a multilingual named entity lexicon

Author: Ferrández Sergio
Monachini Monica
Muñoz Rafael
Toral Antonio
Publication venue: Springer Science and Business Media LLC
Publication date: 17/06/2011

This paper proposes to advance the current state of the art of automatic Language Resource (LR) building by taking into consideration three elements: (i) the knowledge available in existing LRs, (ii) the vast amount of information available from the collaborative paradigm that has emerged from the Web 2.0 and (iii) the use of standards to improve interoperability. We present a case study in which a set of LRs for different languages (WordNet for English and Spanish and Parole-Simple-Clips for Italian) are extended with Named Entities (NE) by exploiting Wikipedia and the aforementioned LRs. The practical result is a multilingual NE lexicon connected to these LRs and to two ontologies: SUMO and SIMPLE. Furthermore, the paper addresses an important problem that affects the Computational Linguistics area at present, interoperability, by making use of the ISO LMF standard to encode this lexicon. The different steps of the procedure (mapping, disambiguation, extraction, NE identification and postprocessing) are comprehensively explained and evaluated. The resulting resource contains 974,567, 137,583 and 125,806 NEs for English, Spanish and Italian respectively. Finally, in order to check the usefulness of the constructed resource, we apply it to a state-of-the-art Question Answering system and evaluate its impact; the NE lexicon improves the system’s accuracy by 28.1%. Compared to previous approaches to building NE repositories, the current proposal represents a step forward in terms of automation, language independence, amount of NEs acquired and richness of the information represented.

DCU Online Research Access Service

Learning Ontology Relations by Combining Corpus-Based Techniques and Reasoning on Data from Semantic Web Sources

Author: Wohlgenannt Gerhard
Publication venue: Peter Lang, International Academic Publishers
Publication date: 01/04/2020

The manual construction of formal domain conceptualizations (ontologies) is labor-intensive. Ontology learning, by contrast, provides (semi-)automatic ontology generation from input data such as domain text. This thesis proposes a novel approach for learning labels of non-taxonomic ontology relations. It combines corpus-based techniques with reasoning on Semantic Web data. Corpus-based methods apply vector space similarity of verbs co-occurring with labeled and unlabeled relations to calculate relation label suggestions from a set of candidates. A meta ontology in combination with Semantic Web sources such as DBpedia and OpenCyc allows reasoning to improve the suggested labels. An extensive formal evaluation demonstrates the superior accuracy of the presented hybrid approach

Directory of Open Access Books (DOAB)
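
The corpus-based step in the Wohlgenannt abstract above (vector space similarity of verbs co-occurring with labeled and unlabeled relations) can be pictured with a minimal sketch. It is only an illustration under stated assumptions: the function names `cosine` and `suggest_labels`, the bag-of-verbs representation and the toy verb lists are all hypothetical and are not taken from the thesis, which additionally applies reasoning over Semantic Web sources.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counter objects)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def suggest_labels(unlabeled_verbs, labeled_profiles):
    """Rank candidate relation labels by verb-vector similarity.

    unlabeled_verbs: verbs observed with the unlabeled relation (assumed input).
    labeled_profiles: mapping from candidate label to verbs observed with
    relations that already carry that label (assumed input).
    """
    target = Counter(unlabeled_verbs)
    scores = {
        label: cosine(target, Counter(verbs))
        for label, verbs in labeled_profiles.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data, invented for illustration only.
labeled_profiles = {
    "produces": ["emit", "generate", "produce", "release"],
    "affects": ["damage", "influence", "change", "threaten"],
}
ranking = suggest_labels(["generate", "emit", "cause"], labeled_profiles)
```

In this toy setup `ranking` lists the candidate labels from most to least similar; a real system would build the vectors from corpus counts rather than from hand-written lists.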

Linking and Integrating two Electronic Lexicons

Author: Marinelli Rita
Roventini Adriana
Ruimy Nilda
Ulivieri Marisa
Publication venue: Vassar, USA

Lexicography, much attention is being paid, when building lexical resources, to their interoperability and their easy integration in HLT-NLP applications for enhanced performance. Concerning already existing computational lexicons, on the other hand, their integration and interoperability are attainable, provided their main features offer a field of comparison. The two largest and most extensively encoded electronic lexicons of the Italian language fulfill this essential requirement. Although developed according to two different lexical models, ItalWordNet and PAROLE-SIMPLE-CLIPS in fact present many compatible aspects. Linking and eventually merging these lexical resources in a common representation framework therefore seems a wise move to offer the end-user more exhaustive and in-depth lexical information, combining the potential and the most outstanding features of the two lexical models. This paper reports on the ongoing linking of the two lexicons. The mapping of the ontologies on which the lexicons are structured is described; an overview of the adopted methodology, of the linking process and of the results of the first mapping phase, regarding 1stOrder Entities, is provided. Reciprocal benefits and enhancements for the two resources, which justify the soundness of our linking initiative, are also illustrated.

PUblication MAnagement

On link predictions in complex networks with an application to ontologies and semantics

Author: Entrup Bastian
Publication venue: FB 05 - Sprache, Literatur, Kultur. Germanistik
Publication date: 01/01/2016

It is assumed that ontologies can be represented and treated as networks and that these networks show properties of so-called complex networks. Just like ontologies, our current pictures of many networks are substantially incomplete (Clauset et al., 2008, p. 3ff.). For this reason, networks have been analyzed and methods for identifying missing edges have been proposed. The goal of this thesis is to show how treating and understanding an ontology as a network can be used to extend and improve existing ontologies, and how measures from graph theory and techniques developed in social network analysis and other complex networks in recent years can be applied to semantic networks in the form of ontologies. Given a large enough amount of data, here data organized according to an ontology, and the relations defined in the ontology, the goal is to find patterns that help reveal implicitly given information in an ontology. The approach does not, unlike reasoning and methods of inference, rely on predefined patterns of relations, but it is meant to identify patterns of relations or of other structural information taken from the ontology graph, to calculate probabilities of yet unknown relations between entities. The methods adopted from network theory and social sciences presented in this thesis are expected to reduce the work and time necessary to build an ontology considerably by automating it. They are believed to be applicable to any ontology and can be used in either supervised or unsupervised fashion to automatically identify missing relations, add new information, and thereby enlarge the data set and increase the information explicitly available in an ontology. As seen in the IBM Watson example, different knowledge bases are applied in NLP tasks. An ontology like WordNet contains lexical and semantic knowledge on lexemes while general knowledge ontologies like Freebase and DBpedia contain information on entities of the non-linguistic world.
In this thesis, examples from both kinds of ontologies are used: WordNet and DBpedia. WordNet is a manually crafted resource that establishes a network of representations of word senses, connected to the word forms used to express these, and connect these senses and forms with lexical and semantic relations in a machine-readable form. As will be shown, although a lot of work has been put into WordNet, it can still be improved. While it already contains many lexical and semantical relations, it is not possible to distinguish between polysemous and homonymous words. As will be explained later, this can be useful for NLP problems regarding word sense disambiguation and hence QA. Using graph- and network-based centrality and path measures, the goal is to train a machine learning model that is able to identify new, missing relations in the ontology and assign this new relation to the whole data set (i.e., WordNet). The approach presented here will be based on a deep analysis of the ontology and the network structure it exposes. Using different measures from graph theory as features and a set of manually created examples, a so-called training set, a supervised machine learning approach will be presented and evaluated that will show what the benefit of interpreting an ontology as a network is compared to other approaches that do not take the network structure into account. DBpedia is an ontology derived from Wikipedia. The structured information given in Wikipedia infoboxes is parsed and relations according to an underlying ontology are extracted. Unlike Wikipedia, it only contains the small amount of structured information (e.g., the infoboxes of each page) and not the large amount of unstructured information (i.e., the free text) of Wikipedia pages. Hence DBpedia is missing a large number of possible relations that are described in Wikipedia. Also compared to Freebase, an ontology used and maintained by Google, DBpedia is quite incomplete. 
This, and the fact that Wikipedia is expected to be usable to compare possible results to, makes DBpedia a good subject of investigation. The approach used to extend DBpedia presented in this thesis will be based on a thorough analysis of the network structure and the assumed evolution of the network, which will point to the locations of the network where information is most likely to be missing. Since the structure of the ontology and the resulting network is assumed to reveal patterns that are connected to certain relations defined in the ontology, these patterns can be used to identify what kind of relation is missing between two entities of the ontology. This will be done using unsupervised methods from the field of data mining and machine learning

Giessener Elektronische Bibliothek
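
The idea in the Entrup abstract above, treating an ontology as a network and scoring node pairs that are not yet linked, can be pictured with a small sketch. It is an illustration only: the abstract does not name the measures used, so the Adamic-Adar index here is a swapped-in, well-known link prediction score, and the helper names and the toy graph are hypothetical.

```python
import math
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from edge pairs."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def adamic_adar(graph, u, v):
    """Adamic-Adar index: each common neighbor w contributes 1 / log(degree(w))."""
    return sum(
        1.0 / math.log(len(graph[w]))
        for w in graph[u] & graph[v]
        if len(graph[w]) > 1  # guard against log(1) == 0
    )


def rank_missing_edges(graph):
    """Score every unlinked node pair and return those with a positive score."""
    candidates = []
    for u, v in combinations(sorted(graph), 2):
        if v not in graph[u]:
            score = adamic_adar(graph, u, v)
            if score > 0:
                candidates.append(((u, v), score))
    return sorted(candidates, key=lambda item: item[1], reverse=True)


# Toy hypernymy-like graph, invented for illustration only.
edges = [
    ("dog", "canine"),
    ("wolf", "canine"),
    ("canine", "mammal"),
    ("cat", "feline"),
    ("feline", "mammal"),
]
ranked = rank_missing_edges(build_graph(edges))
```

Scores of this kind could serve as one feature among several in a supervised model, or be thresholded directly in an unsupervised setting; which measures the thesis actually combines is not stated in the abstract.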

Natural Language-based Approach for Helping in the Reuse of Ontology Design Patterns

Author: Aguado de Cea G.
Gómez-Pérez A.
Montiel-Ponsoda Elena
Suárez-Figueroa Mari Carmen
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2008

Experiments in the reuse of Ontology Design Patterns (ODPs) have revealed that users with different levels of expertise in ontology modelling face difficulties when reusing ODPs. With the aim of tackling this problem we propose a method and a tool for supporting a semi-automatic reuse of ODPs that takes as input formulations in natural language (NL) of the domain aspect to be modelled, and obtains as output a set of ODPs for solving the initial ontological needs. The correspondence between ODPs and NL formulations is done through Lexico-Syntactic Patterns, linguistic constructs that convey the semantic relations present in ODPs, and which constitute the main contribution of this paper. The main benefit of the proposed approach is the use of non-restricted NL formulations in various languages for obtaining ODPs. The use of full NL poses challenges in the disambiguation of linguistic expressions that we expect to solve with user interaction, among other strategies

CiteSeerX

Archivo Digital UPM

Computation of verbal predicates in Portuguese: relational network, lexical-conceptual structure and context: the case of verbs of movement

Author: Amaro Raquel
Publication date: 01/01/2009

Doctoral thesis, Linguistics (Computational Linguistics), 2010, Universidade de Lisboa, Faculdade de Letras.
Within the field of Computational Lexical Semantics, and based on the assumption that the performance of computational processes for determining meaning is largely assisted by structured and extensive lexica providing different types of information, this dissertation presents an analysis of Portuguese verbs of movement in order to determine the semantic and syntactic properties of these lexical items and how this information relates to the computation and prediction of the structures in which they occur. The restriction to a specific semantic domain allowed a more accurate determination of the meaning of each verb, through the establishment of lexical-conceptual relations within a relational model of the Lexicon.
The lexical semantic analysis of these verbs is based on the meaning specificities that differentiate hyponym verbs from their hyperonyms and sister nodes. The identification of the meaning components shared and not shared by verbs of the same semantic domain motivates the determination of the relevant semantic information to be stated at the lexical entry level, as well as the structure of this information. This work puts forth a proposal for a Portuguese wordnet of verbs of movement, addressing the different levels of analysis that are relevant for a coherent encoding of the verbs of this class: the way lexical items are grouped in concept-denoting sets and the relations established between these sets reflect the conceptual and semantic properties of the lexical items, and the resulting organization of the lexicon makes it possible to determine which information is shared. The development of a wordnet for Portuguese verbs of movement required the definition of the top nodes of the net as well as of some other coding options, allowing conceptual inheritance from the higher to the lower nodes in the hierarchy to be tested. The resulting network revealed the semantic and syntactic diversity of directly related verbs, namely that semantic properties such as argument structure or Aktionsart properties are directly related to the meaning specificities of the concepts denoted, but are not straightforwardly inherited or conditioned by the semantic domain to which a given verb belongs. Based on the developed wordnet, a decompositional analysis of the meaning of the Portuguese verbs of movement is presented, focusing on the meaning specificities that differentiate each hyponym concept from its hyperonym.
This analysis revealed semantic incorporation patterns different from those described by Talmy (1985) for Romance languages and resulted in the proposal of a new set of semantic components, comprising the elements lexicalized by the verbs under study, and extendable to the analysis of verbs from other semantic domains. The semantic content specific to each hyponym differentiates co-hyponym verbs and explains co-hyponym incompatibility: co-hyponyms lexicalizing opposite or otherwise incompatible values for the same semantic component are incompatible (i.e., do not co-occur). The lexicalization of the semantic components considered affects the inheritance of the hyperonym's properties to different degrees, namely with regard to argument structure (number of arguments, subcategorization properties and semantic restrictions on the type of the arguments selected) and Aktionsart properties. The following salient patterns of lexicalization were observed: the incorporation of restrictions on the semantic components SOURCE (initial location or position) and GOAL (final location or position) results in an increase in the number of overt arguments of the hyponyms, whereas the lexicalization of these components results in a decrease in the number of overt arguments of the hyponyms, with respect to the hyperonym's argument structure. The lexicalization of PATH (intermediate locations between the SOURCE and the GOAL) adds one more overt argument to the argument structure of the hyperonym verb, usually corresponding to a GROUND argument (an external object with respect to which the event is put in perspective) realized in object position; the incorporation of restrictions on this semantic component increases the number of overt arguments, reflected in the selection of an overt argument that denotes the PATH of the movement event and is introduced by the preposition por (through).
Aktionsart shifts within the wordnet of Portuguese verbs of movement, i.e., hyponyms that display Aktionsart values different from those of their hyperonyms, occur with the lexicalization of GOAL and SOURCE. The lexicalization of SOURCE and GOAL results in accomplishment or achievement type events, since the determination of a specific final location or position (GOAL) or initial location or position (SOURCE) establishes a limit to the event, shifting an activity type event to an accomplishment or achievement type event. The representation of lexical items is done within the Generative Lexicon (GL) framework and comprises three distinct levels: argument structure, event structure and qualia structure. Lexical items are, in turn, integrated in a lexical inheritance structure. In order to better characterize the Portuguese verbs of movement, specifically with regard to subcategorization properties, the modeling of prepositions in WordNet.PT (WN.PT) and their semantic representation at the lexical entry level in the GL framework are proposed. The integration of prepositions in WN.PT follows previous research on ontological models for the representation of prepositions, namely with regard to the concepts denoted by prepositions that are consensually adopted in traditional grammars and state-of-the-art models. This results in a coherent and unified treatment of the semantically full prepositions that introduce verbal arguments, but also of argument-marking prepositions. Using these levels and elements of representation, a complete representation of Portuguese verbs of movement is proposed, accounting for the percolation of information within the lexicon, for the impact of semantic lexicalization on the semantic and syntactic properties of verbs and for verbal co-hyponym compatibility.
The recursive use of available lexical structures allows the percolation of information through the hyponymy trees and enables a coherent and economical encoding of the information, including significant subcategorization properties. The resulting lexical structures demonstrate that hyponymy can replace a semantic type lattice with regard to establishing and defining semantic properties by subtyping strategies. In addition, the permeability to context granted by the GL generative mechanisms, in particular underspecification and co-composition, ensures the flexibility needed to explain the diversity of syntactic behaviors directly related to lexical semantic properties. For the definition of a computational lexicon that models the semantic and syntactic properties of lexical items, the integration of GL informational structures in wordnets is proposed: GL lexical structures provide the structured lexical entries, and WordNet, by its nature, provides the necessary lexical hierarchy that gives access to other structures in the lexicon. The integration of GL representation levels in a wordnet, namely argument structure, qualia structure and event structure, demonstrates how wordnets can support a finer-grained lexical description that provides the basis for accounting for several lexical semantic phenomena, without compromising the architecture of the model. The integration of argument structure information in WN.PT is achieved through the establishment of three new relations: the SELECTS/IS SELECTED BY relation, the INCORPORATES/IS INCORPORATED IN relation and the SELECTS BY DEFAULT/IS SELECTED BY DEFAULT BY relation. The integration of qualia structure in wordnets is attained by associating lexical-conceptual relations with qualia roles, without any loss of information, in what constitutes a simple and low-cost process. The expression of event structure in wordnets is accomplished through a new set of features (Event type, Arguments, Subevents, Restrictions and Head) that allows the internal properties of events to be associated with synsets and encoded in the database. The systematic representation of event structure information, besides allowing the description of argument order, enriches the descriptive power of these resources.
The integration of GL representation levels in wordnets results in richer and more structured repositories of lexical semantic information, which include information on qualia roles and allow the extraction of information on the argument and event structures of lexical items, that is, generative lexica on which mechanisms such as co-composition, selective binding and type coercion can operate. The semantic and syntactic properties considered in the lexical entries of the verbs analyzed also provide clues to account for restrictions on the occurrence of these verbs in some constructions. Paying particular attention to the selection of arguments denoting location and GROUND, syntactically realized in object position, to the expression of directed movement in Portuguese, to the occurrence of verbs of movement in middle and non-causative constructions and to the distribution of the clitic -se in these constructions, this work also presents an analysis of the different linguistic behaviors of Portuguese verbs of movement in these contexts and of the relation between these behaviors and the lexical semantic properties of the verbs. Although it does not allow an exhaustive treatment of all the behaviors observed, the lexical semantic characterization proposed in this work is a necessary step towards the treatment of the phenomena observed, advancing some explanations that account for these different behaviors. Verbs that lexicalize SOURCE and GOAL or PATH select objects that denote GROUND, i.e., true arguments that denote concrete and delimited entities, syntactically expressed by NPs.
The possibility of occurring in directed movement structures, i.e., with PPs that express the SOURCE and the GOAL of the movement, is directly related to the semantic and syntactic properties of the verbs analyzed: change-of-location verbs license and/or restrict the occurrence of these constituents, according to the semantic components lexicalized and to their subcategorization properties. Also with regard to the expression of directed movement in Portuguese, the data analyzed show that the distribution of Portuguese verbs of movement with GOAL-denoting PPs introduced by the preposition a is conditioned by the type of movement event denoted by the verb (manner of movement vs. directed movement), but also by the Aktionsart properties of the verbs, since PPs introduced by a induce a punctual interpretation of the final state of the event, thus refuting analyses of verbs of movement in Romance languages based only on the restrictions on the occurrence of these verbs with this preposition. The correlation between the prominence of an external cause or agent and the impossibility of occurring in non-causative constructions accounts for the distribution of verbs of movement in these constructions: verbs that lexicalize INTENTION or a strong MANNER component implying the action of an external cause or agent do not enter non-causative constructions.
The analysis of the distribution of the clitic -se in middle, non-causative and passive constructions raised the hypothesis that the clitic induces an interpretation in which an external actor is involved in the event: passive constructions necessarily presuppose an external cause and therefore require the clitic; in middle constructions the clitic marks the cases in which the involvement of an external actor in the event is presupposed; and in non-causative constructions the clitic marks the correlation between the agent and the theme/patient of the event, forcing a non-causative reading with [-animate] syntactic subjects. This work makes clear that the modeling of lexical items of a given grammatical category is not independent of that of items of other categories with which they can occur, which necessarily widens the scope of the analysis. Furthermore, it demonstrates that the modeling of lexical items in the WordNet model comprises a motivated lexical inheritance structure, allowing an adequate and economical description of lexical items and fostering the construction of large-scale lexical resources for computational purposes.

Universidade de Lisboa: Repositório.UL