Combining Formal Concept Analysis and Translation to Assign Frames and Semantic Role Sets to French Verbs
In Natural Language Processing, verb classifications have been shown to be useful both theoretically (to capture syntactic and semantic generalisations about verbs) and practically (to support factorisation and the supervised learning of shallow semantic parsers). Acquiring such classifications manually is both costly and error-prone, however. In this paper, we present a novel approach for automatically acquiring verb classifications. The approach uses Formal Concept Analysis (FCA) to build a concept lattice from existing linguistic resources, and stability and separation indices to extract from this lattice those concepts that most closely capture verb classes. The approach is evaluated on an established benchmark and shown to differ from previous approaches, and in particular from clustering approaches, in two main ways. First, it supports polysemy (because a verb may belong to several classes). Second, it naturally provides a syntactic and semantic characterisation of the verb classes produced (by creating concepts which systematically associate verbs with their syntactic and semantic attributes).
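The pipeline this abstract describes — build a concept lattice from a verb-by-attribute context, then filter concepts by an index such as stability — can be sketched on a toy formal context. The verbs, attributes, and threshold below are invented for illustration and are not taken from the paper's resources; exact stability computation is exponential and only feasible here because the lattice is tiny.

```python
# A minimal FCA sketch: formal concepts over a toy verb context,
# filtered by an exact (exponential-time) stability index.
from itertools import combinations

# Formal context: verbs (objects) x syntactic/semantic attributes (illustrative).
context = {
    "manger":  {"NP_V_NP", "AGENT", "PATIENT"},
    "devorer": {"NP_V_NP", "AGENT", "PATIENT"},
    "courir":  {"NP_V", "AGENT"},
    "marcher": {"NP_V", "AGENT"},
}
attributes = set().union(*context.values())

def intent(verbs):
    """Attributes shared by all verbs in the set."""
    if not verbs:
        return set(attributes)
    return set.intersection(*(context[v] for v in verbs))

def extent(attrs):
    """Verbs possessing all attributes in the set."""
    return {v for v, a in context.items() if attrs <= a}

# Enumerate all formal concepts: (extent, intent) pairs closed under both maps.
concepts = set()
verbs = list(context)
for r in range(len(verbs) + 1):
    for combo in combinations(verbs, r):
        e = extent(intent(set(combo)))
        concepts.add((frozenset(e), frozenset(intent(e))))

def stability(ext):
    """Fraction of extent subsets whose intent equals the concept's intent."""
    ext = list(ext)
    target = intent(set(ext))
    hits = sum(
        1
        for r in range(len(ext) + 1)
        for sub in combinations(ext, r)
        if intent(set(sub)) == target
    )
    return hits / 2 ** len(ext)

# Keep non-empty concepts passing an illustrative stability threshold.
classes = sorted(
    (sorted(e) for e, i in concepts if e and stability(e) >= 0.5),
    key=lambda c: (len(c), c),
)
print(classes)
# → [['courir', 'marcher'], ['devorer', 'manger'],
#    ['courir', 'devorer', 'manger', 'marcher']]
```

Note how the closure behaviour delivers the two properties the abstract highlights: a verb can appear in several concepts (polysemy), and every class carries its defining syntactic/semantic intent for free.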
Combining Formal Concept Analysis and Translation to Assign Frames and Thematic Role Sets to French Verbs
We present an application of Formal Concept Analysis in the domain of Natural Language Processing: we give a general overview of the framework, describe its goals, the data it is based on, and the way it works, and we illustrate the kind of data we expect as a result. More specifically, we examine the ability of the stability, separation and probability indices to select the most relevant concepts with respect to our FCA application. We show that the sum of stability and separation gives results close to those obtained when using the entire lattice.
Lexical typology: a programmatic sketch
The present paper is an attempt to lay the foundation for Lexical Typology as a new kind of linguistic typology. The goal of Lexical Typology is to investigate crosslinguistically significant patterns of interaction between lexicon and grammar.
D6.1: Technologies and Tools for Lexical Acquisition
This report describes the technologies and tools to be used for Lexical Acquisition in PANACEA. It includes descriptions of existing technologies and tools which can be built on and improved within PANACEA, as well as of new technologies and tools to be developed and integrated into the PANACEA platform. The report also specifies the Lexical Resources to be produced. Four main areas of lexical acquisition are included: subcategorization frames (SCFs), selectional preferences (SPs), lexical-semantic classes (LCs), for both nouns and verbs, and multi-word expressions (MWEs).
Acquiring and Harnessing Verb Knowledge for Multilingual Natural Language Processing
Advances in representation learning have enabled natural language processing models to derive non-negligible linguistic information directly from text corpora in an unsupervised fashion. However, this signal is underused in downstream tasks, where models tend to fall back on superficial cues and heuristics to solve the problem at hand. Further progress relies on identifying and filling the gaps in the linguistic knowledge captured in their parameters. The objective of this thesis is to address these challenges, focusing on the issues of resource scarcity, interpretability, and lexical knowledge injection, with an emphasis on the category of verbs.
To this end, I propose a novel paradigm for efficient acquisition of lexical knowledge, leveraging native speakers' intuitions about verb meaning to support the development and downstream performance of NLP models across languages. First, I investigate the potential of acquiring semantic verb classes from non-experts through manual clustering. This subsequently informs the development of a two-phase semantic dataset creation methodology, which combines semantic clustering with fine-grained semantic similarity judgments collected through spatial arrangements of lexical stimuli. The method is tested on English and then applied to a typologically diverse sample of languages to produce the first large-scale multilingual verb dataset of this kind. I demonstrate its utility as a diagnostic tool by carrying out a comprehensive evaluation of state-of-the-art NLP models, probing representation quality across languages and domains of verb meaning, and shedding light on their deficiencies. Subsequently, I directly address these shortcomings by injecting lexical knowledge into large pretrained language models. I demonstrate that external, manually curated information about verbs' lexical properties can support data-driven models in tasks where accurate verb processing is key. Moreover, I examine the potential of extending these benefits from resource-rich to resource-poor languages through translation-based transfer. The results emphasise the usefulness of human-generated lexical knowledge in supporting NLP models and suggest that time-efficient construction of lexicons similar to those developed in this work, especially in under-resourced languages, can play an important role in boosting their linguistic capacity. Funding: ESRC Doctoral Fellowship [ES/J500033/1], ERC Consolidator Grant LEXICAL [648909].
Automated Semantic Classification of French Verbs
The aim of this work is to explore (semi-)automatic means to create a Levin-type classification of French verbs, suitable for Natural Language Processing. For English, a classification based on Levin's method is VerbNet (Kipper 2005). VerbNet is an extensive digital verb lexicon which systematically extends Levin's classes while ensuring that class members have a common semantics and share a common set of syntactic frames and thematic roles. In this work we reorganise the verbs from three French syntax lexicons, namely Volem, the Grammar-Lexicon (LADL) and Dicovalence, into VerbNet-like verb classes using the technique of Formal Concept Analysis. We automatically acquire syntactic-semantic verb class and diathesis alternation information. We create large-scale verb classes and compare their verb and frame distributions to those of VerbNet. We discuss possible evaluation schemes and finally focus on an evaluation methodology with respect to VerbNet, for which we present the theoretical motivation and analyse the feasibility on a small hand-built example.
Representation and parsing of multiword expressions
This book consists of contributions related to the definition, representation and parsing of MWEs. These reflect current trends in the representation and processing of MWEs. They cover various categories of MWEs such as verbal, adverbial and nominal MWEs, various linguistic frameworks (e.g. tree-based and unification-based grammars), various languages (including English, French, Modern Greek, Hebrew, and Norwegian), and various applications (namely MWE detection, parsing, and automatic translation), using both symbolic and statistical approaches.
Current trends
Deep parsing is the fundamental process aiming at the representation of the syntactic structure of phrases and sentences. In the traditional methodology this process is based on lexicons and grammars representing, roughly, the properties of words and the interactions of words and structures in sentences. Several linguistic frameworks, such as Head-driven Phrase Structure Grammar (HPSG), Lexical Functional Grammar (LFG), Tree Adjoining Grammar (TAG), and Combinatory Categorial Grammar (CCG), offer different structures and combining operations for building grammar rules. These already contain mechanisms for expressing properties of multiword expressions (MWEs), which, however, need improvement in how they account for the idiosyncrasies of MWEs on the one hand and their similarities to regular structures on the other. This collaborative book constitutes a survey of various attempts at representing and parsing MWEs in the context of linguistic theories and applications.