207 research outputs found
A discursive analysis of itineraries in an historical and regional corpus of travels: syntax, semantics, and pragmatics in a unified type theoretical framework
In this paper we discuss the application of (Segmented) Discourse Representation Theory and the Generative Lexicon to the analysis of a historical French corpus of itineraries in the Pyrénées. Our research focuses in particular on how type coercion (Pustejovsky, 1995) can help us give a correct analysis of cases of so-called "fictive motion" (Talmy, 1999), which is evident in phrases like: (1) The road runs along the coast for two hours. (2) The path descended abruptly. These cases are particular in that an entity (which is considered immobile and which, in the context, defines a path) is the subject of a movement verb, and the combination is interpreted as a generic statement about the nature of this path, without any movement necessarily taking place
A quantitative view of feedback lexical markers in conversational French
This paper presents a quantitative description of the lexical items used for linguistic feedback in the Corpus of Interactional Data (CID). The paper includes the raw figures for feedback lexical items as well as more detailed figures concerning interindividual variability. This effort is a first step towards a broader analysis including more discourse situations and featuring communicative function annotation
Cinquante ans de recherches au Laboratoire Parole et Langage : vers une linguistique des interfaces
The LPL, Laboratoire Parole et Langage (UMR 7309 CNRS/Aix-Marseille Université), celebrates in 2022 fifty years of association with the CNRS. In fact, the LPL can take pride in sixty years of existence as a university laboratory in 2022. The laboratory's members have come together on this occasion to tell their story and to present ideas, concepts and results through a lens of their own, the fruit of more than fifty years of questioning margins, borders and ...
Semi-automatic Endogenous Enrichment of Collaboratively Constructed Lexical Resources: Piggybacking onto Wiktionary
The lack of large-scale, freely available and durable lexical resources, and the consequences for NLP, is widely acknowledged, but the attempts to cope with the usual bottlenecks preventing their development often result in dead-ends. This article introduces a language-independent, semi-automatic and endogenous method for enriching lexical resources, based on collaborative editing and random walks through existing lexical relationships, and shows how this approach enables us to overcome recurrent impediments. It compares the impact of using different data sources and similarity measures on the task of improving synonymy networks. Finally, it defines an architecture for applying the presented method to Wiktionary and explains how it has been implemented
Un calcul de termes typés pour la pragmatique lexicale: chemins et voyageurs fictifs dans un corpus de récits de voyages
This work is part of the automated analysis of a corpus of travel stories. To this end, we refine Montague semantics to model how the meaning of words adapts to the context in which they appear. Here we study constructions like 'the path goes down for half an hour', in which the path introduces a virtual traveller following it, extending ideas that the last author developed with Bassac and Mery. The introduction of the traveller relies on type raising, so that the following requirements are satisfied: the quantifier binding the traveller has the widest scope, and properties of the path do not apply to the traveller, even a virtual one. This semantic analysis (actually its translation into λ-DRT) is already implemented for part of the Grail lexicon
Downward compatible revision of dialogue annotation
This paper discusses some aspects of revising the ISO standard for dialogue act annotation (ISO 24617-2). The revision is aimed at making annotations using the ISO scheme more accurate and at providing more powerful tools for building natural-language-based dialogue systems, without invalidating the annotated resources that have been built with the current version of the standard. In support of the revision of the standard, an analysis is provided of the downward compatibility of a revised annotation scheme with the original scheme at the levels of abstract syntax, concrete syntax, and semantics of annotations
LexFr: Adapting the LexIt Framework to Build a Corpus-Based French Subcategorization Lexicon
This paper introduces LexFr, a corpus-based French lexical resource built by adapting the framework LexIt, originally developed to describe the combinatorial potential of Italian predicates. As in the original framework, the behavior of a group of target predicates is characterized by syntactic (i.e., subcategorization frames) and semantic (i.e., selectional preferences) statistical information (a.k.a. distributional profiles) whose extraction process is mostly unsupervised. The first release of LexFr includes information for 2,493 verbs, 7,939 nouns and 2,628 adjectives. In these pages we describe the adaptation process and evaluate the final resource by comparing the information collected for 20 test verbs against the information available in a gold standard dictionary. In the best performing setting, we obtained 0.74 precision, 0.66 recall and 0.70 F-measure
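As a quick consistency check on the figures reported above, the sketch below recomputes the F-measure from the stated precision and recall. It assumes the abstract means the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say so explicitly, and the function name `f_measure` is purely illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1).

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when both scores are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures quoted in the abstract for the best performing setting.
score = f_measure(0.74, 0.66)
print(round(score, 2))  # prints 0.7
```

Under that assumption the reported 0.70 agrees with the reported precision and recall.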
Quantifying the Flexibility of Real-Time Systems
In this paper we define the flexibility of a system as its capability to schedule a new task. We present an approach to quantify the flexibility of a system. More importantly, we show that it is possible under certain conditions to identify the task that will directly induce the limitations on a possible software update. If performed at design time, such a result can be used to adjust the system design by giving more slack to the limiting task. We illustrate how these results apply to a simple system
Aix Map Task corpus: The French multimodal corpus of task-oriented dialogue
This paper introduces the Aix Map Task corpus, a corpus of audio and video recordings of task-oriented dialogues. It was modelled after the original HCRC Map Task corpus. Lexical material was designed for the analysis of speech and prosody, as described in (Astésano et al., 2007). The design of the lexical material, the protocol and some basic quantitative features of the existing corpus are presented. The corpus was collected under two communicative conditions, one audio-only condition and one face-to-face condition. The recordings took place in a studio and a sound attenuated booth respectively, with head-set microphones (and in the face-to-face condition with two video cameras). The recordings have been segmented into Inter-Pausal-Units and transcribed using transcription conventions containing actual productions and canonical forms of what was said. It is made publicly available online