
Platform for Full-Syntax Grammar Development Using Meta-grammar Constructs

Authors: Horak Ales; Kadlec Vladimir
Publication venue: 'Tsinghua University Press'
Publication date: 01/10/2006

PACLIC 20 / Wuhan, China / 1-3 November, 200

Vers la création d'un Verbnet du français

Authors: Danlos Laurence; Nakamura Takuya; Pradet Quentin
Publication venue: HAL CCSD
Publication date: 01/07/2014

VerbNet is an English lexical resource that has proven useful for NLP thanks to its broad coverage and coherent classification. No such resource exists for French, despite several (mostly automatic and unsupervised) attempts. We show how to adapt VerbNet semi-automatically using two existing lexical resources, LVF (Les Verbes Français) and LG (Lexique-Grammaire).
Keywords: VerbNet, subcategorization frames, semantic roles

INRIA a CCSD electronic archive server

Intégration de VerbNet dans un réalisateur profond

Author: Galarreta-Piquette Daniel
Publication date: 01/08/2018

The goal of natural language generation (NLG) is to produce understandable natural-language text from non-linguistic data. Generation essentially consists of two tasks: first, determining the content of the message to communicate, and then selecting the words and syntactic constructions that will convey it. This second task is called linguistic realization. To generate text that is as natural as possible, an NLG system needs access to rich lexical resources. For maximum flexibility in realization, we need access to the combinatorial properties of the lexical units of a given language. Because verbs are at the core of every utterance and usually control the structure of the sentence, their properties should be encoded in order to generate text that exploits the full richness of the language. Moreover, the syntactic behavior of verbs is highly unpredictable, which is why it has to be stored in a dictionary. This thesis is about the integration of VerbNet, a rich lexical resource describing English verbs and their syntactic behaviors, into GenDR, a deep realizer based on Meaning-Text Theory. To carry out this implementation, we used the Python programming language to extract VerbNet's data and adapt it to GenDR. We imported 274 syntactic frames and 6,393 English verbs.
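The abstract says only that Python was used to extract VerbNet's data; it does not show how. As a rough illustration, the sketch below reads one VerbNet-style class from XML and collects its member verbs and frame descriptions. The element and attribute names (`VNCLASS`, `MEMBERS/MEMBER`, `FRAMES/FRAME/DESCRIPTION`, `primary`) are assumed from the public VerbNet XML distribution. The sample class is trimmed and hypothetical, `extract_class` is an illustrative name, and none of this is taken from the thesis or from GenDR.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed VerbNet-style class used only for illustration.
# Real VerbNet files carry more detail (thematic roles, syntax, semantics).
SAMPLE = """
<VNCLASS ID="give-13.1">
  <MEMBERS>
    <MEMBER name="give"/>
    <MEMBER name="lend"/>
  </MEMBERS>
  <FRAMES>
    <FRAME>
      <DESCRIPTION primary="NP V NP PP.recipient"/>
    </FRAME>
    <FRAME>
      <DESCRIPTION primary="NP V NP-Dative NP"/>
    </FRAME>
  </FRAMES>
</VNCLASS>
"""


def extract_class(xml_text):
    """Return the class ID, member verbs, and frame descriptions of one class."""
    root = ET.fromstring(xml_text)
    # Member verbs are the direct MEMBER children of MEMBERS.
    verbs = [m.get("name") for m in root.findall("./MEMBERS/MEMBER")]
    # Each FRAME has a DESCRIPTION whose "primary" attribute names the frame.
    frames = [d.get("primary") for d in root.findall("./FRAMES/FRAME/DESCRIPTION")]
    return {"class_id": root.get("ID"), "verbs": verbs, "frames": frames}


if __name__ == "__main__":
    print(extract_class(SAMPLE))
```

A real pipeline would presumably iterate over every class file and then map each frame onto the target realizer's lexicon format. That mapping step is specific to GenDR and is not sketched here.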

Inducing Stereotypical Character Roles from Plot Structure

Author: Jahan Labiba
Publication venue: FIU Digital Commons
Publication date: 15/06/2021

If we are to understand stories, we must understand characters: characters are central to every narrative and drive the action forward. Critically, many stories (especially cultural ones) employ stereotypical character roles for different purposes, including communicating efficiently through bundles of default characteristics and associations and easing understanding of those characters' roles in the overall narrative. These roles include ideas such as hero, villain, or victim, as well as culturally specific roles such as the donor (in Russian tales) or the trickster (in Native American tales). My thesis aims to learn these roles automatically, inducing them from data using a clustering technique. The first step in learning character roles, however, is to identify which coreference chains correspond to characters, which narratologists define as animate entities that drive the plot forward. The first part of my work focused on this character identification problem, and specifically on animacy detection. Prior work treated animacy as a word-level property, and researchers developed statistical models to classify words as either animate or inanimate. I argued that this formulation of the problem is ill-posed and presented a new hybrid approach for classifying the animacy of coreference chains that achieved state-of-the-art performance. The next step of my work was to develop approaches first to identify the characters and then to learn stereotypical roles with a new unsupervised clustering approach. My character identification system consists of two stages: first, I detect animate chains among the coreference chains using my existing animacy detector; second, I apply a supervised machine learning model that identifies which of those chains qualify as characters. I proposed a narratologically grounded definition of character and built a supervised machine learning model with a small set of features that achieved state-of-the-art performance. In the last step, I implemented a clustering approach that uses plot and thematic information to cluster the archetypes. This work resulted in a completely new approach to understanding the structure of stories, greatly advancing the state of the art in story understanding.