74 research outputs found

    Vers un outil de visualisation de la dynamique textuelle : l'exemple des phénomènes citationnels et modaux

    We propose a methodological framework for analyzing and representing enunciative commitment (in particular, quotations) and modality, which are among the features that characterize textual structure. We emphasize the hierarchical structure of the textual segments that results from these commitments. We represent it first as a tree and then as a graph. The graph gives access to the enunciative dynamics of a text, as it shows the path followed through different discursive levels during the syntagmatic reading of the text. This approach is well suited to textual navigation platforms. Keywords: textual linguistics, enunciative commitment, semantic representation.

    Quelques exemples d'utilisation des S_langages pour le traitement de la temporalité en linguistique

    This article illustrates, through several examples, the value of S-languages for the treatment of temporality

    L’émotion à un niveau textuel : la fonction structurante des émotions observée à partir d’annotations

    We are interested in the analysis of emotions from the point of view of textual linguistics, i.e. in revealing structuring phenomena based on the identification of blocks of contiguous sentences linked by a semantic criterion. Here the criterion is the emotions (and their possible causes and consequences, when these are expressed) felt by entities that appear over the course of a text. We first briefly present the linguistic and psycholinguistic work on which our approach is based. We then describe our linguistic analysis methodology, which relies on an annotation scheme for identifying a set of linguistic markers, not strictly lexical, used in the denotation of an emotion. We then propose a way of representing a text that builds on this scheme and allows the visualization of blocks of semantically linked sentences. We illustrate this mode of representation with two text excerpts, one from the children's press and the other from children's literature. Finally, we discuss the contributions, limits and perspectives opened by applying our methodology to the analysis of the structuring power of emotions present in a text.

    "Confortation": About a New Category for Analyzing Biomedical Texts

    In this paper we present a new approach to the expression of certainty and uncertainty in scientific experimental articles. This approach makes it possible to assess the validity of knowledge extracted from the biological literature and used to automatically populate a domain ontology. We argue that lexical terms such as show, find, observe... express a semantic category different from the one characterized by markers such as demonstrate, validate, support... We name the latter category “confortation”, as it conveys a notion of strengthening, and we propose five other semantic categories: lack of knowledge, objects of study, hypothesis, observations, and general knowledge. This last category and the linguistic phenomenon of reported speech are examined, respectively, as consensual truth and as knowledge reported from identified scientific sources.

    Caractérisation de registres de langue par extraction de motifs séquentiels émergents

    Language registers are a highly perceptible characteristic of written or spoken communication. In this paper we present a methodology to automatically characterize language registers using a statistical tool called "emerging sequential patterns". Our approach is presented in two steps: the first demonstrates the relevance of the chosen statistical tool on artificial texts; the second shows that the characteristic patterns of language registers can be extracted from real data with this tool. Experimental results show the quality of our methodology.

    Towards the Automatic Processing of Language Registers: Semi-supervisedly Built Corpus and Classifier for French

    Language registers are a strongly perceptible characteristic of texts and speeches. However, they are still poorly studied in natural language processing. In this paper, we present a semi-supervised approach which jointly builds a corpus of texts labeled by register and an associated classifier. This approach relies on a small initial seed of expert data. After massively retrieving web pages, it iteratively alternates between training an intermediate classifier and annotating new texts to augment the labeled corpus. The approach is applied to the casual, neutral, and formal registers, leading to a 750M-word corpus and a final neural classifier with acceptable performance.

    Covering various Needs in Temporal Annotation: a Proposal of Extension of ISO TimeML that Preserves Upward Compatibility

    This paper reports a critical analysis of the ISO TimeML standard, in the light of several experiments in temporal annotation conducted on spoken French. It shows that the standard suffers from weaknesses that should be corrected to fit a wider variety of needs in NLP and in corpus linguistics. We propose a number of improvements to the standard ahead of its revision by the ISO Committee in 2017. These modifications mainly concern (1) enrichments of well-identified features of the standard: the temporal function of TIMEX time expressions and additional types for TLINK temporal relations; (2) deeper modifications to the units or features annotated: a clearer distinction between time and tense for EVENT units, and a coherent representation of temporal signals (the SIGNAL unit) and TIMEX modifiers (the MOD feature); (3) a recommendation to perform temporal annotation on top of a syntactic (rather than lexical) layer, i.e. temporal annotation on a treebank.

    Linguistique et recherche d’information: la problématique du temps
