755 research outputs found

    Taking Genre into account when Analyzing Conceptual Relation Patterns

    This paper uses a corpus study to investigate the influence of text genre on the frequency and semantic interpretation of certain pattern/concept relations. In linking pattern/concept relations to text genre, the study identifies three types of dependency: weak dependency, where the relation appears in almost any kind of text; complete dependency, where it is strongly linked to a particular text or group of related texts; and dependency in terms of text genre. The particular examples that form the basis of the study are meronymic chez, which is found to have a significant dependency on didactic texts in the natural sciences; comme as a marker of hypernymy and co-hyponymy, which has a weaker but observable dependency on technical and didactic genres; nominal anaphora involving hypernyms, where no consistent conclusions can be reached; and meronymic avec, where the significant factor is shown to be communicative objective rather than domain (subject matter). The relevance of such studies to Natural Language Processing is discussed, and some pointers to further research are indicated.

    Linguistic Markers of Lexical and Textual Relations in Technical Documents

    This chapter proposes a number of linguistic "handles" for the description of technical documents, at a lexical level (terminology) and at a textual level (discourse coherence). Examples are given of uses of such insights in document production and management, in particular via document engineering systems. We provide a number of linguistic "handles" for the description of technical documents. Such insights into the "inner workings" of texts may be harnessed in various ways in the production and management of technical documents; we show some applications in document engineering, in systems designed to facilitate access to information. Our focus is on surface markers, i.e. observable text features identified through corpus analysis, which signal the kind of relations between lexical items used in building terminologies (such as generic/specific, see section 1), or relations between text segments involved in discourse coherence (such as theme, or rhetorical relations, see section 2). We stress the relevance of the notion of genre when working with technical documents, and the genre-dependent nature of our linguistic markers.

    From Text to Knowledge with Graphs: modelling, querying and exploiting textual content

    This paper highlights the challenges, current trends, and open issues related to the representation, querying and analytics of content extracted from texts. The internet contains vast text-based information on various subjects, including commercial documents, medical records, scientific experiments, engineering tests, and events that impact urban and natural environments. Extracting knowledge from this text involves understanding the nuances of natural language and accurately representing the content without losing information. This allows knowledge to be accessed, inferred, or discovered. Achieving this requires combining results from various fields, such as linguistics, natural language processing, knowledge representation, data storage, querying, and analytics. The vision in this paper is that graphs can be a well-suited representation of text content, once they are annotated and the right querying and analytics techniques are applied. This paper discusses this hypothesis from the perspectives of linguistics, natural language processing, graph models and databases, and artificial intelligence, as provided by the panellists of the DOING session at the MADICS Symposium 2022.

    Learning to distinguish hypernyms and co-hyponyms

    This work is concerned with distinguishing the different semantic relations that exist between distributionally similar words. We compare a novel approach, based on training a linear Support Vector Machine on pairs of feature vectors, with state-of-the-art methods based on distributional similarity. We show that the new supervised approach does better even when there is minimal information about the target words in the training data, giving a 15% reduction in error rate over unsupervised approaches.

    What is language for sociolinguists? The variationist, ethnographic, and conversation-analytic ontologies of language

    The present investigation explores the definitions of language (i.e. the language ontologies) that have emerged in the field of sociolinguistics. In general, it examines three types of sociolinguistic studies: Labovian sociolinguistics (Labov 1972), the Ethnography of Communication (Gumperz/Hymes 1964) and Conversation Analysis (Sacks 1992). Firstly, it offers an account of the ontology of language developed by Chomskyan linguistics (1986), which is used as a starting point against which to contrast the language ontologies of the three sociolinguistic approaches. Then, the paper presents Labov's ontology of language (Labov 1977) and the criticism that it has faced, and examines proposals that aim to integrate social facts and linguistic structure. With regard to the Ethnography of Communication, accounts of its ontology of language (Hymes 1974, 1986) and its ontology of culture (Sapir 1921; Hymes 1972) are presented, and a possible explanation of the relationship between language and culture is offered. With respect to Conversation Analysis, its ontology of language is presented (Ochs et al. 1996) along with its analytic insight, and an account of grammar as an interactional resource is given. The final section proposes that, for these three types of sociolinguistics, "language" is a social, functional and behavioural entity which is socially and behaviourally structured. "Language" transmits social meanings, reflects the social order and expresses the identity of its speakers.

    Interactive Knowledge Construction in the Collaborative Building of an Encyclopedia

    One of the major challenges of Applied Artificial Intelligence is to provide environments where high-level human activities, such as learning, constructing theories or performing experiments, are enhanced by Artificial Intelligence technologies. This paper starts with the description of an ambitious project: EnCOrE2. The specific real-world EnCOrE scenario, which represents a much wider class of potential application contexts, is dedicated to the building of an Encyclopedia of Organic Chemistry in the context of Virtual Communities of experts and students. Its description is followed by a brief survey of some major AI questions and propositions related to the problems raised by the EnCOrE project. The third part of the paper starts with definitions of a set of "primitives" for rational actions, and then integrates them into a unified conceptual framework for the interactive construction of knowledge. Finally, we sketch out protocols aimed at guiding both the collaborative construction process and the collaborative learning process in the EnCOrE project. The current major result is the emerging conceptual model supporting interaction between human agents and AI tools integrated in Grid services within a socio-constructivist approach. The model consists of cycles of deductions, inductions and abductions upon facts (the shared reality) and concepts (their subjective interpretation), which are submitted to negotiations and finally converge to a socially validated consensus.