2,984 research outputs found

    Conceptual graph-based knowledge representation for supporting reasoning in African traditional medicine

    Although African patients use conventional (modern) and traditional healthcare simultaneously, it has been proven that 80% of people rely on African traditional medicine (ATM). ATM includes medical activities stemming from practices, customs and traditions that were integral to the distinctive African cultures. It is based mainly on the oral transfer of knowledge, which carries the risk of losing critical knowledge. Moreover, practices differ according to the region and the availability of medicinal plants. Therefore, it is necessary to compile tacit, disseminated and complex knowledge from various Tradi-Practitioners (TP) in order to determine interesting patterns for treating a given disease. Knowledge engineering methods for traditional medicine are useful to model complex information needs suitably, formalize the knowledge of domain experts and highlight effective practices for their integration into conventional medicine. The work described in this paper presents an approach that addresses two issues. First, it aims at proposing a formal representation model of ATM knowledge and practices to facilitate their sharing and reuse. Second, it aims at providing a visual reasoning mechanism for selecting the best available procedures and medicinal plants to treat diseases. The approach is based on the use of the Delphi method for capturing knowledge from various experts, which requires reaching a consensus. Conceptual graph formalism is used to model ATM knowledge with visual reasoning capabilities and processes. Nested conceptual graphs are used to visually express the semantic meaning of Computation Tree Logic (CTL) constructs that are useful for the formal specification of temporal properties of ATM domain knowledge. Our approach has the advantage of mitigating knowledge loss, with conceptual development assistance to improve not only the quality of ATM care (medical diagnosis and therapeutics) but also patient safety (drug monitoring).

    Abstract syntax as interlingua: Scaling up the grammatical framework from controlled languages to robust pipelines

    Abstract syntax is an interlingual representation used in compilers. Grammatical Framework (GF) applies the abstract syntax idea to natural languages. The development of GF started in 1998, first as a tool for controlled language implementations, where it has gained an established position in both academic and commercial projects. GF provides grammar resources for over 40 languages, enabling accurate generation and translation, as well as grammar engineering tools and components for mobile and Web applications. On the research side, the focus in the last ten years has been on scaling up GF to wide-coverage language processing. The concept of abstract syntax offers a unified view on many other approaches: Universal Dependencies, WordNets, FrameNets, Construction Grammars, and Abstract Meaning Representations. This makes it possible for GF to utilize data from the other approaches and to build robust pipelines. In return, GF can contribute to data-driven approaches with methods to transfer resources from one language to others, to augment data by rule-based generation, to check the consistency of hand-annotated corpora, and to pipe analyses into high-precision semantic back ends. This article gives an overview of the use of abstract syntax as interlingua through both established and emerging NLP applications involving GF.

    Handbook of Lexical Functional Grammar

    Lexical Functional Grammar (LFG) is a nontransformational theory of linguistic structure, first developed in the 1970s by Joan Bresnan and Ronald M. Kaplan, which assumes that language is best described and modeled by parallel structures representing different facets of linguistic organization and information, related by means of functional correspondences. This volume has seven parts. Part I, Overview and Introduction, provides an introduction to core syntactic concepts and representations. Part II, Grammatical Phenomena, reviews LFG work on a range of grammatical phenomena or constructions. Part III, Grammatical modules and interfaces, provides an overview of LFG work on semantics, argument structure, prosody, information structure, and morphology. Part IV, Linguistic disciplines, reviews LFG work in the disciplines of historical linguistics, learnability, psycholinguistics, and second language learning. Part V, Formal and computational issues and applications, provides an overview of computational and formal properties of the theory, implementations, and computational work on parsing, translation, grammar induction, and treebanks. Part VI, Language families and regions, reviews LFG work on languages spoken in particular geographical areas or in particular language families. The final part, Comparing LFG with other linguistic theories, discusses LFG work in relation to other theoretical approaches.

    Learning Sentence-internal Temporal Relations

    In this paper we propose a data-intensive approach for inferring sentence-internal temporal relations. Temporal inference is relevant for practical NLP applications which either extract or synthesize temporal information (e.g., summarisation, question answering). Our method bypasses the need for manual coding by exploiting the presence of markers like "after", which overtly signal a temporal relation. We first show that models trained on main and subordinate clauses connected with a temporal marker achieve good performance on a pseudo-disambiguation task simulating temporal inference (during testing the temporal marker is treated as unseen and the models must select the right marker from a set of possible candidates). Secondly, we assess whether the proposed approach holds promise for the semi-automatic creation of temporal annotations. Specifically, we use a model trained on noisy and approximate data (i.e., main and subordinate clauses) to predict intra-sentential relations present in TimeBank, a corpus annotated with rich temporal information. Our experiments compare and contrast several probabilistic models differing in their feature space, linguistic assumptions and data requirements. We evaluate performance against gold standard corpora and also against human subjects.
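
    The pseudo-disambiguation task in the abstract above can be illustrated with a small sketch. It is not the authors' model: it assumes a naive Bayes classifier over bag-of-words features with add-one smoothing, and the training clauses, the three candidate markers and the function names are all invented for illustration. The marker is hidden at test time and the model picks the most probable candidate.

```python
import math
from collections import Counter, defaultdict

# Toy data invented for illustration: (main clause, marker, subordinate clause).
TRAIN = [
    ("she left the office", "after", "the meeting ended"),
    ("he called his lawyer", "after", "the verdict was announced"),
    ("they locked the door", "before", "the storm arrived"),
    ("she checked the map", "before", "the trip started"),
    ("he read a book", "while", "the train was moving"),
    ("they talked quietly", "while", "the baby was sleeping"),
]


def features(main, sub):
    """Bag-of-words features, tagged by the clause they come from."""
    return [f"M:{w}" for w in main.split()] + [f"S:{w}" for w in sub.split()]


def train(examples):
    """Collect marker counts, per-marker feature counts and the feature vocabulary."""
    marker_counts = Counter()
    feat_counts = defaultdict(Counter)
    vocab = set()
    for main, marker, sub in examples:
        marker_counts[marker] += 1
        for f in features(main, sub):
            feat_counts[marker][f] += 1
            vocab.add(f)
    return marker_counts, feat_counts, vocab


def predict(model, main, sub):
    """Return the candidate marker with the highest naive Bayes log-probability."""
    marker_counts, feat_counts, vocab = model
    total = sum(marker_counts.values())
    best, best_score = None, -math.inf
    for marker in sorted(marker_counts):
        score = math.log(marker_counts[marker] / total)
        # Add-one smoothing so unseen features do not zero out a candidate.
        denom = sum(feat_counts[marker].values()) + len(vocab)
        for f in features(main, sub):
            score += math.log((feat_counts[marker][f] + 1) / denom)
        if score > best_score:
            best, best_score = marker, score
    return best


model = train(TRAIN)
# The marker is treated as unseen; the model must choose among the candidates.
guess = predict(model, "he left the office", "the verdict was announced")
```

    On this toy data `guess` is "after", because the held-out clause pair shares most of its words with the "after" training examples. A realistic setup would train on a large parsed corpus and use richer features than word identity.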

    Processing temporal information in unstructured documents

    Doctoral thesis, Informatics (Computer Science), Universidade de Lisboa, Faculdade de Ciências, 2013. Temporal information processing has received substantial attention in the last few years, due to the appearance of evaluation challenges focused on the extraction of temporal information from texts written in natural language. This research area belongs to the broader field of information extraction, which aims to automatically find specific pieces of information in texts, producing structured representations of that information, which can then be easily used by other computer applications. It has the potential to be useful in several applications that deal with natural language, given that many languages, among which we find Portuguese, extensively refer to time. Despite that, temporal processing is still incipient for many languages, Portuguese being one of them. The present dissertation has various goals. On the one hand, it addresses this gap by developing and making available resources that support the development of tools for this task in Portuguese, and also by developing precisely this kind of tool. On the other hand, its purpose is also to report on important results of the research in this area of temporal processing. This work shows how temporal processing requires and benefits from modeling different kinds of knowledge: grammatical knowledge, logical knowledge, knowledge about the world, etc. Additionally, both machine learning methods and rule-based approaches are explored and used in the development of hybrid systems that are capable of taking advantage of the strengths of each of these two types of approach. Funding: Fundação para a Ciência e a Tecnologia (FCT, SFRH/BD/40140/2007).

    Frame semantics for the field of climate change: discovering frames based on Chinese and English terms

    Most of the existing Mandarin Chinese specialised dictionaries of environmental terms are paper dictionaries, compiled and revised more than ten years ago, and contain mainly noun terms. Terminological information is restricted to knowledge conveyed by the term and its English equivalent(s). For readers who want to learn about the semantic or syntactic properties of terms, and for readers who want to see how terms are used in real contexts of specialised texts, the information provided in existing dictionaries is insufficient. In this research, we compiled an online Mandarin Chinese terminological resource describing Chinese verb terms in the field of climate change. This resource makes up for some of the deficiencies of existing Chinese environmental dictionaries by revealing the meaning(s) of each term through its actantial structure(s) and by showing, through annotated contexts, the semantic and syntactic properties of the term as well as its practical usage in specialised texts. This resource better meets the needs of the audience.
The theoretical basis underpinning this research is Frame Semantics (Fillmore, 1976, 1977, 1982, 1985; Fillmore & Atkins, 1992), and the FrameNet built from it. The main objective of this research is to discover and define Chinese semantic frames in the field of climate change, and to establish relations between the Chinese frames defined. The Chinese semantic frames are discovered with the help of the methodology of the multilingual environmental dictionary DiCoEnviro (and its accompanying resource Framed DiCoEnviro) (L’Homme, 2018; L’Homme et al., 2020). In order to make this methodology applicable to a Sino-Tibetan language, Chinese, we modified and adapted it to suit the description of Chinese terms and the definition of Chinese semantic frames. Some of the changes and adaptations are based on the Chinese FrameNet (CFN) (Liu & You, 2015). In order to discover Chinese semantic frames, a monolingual Mandarin (Chinese) Climate Change Corpus (MCCC) was first compiled. This corpus contains 224 authentic Chinese specialised texts in the field of climate change, totaling 1,228,333 Chinese characters, or 547,592 Chinese words. Following this, candidate terms were automatically extracted from MCCC using the corpus management and analysis software Sketch Engine. After manual analysis and validation, we determined which of the candidate terms are true terms. Subsequently, the actantial structure of each term was written by analysing the contexts where the term occurs. Next, each sense of a polysemous term was placed in a separate entry and 16-20 contexts were selected for each entry. Then, each context was annotated in terms of three layers: semantic structure, syntactic function and syntactic group. After this, the terms were classified according to the scenarios they evoke.
Terms that depict the same scene or situation in the field of climate change, have a similar actantial structure, and share the majority of circumstants are categorised into one semantic frame (criteria based on the DiCoEnviro project (L’Homme, 2018; L’Homme et al., 2020)). After the Chinese semantic frames were identified, each frame was defined. Finally, the discovered Chinese frames were linked according to the eight types of frame relations proposed by Ruppenhofer et al. (2016). To be displayed online, term entries and semantic frames were encoded in XML files. Guided by this research methodology, we eventually discovered and defined 23 Chinese semantic frames. The end result of this research is a frame-based Mandarin Chinese terminological resource specialised in the field of climate change. This terminological resource consists of two parts. The first part is the description of a total of 39 Chinese verb terms. With each meaning of a polysemous verb term placed in a separate entry, there are a total of 59 entries (each entry contains the actantial structure and annotated contexts). A total of 1,027 contexts were annotated. The second part of this resource presents the 23 Chinese semantic frames identified as well as the relations between frames.