62 research outputs found

    CML: Evolution and design.

    A retrospective view of the design and evolution of Chemical Markup Language (CML) is presented by its original authors. Rights: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license, which is similar to the 'Creative Commons Attribution Licence'. In brief, you may copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

    Report on the 2015 NSF Workshop on Unified Annotation Tooling

    On March 30 & 31, 2015, an international group of twenty-three researchers with expertise in linguistic annotation convened in Sunny Isles Beach, Florida to discuss problems with and potential solutions for the state of linguistic annotation tooling. The participants comprised 14 researchers from the U.S. and 9 from outside the U.S., with 7 countries and 4 continents represented, and hailed from fields and specialties including computational linguistics, artificial intelligence, speech processing, multi-modal data processing, clinical & medical natural language processing, linguistics, documentary linguistics, sign-language linguistics, corpus linguistics, and the digital humanities. The motivating problem of the workshop was the balkanization of annotation tooling, namely, that even though linguistic annotation requires sophisticated tool support to efficiently generate high-quality data, the landscape of tools for the field is fractured, incompatible, inconsistent, and lacks key capabilities. The overall goal of the workshop was to chart the way forward, centering on five key questions: (1) What are the problems with the current tool landscape? (2) What are the possible benefits of solving some or all of these problems? (3) What capabilities are most needed? (4) How should we go about implementing these capabilities? and (5) How should we ensure longevity and sustainability of the solution? I surveyed the participants before their arrival, which provided significant raw material for ideas, and the workshop discussion itself resulted in the identification of ten specific classes of problems and five sets of most-needed capabilities. Importantly, we identified annotation project managers in computational linguistics as the key recipients and users of any solution, thereby succinctly addressing questions about the scope and audience of potential solutions. We discussed management and sustainability of potential solutions at length. The participants agreed on sixteen recommendations for future work. This technical report contains a detailed discussion of all these topics, a point-by-point review of the discussion in the workshop as it unfolded, detailed information on the participants and their expertise, and the summarized data from the surveys.

    Extractive Summarization : Experimental work on nursing notes in Finnish

    Natural Language Processing (NLP) is a subfield of artificial intelligence and linguistics concerned with how computers interact with human language. With increasing computational power and advances in technology, researchers have proposed various NLP tasks that have already been implemented as real-world applications. Automated text summarization is one of the many tasks that has not yet fully matured, particularly in the health sector. Success in this task would enable healthcare professionals to grasp a patient's history in minimal time, resulting in the faster decisions required for better care. Automatic text summarization is a process that shortens a large text without sacrificing important information. This can be achieved by paraphrasing the content, known as the abstractive method, or by concatenating relevant extracted sentences, known as the extractive method. In general, this process requires converting the text into numerical form and then applying a method to identify and extract relevant text. This thesis explores NLP techniques used in extractive text summarization, particularly in the health domain. The work includes a comparison of basic summarization models implemented on a corpus of patient notes written by nurses in Finnish. The concepts and research studies required to understand the implementation are documented along with a description of the code. A Python-based project is structured to build a corpus and execute multiple summarization models. For this thesis, we observe the performance of two textual embeddings: Term Frequency - Inverse Document Frequency (TF-IDF), which is based on a simple statistical measure, and Word2Vec, which is based on neural networks. For both models, LexRank, an unsupervised stochastic graph-based sentence scoring algorithm, is used for sentence extraction, and random selection is used as a baseline for evaluation. To evaluate and compare the performance of the models, summaries of 15 patient care episodes from each model were given to two human evaluators for manual evaluation. According to the results on this small sample dataset, both evaluators seem to agree in preferring summaries produced by Word2Vec LexRank over those generated by TF-IDF LexRank. Both evaluators also observed that both models perform better than the random-selection baseline.

    Universal Functional Requisites of Society: The Unending Quest


    A history and theory of textual event detection and recognition

