3,202 research outputs found

Automated Exercise Generation in Mobile Language Learning

Author: Verweij Rayo
Publication venue: Bard Digital Commons
Publication date: 01/01/2020

The Language Lion is an Android application that teaches basic Dutch to English speakers. While mobile language learning has increased exponentially in popularity, course creation is still labor-intensive. By contrast, the Language Lion uses a map of Dutch-to-English lexemes, a context-free grammar, and a modified version of the SimpleNLG sentence realizer to automatically generate semi-random translation exercises for the student. Each component is evaluated individually to find and analyze the particular roadblocks in automated exercise generation for mobile language learning.
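The combination of a lexeme map and a context-free grammar described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy lexicon, the grammar, and the function names are hypothetical and are not taken from the Language Lion, and the word-for-word translation stands in for the SimpleNLG-based realizer, which a real system would need for agreement and word order.

```python
import random

# Hypothetical toy lexicon: English lexeme -> Dutch lexeme (not from the app).
LEXICON = {
    "the": "de",
    "cat": "kat",
    "dog": "hond",
    "sees": "ziet",
    "hears": "hoort",
}

# Hypothetical toy context-free grammar whose terminals are English lexemes.
GRAMMAR = {
    "S": [["NP", "V", "NP"]],
    "NP": [["Det", "N"]],
    "Det": [["the"]],
    "N": [["cat"], ["dog"]],
    "V": [["sees"], ["hears"]],
}


def expand(symbol, rng):
    """Recursively expand a grammar symbol into a list of terminal lexemes."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal: an English lexeme
    production = rng.choice(GRAMMAR[symbol])
    words = []
    for child in production:
        words.extend(expand(child, rng))
    return words


def make_exercise(rng):
    """Generate one semi-random English prompt with its Dutch answer."""
    english = expand("S", rng)
    # Word-for-word lookup only works for this toy grammar.
    dutch = [LEXICON[word] for word in english]
    return {"prompt": " ".join(english), "answer": " ".join(dutch)}


exercise = make_exercise(random.Random(0))
print(exercise)
```

Passing a seeded `random.Random` keeps the generated exercise reproducible, which is convenient when evaluating each component separately.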

Bard College

Developing a manually annotated clinical document corpus to identify phenotypic information for inflammatory bowel disease

Author: A Roberts
Adi V Gundlapalli
AV Gundlapalli
Brett R South
CR Weir
EM Fielstein
G Hripcsak
HJ Tange
Jennifer Garvin
JF Penz
JJ Cimino
MA Musen
Makoto Jones
Matthew H Samore
PV Ogren
RH Dolin
S Brown
SH Brown
Shuying Shen
SM Meystre
V Kashyap
W Chapman
Wendy W Chapman
WW Chapman
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract
Background: Natural Language Processing (NLP) systems can be used for specific Information Extraction (IE) tasks such as extracting phenotypic data from the electronic medical record (EMR). These data are useful for translational research and are often found only in free text clinical notes. A key required step for IE is the manual annotation of clinical corpora and the creation of a reference standard for (1) training and validation tasks and (2) focusing and clarifying NLP system requirements. These tasks are time-consuming, expensive, and require considerable effort on the part of human reviewers.
Methods: Using a set of clinical documents from the VA EMR for a particular use case of interest, we identify specific challenges and present several opportunities for annotation tasks. We demonstrate specific methods using an open source annotation tool, a customized annotation schema, and a corpus of clinical documents for patients known to have a diagnosis of Inflammatory Bowel Disease (IBD). We report clinician annotator agreement at the document, concept, and concept attribute level. We estimate concept yield in terms of annotated concepts within specific note sections and document types.
Results: Annotator agreement at the document level for documents that contained concepts of interest for IBD using estimated Kappa statistic (95% CI) was very high at 0.87 (0.82, 0.93). At the concept level, F-measure ranged from 0.61 to 0.83. However, agreement varied greatly at the specific concept attribute level. For this particular use case (IBD), clinical documents producing the highest concept yield per document included GI clinic notes and primary care notes. Within the various types of notes, the highest concept yield was in sections representing patient assessment and history of presenting illness. Ancillary service documents and family history and plan note sections produced the lowest concept yield.
Conclusion: Challenges include defining and building appropriate annotation schemas, adequately training clinician annotators, and determining the appropriate level of information to be annotated. Opportunities include narrowing the focus of information extraction to use case specific note types and sections, especially in cases where NLP systems will be used to extract information from large repositories of electronic clinical note documents.
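The two agreement measures reported in this abstract can be illustrated with a short sketch. The abstract does not say which Kappa variant or span-matching rule was used, so the code assumes Cohen's kappa for two annotators at the document level and exact-match spans for the concept-level F-measure. All labels and spans are made-up example data, not results from the study.

```python
def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same documents."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    # Chance agreement from each annotator's marginal label frequencies.
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1.0 - expected)


def f_measure(spans_a, spans_b):
    """F-measure between two annotators, treating annotator A as reference.

    Spans are (start, end, concept) tuples and must match exactly to count.
    """
    matched = len(set(spans_a) & set(spans_b))
    if matched == 0:
        return 0.0
    precision = matched / len(set(spans_b))
    recall = matched / len(set(spans_a))
    return 2 * precision * recall / (precision + recall)


# Made-up document-level labels: 1 = document contains a concept of interest.
annotator_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
annotator_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohen_kappa(annotator_a, annotator_b)

# Made-up concept-level spans for one document.
spans_a = [(0, 5, "symptom"), (10, 18, "diagnosis"), (20, 25, "medication")]
spans_b = [(0, 5, "symptom"), (10, 18, "diagnosis"), (30, 35, "symptom")]
f1 = f_measure(spans_a, spans_b)

print(round(kappa, 2), round(f1, 2))
```

F-measure is commonly preferred over kappa at the concept level because the number of spans that neither annotator marked is not well defined, so chance agreement cannot be estimated.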

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

University of Melbourne Institutional Repository

Generating varied narrative probability exercises

Author: Akker Rieks op den
Boer Rookhuiszen Roan
Geerlings Hanneke
Theune Mariët
Publication venue: Association for Computational Linguistics
Publication date: 01/01/2011

This paper presents Genpex, a system for automatic generation of narrative probability exercises. Generation of exercises in Genpex is done in two steps. First, the system creates a specification of a solvable probability problem, based on input from the user (a researcher or test developer) who selects a specific question type and a narrative context for the problem. Then, a text expressing the probability problem is generated. The user can tune the generated text by setting the values of some linguistic variation parameters. By varying the mathematical content of the exercise, its narrative context and the linguistic parameter settings, many different exercises can be produced. Here we focus on the natural language generation part of Genpex. After describing how the system works, we briefly present our first evaluation results, and discuss some aspects requiring further investigation.
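The two-step process in this abstract (first a solvable problem specification, then a text that expresses it) can be sketched as follows. This is only an illustration under stated assumptions: the single question type, the template, the `formal` variation parameter, and all names are hypothetical and do not reflect how Genpex is actually implemented.

```python
import random
from fractions import Fraction


def make_specification(rng):
    """Step 1: choose the mathematical content and compute the solution."""
    total = rng.randint(5, 12)
    favourable = rng.randint(1, total - 1)
    return {
        "total": total,
        "favourable": favourable,
        "answer": Fraction(favourable, total),
    }


def render(spec, context, formal=True):
    """Step 2: express the specification in a chosen narrative context.

    `formal` is a hypothetical linguistic variation parameter.
    """
    opener = "Consider" if formal else "Imagine"
    return (
        f"{opener} a {context['container']} with {spec['total']} "
        f"{context['items']}, of which {spec['favourable']} are "
        f"{context['property']}. What is the probability that a randomly "
        f"chosen {context['item']} is {context['property']}?"
    )


# Hypothetical narrative context selected by the user.
bakery = {
    "container": "tray",
    "items": "muffins",
    "item": "muffin",
    "property": "blueberry",
}

spec = make_specification(random.Random(42))
text = render(spec, bakery, formal=False)
print(text)
print("Answer:", spec["answer"])
```

Keeping the specification separate from the rendering means the same mathematical problem can be re-expressed in another context or style without recomputing the answer.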

University of Twente Research Information

An Overview of Current Research on Automated Essay Grading

Author: Alessandro Cucchiarelli
Francesca Neri
Salvatore Valenti
Publication venue: Informing Science Institute
Publication date: 01/01/2003

Directory of Open Access Journals

IRIS Università Politecnica delle Marche

Spoken dialogue systems: architectures and applications

Author: Olaso Fernández Javier Mikel
Publication date: 13/11/2017

171 p. Technology and technological devices have become habitual and omnipresent. Humans need to learn to communicate with all kinds of devices. Until recently, humans needed to learn how the devices express themselves in order to communicate with them. More recently, the trend has been to make communication with these devices more intuitive. The ideal way to communicate with devices would be the natural way of communication between humans: speech. Humans have long been investigating and designing systems that use this type of communication, giving rise to the so-called Spoken Dialogue Systems. In this context, the primary goal of the thesis is to show how these systems can be implemented. Additionally, the thesis serves as a review of the state of the art regarding architectures and toolkits. Finally, the thesis is intended to serve future system developers as a guide for their construction. For that

Archivo Digital para la Docencia y la Investigación

Implementing a Portable Clinical NLP System with a Common Data Model - a Lisp Perspective

Author: Luo Yuan
Szolovits Peter
Publication date: 14/11/2018

This paper presents a Lisp architecture for a portable NLP system, termed LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard, customized and in-house developed NLP tools. Our system facilitates portability across different institutions and data systems by incorporating an enriched Common Data Model (CDM) to standardize necessary data elements. It utilizes UMLS to perform domain adaptation when integrating generic domain NLP tools. It also features stand-off annotations that are specified by positional reference to the original document. We built an interval-tree-based search engine to efficiently query and retrieve the stand-off annotations by specifying positional requirements. We also developed a utility to convert an inline annotation format to stand-off annotations to enable the reuse of clinical text datasets with inline annotations. We experimented with our system on several NLP-facilitated tasks including computational phenotyping for lymphoma patients and semantic relation extraction for clinical notes. These experiments showcased the broader applicability and utility of LAPNLP.
Comment: 6 pages, accepted by IEEE BIBM 2018 as regular paper
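The inline-to-stand-off conversion and positional querying mentioned in this abstract can be sketched briefly. The code below is written in Python, not Lisp, and is not LAPNLP's implementation: the `<label>...</label>` inline format, the function names, and the example sentence are assumptions, and the query is a plain linear scan standing in for the interval tree the authors describe.

```python
import re

# Assumed inline format: <label>covered text</label>, no attributes.
TAG = re.compile(r"<(/?)(\w+)>")


def inline_to_standoff(marked_up):
    """Strip inline tags; return plain text and (start, end, label) spans.

    Offsets refer to positions in the returned plain text.
    """
    plain_parts = []
    annotations = []
    open_tags = []  # stack of (label, start offset in plain text)
    cursor = 0      # read position in the marked-up input
    offset = 0      # write position in the plain-text output
    for match in TAG.finditer(marked_up):
        chunk = marked_up[cursor:match.start()]
        plain_parts.append(chunk)
        offset += len(chunk)
        closing, label = match.group(1), match.group(2)
        if closing:
            if not open_tags or open_tags[-1][0] != label:
                raise ValueError(f"unexpected closing tag: {label}")
            _, start = open_tags.pop()
            annotations.append((start, offset, label))
        else:
            open_tags.append((label, offset))
        cursor = match.end()
    plain_parts.append(marked_up[cursor:])
    if open_tags:
        raise ValueError(f"unclosed tag: {open_tags[-1][0]}")
    return "".join(plain_parts), annotations


def overlapping(annotations, start, end):
    """Return annotations overlapping [start, end); a linear scan, not a tree."""
    return [a for a in annotations if a[0] < end and start < a[1]]


note = (
    "Patient has <problem>follicular lymphoma</problem> "
    "treated with <treatment>rituximab</treatment>."
)
text, spans = inline_to_standoff(note)
for start, end, label in spans:
    print(label, repr(text[start:end]))
```

Because stand-off spans only reference character positions, the original document stays unmodified and several tools can annotate the same text independently. An interval tree would answer the overlap query in logarithmic time plus output size, which matters for large note repositories.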

arXiv.org e-Print Archive

DSpace@MIT

Crossref