Design considerations for a hierarchical semantic compositional framework for medical natural language understanding
Medical natural language processing (NLP) systems are a key enabling technology for transforming Big Data from clinical report repositories to information used to support disease models and validate intervention methods. However, current medical NLP systems fall considerably short when faced with the task of logically interpreting clinical text. In this paper, we describe a framework inspired by mechanisms of human cognition in an attempt to jump the NLP performance curve. The design centers on a hierarchical semantic compositional model (HSCM), which provides an internal substrate for guiding the interpretation process. The paper describes insights from four key cognitive aspects: semantic memory, semantic composition, semantic activation, and hierarchical predictive coding. We discuss the design of a generative semantic model and an associated semantic parser used to transform a free-text sentence into a logical representation of its meaning. The paper discusses supportive and antagonistic arguments for the key features of the architecture as a long-term foundational framework.
A Metadata Extraction Approach for Clinical Case Reports to Enable Advanced Understanding of Biomedical Concepts
Clinical case reports (CCRs) are a valuable means of sharing observations and insights in medicine. The form of these documents varies, and their content includes descriptions of numerous novel disease presentations and treatments. Thus far, the text data within CCRs is largely unstructured, requiring significant human and computational effort to render these data useful for in-depth analysis. In this protocol, we describe methods for identifying metadata corresponding to specific biomedical concepts frequently observed within CCRs. We provide a metadata template as a guide for document annotation, recognizing that imposing structure on CCRs may be pursued by combinations of manual and automated effort. The approach presented here is appropriate for organization of concept-related text from a large literature corpus (e.g., thousands of CCRs) but may be easily adapted to facilitate more focused tasks or small sets of reports. The resulting structured text data includes sufficient semantic context to support a variety of subsequent text analysis workflows: meta-analyses to determine how to maximize CCR detail, epidemiological studies of rare diseases, and the development of models of medical language may all be made more realizable and manageable through the use of structured text data.
MACCRs.tsv
This file contains 3,100 sets of metadata extracted from clinical case reports. Each metadata record includes information identifying the source report, text corresponding to high-level medical concepts, and funding details.
Metadata Extraction Guide
This file provides a guide to the process used to assemble the Metadata Acquired from Clinical Case Reports (MACCR) data set.
MACCR_RMD_ICD10_Categories
This file contains a set of scores indicating the presence of ICD-10-CM codes, grouped into categories, as determined by a panel of domain experts reading clinical case reports describing presentations of rare mitochondrial diseases. These reports are a subset of those used in assembling the MACCR set. Each row represents a single report, and each column contains a value of 0 (no material corresponding to any code in the category named in the header was observed) or 1 (material corresponding to at least one code in that category was described in the report text). Reports are identified by their PubMed IDs.
MACCR_RMD_ICD10
This file contains a set of scores indicating the presence of ICD-10-CM codes, as determined by a panel of domain experts reading clinical case reports describing presentations of rare mitochondrial diseases. These reports are a subset of those used in assembling the MACCR set. Each row represents a single report, and each column contains a value of 0 (no material corresponding to the code in the header was observed) or 1 (material corresponding to the code was described in the report text). Reports are identified by their PubMed IDs.
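The relationship between the code-level and category-level score files can be illustrated with a minimal sketch: a category scores 1 for a report when at least one of its member codes scores 1. Everything specific here is an assumption made for illustration, not taken from the files themselves: the tab-separated layout, the `pmid` column name, the PubMed IDs, the choice of codes, and the code-to-category grouping in `CATEGORY_OF`.

```python
import csv
import io

# Hypothetical code-level table. The layout, the "pmid" column name, and
# the PubMed IDs are illustrative and are not taken from MACCR_RMD_ICD10.
CODE_LEVEL_TSV = (
    "pmid\tE88.41\tE88.42\tH49.81\n"
    "10000001\t1\t0\t0\n"
    "10000002\t0\t0\t1\n"
    "10000003\t0\t0\t0\n"
)

# Hypothetical grouping of codes into categories (assumed, not the
# grouping used by the domain experts).
CATEGORY_OF = {
    "E88.41": "E00-E89",
    "E88.42": "E00-E89",
    "H49.81": "H00-H59",
}


def collapse_to_categories(tsv_text, category_of, id_column="pmid"):
    """Return {report_id: {category: 0 or 1}} from a code-level 0/1 table.

    A category is scored 1 when at least one of its codes is scored 1.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    categories = sorted(set(category_of.values()))
    result = {}
    for row in reader:
        scores = dict.fromkeys(categories, 0)
        for code, category in category_of.items():
            if row[code] == "1":
                scores[category] = 1
        result[row[id_column]] = scores
    return result


category_scores = collapse_to_categories(CODE_LEVEL_TSV, CATEGORY_OF)
for pmid, scores in category_scores.items():
    print(pmid, scores)
```

Keying the result by report identifier mirrors the one-row-per-report layout described above; the actual files may use different headers or delimiters, so the parsing step would need adjusting to match.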