Using data-driven sublanguage pattern mining to induce knowledge models: application in medical image reports knowledge representation
Background: The use of knowledge models facilitates information retrieval and knowledge base development, and therefore supports new knowledge discovery that ultimately enables decision support applications. Most existing works have employed machine learning techniques to construct a knowledge base. However, they often suffer from low precision in extracting entities and relationships. In this paper, we describe a data-driven sublanguage pattern mining method that can be used to create a knowledge model. We combined natural language processing (NLP) and semantic network analysis in our model generation pipeline.
Methods: As a use case of our pipeline, we utilized data from an open-source imaging case repository, Radiopaedia.org, to generate a knowledge model that represents the contents of medical imaging reports. We extracted entities and relationships using the Stanford part-of-speech parser and the “Subject:Relationship:Object” syntactic data schema. The identified noun phrases were tagged with Unified Medical Language System (UMLS) semantic types. An evaluation was performed on a dataset of 83 image notes from four data sources.
Results: A semantic type network was built based on the co-occurrence of 135 UMLS semantic types in 23,410 medical image reports. By regrouping the semantic types and generalizing the semantic network, we created a knowledge model that contains 14 semantic categories. Our knowledge model was able to cover 98% of the content in the evaluation corpus and revealed 97% of the relationships. Machine annotation achieved a precision of 87%, recall of 79%, and F-score of 82%.
Conclusion: The results indicated that our pipeline was able to produce a comprehensive content-based knowledge model that could represent content from various sources in the same domain.
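The co-occurrence network construction described in the Results can be sketched as follows; the semantic-type labels and mini-corpus below are hypothetical stand-ins for the 135 UMLS semantic types tagged across the 23,410 reports, not the paper's actual data.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence_network(reports):
    """Count how often pairs of semantic types co-occur in the same report.

    `reports` is a list of sets of semantic-type labels, one set per report
    (stand-ins for UMLS semantic types assigned to noun phrases).
    """
    edge_counts = Counter()
    for types_in_report in reports:
        # every unordered pair of distinct types in a report adds one edge
        for pair in combinations(sorted(types_in_report), 2):
            edge_counts[pair] += 1
    return edge_counts

# Hypothetical mini-corpus: each set is the semantic types found in one report.
reports = [
    {"Body Part", "Finding", "Diagnostic Procedure"},
    {"Body Part", "Finding"},
    {"Finding", "Diagnostic Procedure"},
]
network = build_cooccurrence_network(reports)
print(network[("Body Part", "Finding")])  # 2
```

The resulting weighted edge list is the raw material for the regrouping and generalization steps that yield the 14 semantic categories.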
Information Technology and Computer Science
Abstract: The healthcare system is a knowledge-driven industry that generates vast and growing volumes of narrative information from discharge summaries and reports, physicians' case notes, and pathologists' and radiologists' reports. This information is usually stored in unstructured, non-standardized formats in electronic healthcare systems, which makes it difficult for those systems to understand the information content of the narratives. Thus, access to valuable and meaningful healthcare information for decision making is a challenge. Nevertheless, Natural Language Processing (NLP) techniques have been used to structure narrative information in healthcare: they can capture unstructured healthcare information, analyze its grammatical structure, determine the meaning of the information, and translate it so that it can be easily understood by electronic healthcare systems. Consequently, NLP techniques reduce cost and improve the quality of healthcare. It is against this background that this paper reviews the NLP techniques used in healthcare, their applications, and their limitations.
Doctor of Philosophy Dissertation
Health information technology (HIT) in conjunction with quality improvement (QI) methodologies can promote higher-quality care at lower costs. Unfortunately, most inpatient hospital settings have been slow to adopt HIT and QI methodologies. Successful adoption requires close attention to workflow: the sequence of tasks and processes, and the set of people or resources needed for those tasks, that are necessary to accomplish a given goal. Assessing the impact on workflow is an important component of determining whether a HIT implementation will be successful, but little research has been conducted on the impact of eMeasure (electronic performance measure) implementation on workflow. One solution to implementation challenges such as the lack of attention to workflow is an implementation toolkit: an assembly of instruments such as checklists, forms, and planning documents. We developed an initial eMeasure Implementation Toolkit for the heart failure (HF) eMeasure to allow QI and information technology (IT) professionals and their teams to assess the impact of implementation on workflow. During the development phase of the toolkit, we undertook a literature review to determine its components. We conducted stakeholder interviews with HIT and QI key informants and subject matter experts (SMEs) at the US Department of Veterans Affairs (VA). Key informants provided a broad understanding of the context of workflow during eMeasure implementation. Using snowball sampling, we also interviewed other SMEs recommended by the key informants; these SMEs suggested tools and provided information essential to the toolkit's development. The second phase involved evaluation of the toolkit for relevance and clarity by experts in non-VA settings, who assessed the sections of the toolkit containing the tools via a survey.
The final toolkit provides a distinct set of resources and tools, iteratively developed during the research and available to users in a single source document. The research methodology provided a strong, unified, overarching implementation framework in the form of the Promoting Action on Research Implementation in Health Services (PARIHS) model, in combination with a sociotechnical model of HIT that strengthened the overall design of the study.
Automated Transformation of Semi-Structured Text Elements
Interconnected systems such as electronic health records (EHRs) have considerably improved the handling and processing of health information while keeping costs at a controlled level. Since the EHR stores virtually all data in digitized form, personal medical documents are easily and swiftly available when needed. However, multiple formats and differences in the health documents managed by various health care providers severely reduce the efficiency of the data sharing process. This paper presents a rule-based transformation system that converts semi-structured (annotated) text into standardized formats such as HL7 CDA. It identifies relevant information in the input document by analyzing its structure as well as its content, and inserts the required elements into corresponding reusable CDA templates, where the templates are selected according to the CDA document type-specific requirements.
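The rule-based transformation idea can be sketched roughly as follows; the section patterns, codes, and element names are hypothetical simplifications for illustration, not the paper's actual CDA templates or selection logic.

```python
import re

# Hypothetical rules: each maps a regex over an annotated input line to a
# CDA-like template element (the real system selects reusable HL7 CDA
# templates per document type; this only sketches the rule-to-template idea).
RULES = [
    (re.compile(r"^Diagnosis:\s*(?P<text>.+)$"),
     '<observation code="DX">{text}</observation>'),
    (re.compile(r"^Medication:\s*(?P<text>.+)$"),
     '<substanceAdministration>{text}</substanceAdministration>'),
]

def transform(lines):
    """Convert semi-structured lines into standardized XML-like elements."""
    elements = []
    for line in lines:
        for pattern, template in RULES:
            match = pattern.match(line.strip())
            if match:
                elements.append(template.format(**match.groupdict()))
                break  # first matching rule wins
    return elements

doc = ["Diagnosis: chronic subdural hematoma", "Medication: levetiracetam 500 mg"]
print(transform(doc)[0])  # <observation code="DX">chronic subdural hematoma</observation>
```

In the described system, the matched elements would then be inserted into the document-type-specific CDA template rather than emitted as a flat list.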
Enhancing rule-based text classification of neurosurgical notes using filtered feature weight vectors
Clinicians need to record clinical encounters in written or spoken language, not only because it is natural to their workflow but also for its expressivity, precision, and capacity to convey all required information, which codified structured data cannot. Therefore, the structured data required for aggregation and analysis must be obtained from clinical text as a later step. Specialised areas of medicine use their own clinical language and clinical coding systems, resulting in unique challenges for the extraction process. Rule-based information extraction has been used effectively in commercial systems and is favoured because it is easily understood and controlled. However, there is promising research into the use of machine learning techniques for extracting information, and this research explores the effectiveness of a hybrid rule-based and machine learning-based audit coding system developed for the neurosurgical department of a major trauma hospital.
Intelligent audit code generation from free text in the context of neurosurgery
Clinical auditing requires codified data for the aggregation and analysis of patterns. However, in the medical domain, obtaining structured data can be difficult, as the most natural, expressive and comprehensive way to record a clinical encounter is through natural language. The task of creating structured data from naturally expressed information is known as information extraction. Specialised areas of medicine use their own language and data structures; the translation process has unique challenges and often requires a fresh approach. This research is devoted to creating a novel semi-automated method for generating codified auditing data from clinical notes recorded in a neurosurgical department of an Australian teaching hospital. The method encapsulates specialist knowledge in rules that instantaneously make precise decisions for the majority of the matches, followed by dictionary-based matching of the remaining text.
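The two-stage matching described above (precise rules first, dictionary-based matching of the remaining text second) can be sketched as follows; the phrases and audit codes are invented for illustration and are not the department's actual coding scheme.

```python
# Hypothetical sketch of the two-stage method: precise specialist rules decide
# most matches instantly; a dictionary lookup handles the remaining text.
RULES = {
    "craniotomy for tumour": "CRANI-TUM",   # stage 1: precise phrase rules
    "burr hole drainage": "BURR-DRAIN",
}
DICTIONARY = {
    "craniotomy": "CRANI-GEN",              # stage 2: single-term fallback
    "laminectomy": "LAMI-GEN",
}

def audit_code(note):
    """Return an audit code for a clinical note, or None if nothing matches."""
    text = note.lower()
    for phrase, code in RULES.items():       # precise rules take priority
        if phrase in text:
            return code
    for term, code in DICTIONARY.items():    # dictionary-based matching
        if term in text:
            return code
    return None

print(audit_code("Emergency craniotomy for tumour resection"))  # CRANI-TUM
print(audit_code("Elective craniotomy, no complications"))      # CRANI-GEN
```

Ordering matters here: running the precise rules before the dictionary keeps the specific phrase "craniotomy for tumour" from being swallowed by the generic "craniotomy" entry.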
Addressing Semantic Interoperability and Text Annotation Concerns in Electronic Health Records using Word Embedding, Ontology and Analogy
Electronic Health Records (EHRs) create a huge number of databases that are updated dynamically. The major goal of interoperability in healthcare is to facilitate the seamless exchange of healthcare-related data in an environment that supports the secure transfer of data. Healthcare organisations face difficulties in exchanging patients' healthcare information, laboratory reports, and similar records due to a lack of semantic interoperability. Hence, semantic web technologies are needed to address healthcare interoperability problems by enabling the various standards used by different healthcare entities (doctors, clinics, hospitals, etc.) to exchange data together with its semantics, in a form that both machines and humans can understand. A framework with a similarity analyser that deals with semantic interoperability is therefore proposed in this thesis. A further consideration was the use of word embedding and ontology for knowledge discovery. In the medical domain, the main challenge for a medical information extraction system is to find the required information by considering explicit and implicit clinical context with a high degree of precision and accuracy. For semantic similarity of medical text at different levels (concept, sentence and document level), many methods and techniques have been presented; here, care was taken that the semantic content of a text includes the correct meaning of its words and sentences. A comparative analysis of approaches that apply ontology followed by word embedding, or vice versa, was carried out to determine which approach yields higher semantic similarity. Using the Kidney Cancer dataset as a use case, I concluded that each approach works better in different circumstances; however, the approach in which ontology is followed by word embedding, enriching the data first, showed better results. Apart from enriching the EHR, extracting relevant information is also challenging. To address this, the concept of analogy was applied to explain similarities between two different contents, as analogies play a significant role in understanding new concepts and help healthcare professionals communicate effectively with patients, helping them understand their disease and treatment. I therefore utilised analogies in this thesis to support the extraction of relevant information from medical text. Since accessing EHRs is challenging, tweet text was used as an alternative, as social media has emerged as a relevant data source in recent years. An algorithm is proposed to analyse medical tweets based on analogous words, and the results were used to validate the proposed methods. Two experts from the medical domain gave their views on the proposed methods in comparison with a similar method named SemDeep. The quantitative and qualitative results show that the proposed analogy-based method brings diversity and is helpful in analysing a specific disease and in text classification
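The analogy component can be illustrated with the classic vector-offset method over word embeddings; the words and 3-dimensional vectors below are toy values invented for this sketch, not the thesis's trained embeddings or its actual algorithm.

```python
from math import sqrt

# Hypothetical 3-d embeddings; a real system would use trained word vectors.
VECS = {
    "tumour": [1.0, 0.2, 0.0],
    "growth": [0.9, 0.3, 0.1],
    "kidney": [0.1, 1.0, 0.2],
    "filter": [0.2, 0.9, 0.3],
    "scan":   [0.5, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm

def analogy(a, b, c):
    """Solve a : b :: c : ? by vector offset, excluding the input words."""
    target = [vb - va + vc for va, vb, vc in zip(VECS[a], VECS[b], VECS[c])]
    candidates = {w: v for w, v in VECS.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine(target, candidates[w]))

print(analogy("tumour", "growth", "kidney"))  # "filter" with these toy vectors
```

With real embeddings, the same offset arithmetic surfaces analogous words that can anchor an unfamiliar medical concept to a familiar one.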
COHORT IDENTIFICATION FROM FREE-TEXT CLINICAL NOTES USING SNOMED CT’S SEMANTIC RELATIONS
In this paper, a new cohort identification framework that exploits the semantic hierarchy of SNOMED CT is proposed to overcome the limitations of supervised machine learning-based approaches. Eligibility criteria descriptions and free-text clinical notes from the 2018 National NLP Clinical Challenge (n2c2) were processed to map them to relevant SNOMED CT concepts and to measure semantic similarity between the eligibility criteria and patients. A patient was deemed eligible if their similarity score exceeded a threshold cut-off value, established at the point where the best F1 score could be achieved. The performance of the proposed system was evaluated for three eligibility criteria. The framework's macro-average F1 score across the three criteria was higher than the previously reported results of the 2018 n2c2 (0.933 vs. 0.889). This study demonstrated that SNOMED CT alone can be leveraged for cohort identification tasks without referring to external textual sources for training.
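The threshold selection step can be sketched as follows; the similarity scores and eligibility labels are hypothetical stand-ins for the criterion-to-patient similarity scores the framework computes from SNOMED CT.

```python
def f1(labels, preds):
    """F1 score for boolean labels and predictions."""
    tp = sum(1 for l, p in zip(labels, preds) if l and p)
    fp = sum(1 for l, p in zip(labels, preds) if not l and p)
    fn = sum(1 for l, p in zip(labels, preds) if l and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(scores, labels):
    """Pick the cut-off where F1 on the development data is highest."""
    return max(set(scores),
               key=lambda t: f1(labels, [s >= t for s in scores]))

# Hypothetical criterion-to-patient similarity scores with gold eligibility.
scores = [0.91, 0.85, 0.40, 0.78, 0.30]
labels = [True, True, False, True, False]
cutoff = best_threshold(scores, labels)
eligible = [s >= cutoff for s in scores]  # patients at or above the cut-off
```

Here the observed scores themselves serve as candidate thresholds, which is sufficient because F1 only changes when the threshold crosses a score.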
Doctor of Philosophy Dissertation
Electronic Health Records (EHRs) provide a wealth of information for secondary uses. Methods are developed to improve the usefulness of free-text query and text processing, and to demonstrate the advantages of using these methods for clinical research, specifically cohort identification and enhancement. Cohort identification is a critical early step in clinical research: problems may arise when too few patients are identified, or when the cohort consists of a nonrepresentative sample. Methods of improving query formation through query expansion are described. The inclusion of free-text search in addition to structured data search is investigated to determine the incremental improvement of adding unstructured text search over structured data search alone. Query expansion using topic- and synonym-based expansion improved information retrieval performance; an ensemble method was not successful. The addition of free-text search compared to structured data search alone increased cohort size in all cases, with dramatic increases in some, and improved the representation of patients in subpopulations that may otherwise have been underrepresented. We demonstrate clinical impact by showing that a serious clinical condition, scleroderma renal crisis, can be predicted by adding free-text search. A novel information extraction algorithm, Regular Expression Discovery for Extraction (REDEx), is developed and evaluated for cohort enrichment. The REDEx algorithm is demonstrated to accurately extract information from free-text clinical narratives; temporal expressions as well as bodyweight-related measures are extracted. Additional patients and additional measurement occurrences are identified using these extracted values that were not identifiable through structured data alone. The REDEx algorithm transfers the burden of machine learning training from annotators to domain experts.
We developed automated query expansion methods that greatly improve the performance of keyword-based information retrieval. We also developed NLP methods for unstructured data and demonstrated that cohort size can be greatly increased, a more complete population can be identified, and important clinical conditions can be detected that are otherwise often missed. We found that a much more complete representation of patients can be obtained. We also developed a novel machine learning algorithm for information extraction, REDEx, that efficiently extracts clinical values from unstructured clinical text, adding information and observations beyond what is available in structured data alone.
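The kind of pattern REDEx discovers can be illustrated with a hand-written regular expression for bodyweight measures; REDEx learns such expressions automatically from annotated snippets, so the pattern below is only a hypothetical example of the sort of rule it would produce.

```python
import re

# Hand-written sketch of a bodyweight-extraction pattern: matches phrasings
# like "Wt 82.5 kg" or "weight: 85 kg" and captures the value and unit.
WEIGHT_RE = re.compile(
    r"\b(?:weight|wt)\.?\s*(?:is|:|=)?\s*(\d{1,3}(?:\.\d+)?)\s*(kg|lbs?)\b",
    re.IGNORECASE,
)

def extract_weights(note):
    """Return (value, unit) pairs found in a free-text clinical note."""
    return [(float(value), unit.lower()) for value, unit in WEIGHT_RE.findall(note)]

note = "Pt seen today. Wt 82.5 kg, down from weight: 85 kg last visit."
print(extract_weights(note))  # [(82.5, 'kg'), (85.0, 'kg')]
```

Values extracted this way can then be joined back to the patient record, surfacing measurement occurrences that structured fields alone miss.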