Methods to Facilitate the Capture, Use, and Reuse of Structured and Unstructured Clinical Data.
Electronic health records (EHRs) have great potential to improve quality of care and to support clinical and translational research. While EHRs are being increasingly implemented in U.S. hospitals and clinics, their anticipated benefits have been largely unachieved or underachieved. Among many factors, tedious documentation requirements and the lack of effective information retrieval tools to access and reuse data are two key reasons accounting for this deficiency. In this dissertation, I describe my research on developing novel methods to facilitate the capture, use, and reuse of both structured and unstructured clinical data.
Specifically, I develop a framework to investigate potential issues in this research topic, with a focus on three significant challenges. The first challenge is structured data entry (SDE), which can be facilitated by four effective strategies based on my systematic review. I further propose a multi-strategy model to guide the development of future SDE applications. In the follow-up study, I focus on workflow integration and evaluate the feasibility of using EHR audit trail logs for clinical workflow analysis. The second challenge is the use of clinical narratives, which can be supported by my innovative information retrieval (IR) technique called “semantically-based query recommendation (SBQR)”. My user experiment shows that SBQR can help improve the perceived performance of a medical IR system, and may work better on search tasks of average difficulty. The third challenge involves reusing EHR data as a reference standard to benchmark the quality of other health-related information. My study assesses the readability of trial descriptions on ClinicalTrials.gov and finds that trial descriptions are very hard to read, even harder than clinical notes.
My dissertation makes several contributions. First, it conducts pioneering studies with innovative methods to improve the capture, use, and reuse of clinical data. Second, it provides successful examples for investigators who would like to conduct interdisciplinary research in the field of health informatics. Third, the framework of my research can serve as a tool for generating future research agendas in clinical documentation and EHRs. I will continue exploring innovative and effective methods to maximize the value of EHRs.
PhD, Information, University of Michigan, Horace H. Rackham School of Graduate Studies
http://deepblue.lib.umich.edu/bitstream/2027.42/135845/1/tzuyu_1.pd
Inferring Strategies for Sentence Ordering in Multidocument News Summarization
The problem of organizing information for multidocument summarization so that
the generated summary is coherent has received relatively little attention.
While sentence ordering for single document summarization can be determined
from the ordering of sentences in the input article, this is not the case for
multidocument summarization where summary sentences may be drawn from different
input articles. In this paper, we propose a methodology for studying the
properties of ordering information in the news genre and describe experiments
done on a corpus of multiple acceptable orderings we developed for the task.
Based on these experiments, we implemented a strategy for ordering information
that combines constraints from chronological order of events and topical
relatedness. Evaluation of our augmented algorithm shows a significant
improvement of the ordering over two baseline strategies.
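One way to picture an ordering strategy that combines chronology with topical relatedness is the minimal Python sketch below. It is not the authors' algorithm: the `Sentence` fields, the `topic` labels (assumed to come from some earlier clustering step), and the rule of ordering topic blocks by their earliest event date are all assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Sentence:
    text: str
    event_date: date  # hypothetical: date of the event the sentence reports
    topic: str        # hypothetical: label from an earlier topic-clustering step


def order_sentences(sentences):
    """Keep topically related sentences together, ordered chronologically."""
    groups = defaultdict(list)
    for sentence in sentences:
        groups[sentence.topic].append(sentence)

    # Within each topic block, order sentences by event date.
    for members in groups.values():
        members.sort(key=lambda s: s.event_date)

    # Order the topic blocks by the date of their earliest sentence.
    blocks = sorted(groups.values(), key=lambda members: members[0].event_date)
    return [sentence for block in blocks for sentence in block]


summary = order_sentences([
    Sentence("Rescue teams arrived.", date(2001, 1, 3), "response"),
    Sentence("The storm made landfall.", date(2001, 1, 1), "storm"),
    Sentence("Winds weakened inland.", date(2001, 1, 2), "storm"),
])
print([s.text for s in summary])
```

A purely chronological baseline would interleave topics whenever their dates overlap; grouping by topic first trades some temporal precision for local coherence.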
Text Summarization Techniques: A Brief Survey
In recent years, there has been an explosion in the amount of text data from a
variety of sources. This volume of text is an invaluable source of information
and knowledge which needs to be effectively summarized to be useful. In this
review, the main approaches to automatic text summarization are described. We
review the different processes for summarization and describe the effectiveness
and shortcomings of the different methods.
Computational Approaches to Assisting Patients' Medical Comprehension from Electronic Health Records
Patient-centered care has been established as a fundamental approach to improve the quality of health care in a seminal report by the Institute of Medicine published at the start of the century. Improved access to health information and demand for greater transparency contributed to its move into the mainstream. Research has also demonstrated that actively involving patients in the management of their own health can lead to better outcomes, and potentially lower costs. However, despite the efforts in many areas of medicine to embrace patient-centered care, engaging patients is still considered a challenge. One of the barriers is the lack of effective tools to help patients understand their health conditions, options and their consequences.
Patient portals are now widely adopted by hospitals and other healthcare practices to provide patients with the capabilities to view their own Electronic Health Records. They are a rich resource of information for patients. However, the language in the records is generally difficult for patients without training in medicine to understand. Furthermore, the amount of information can often be overwhelming. In this work, we propose computational approaches to foster patient engagement from three aspects by exploiting the rich information in the medical records.
First, we design a framework to automatically generate health literacy instruments to measure a patient's literacy levels. This framework exploits readily available large-scale corpora to generate instruments in a commonly used test format. Second, we investigate methods that can determine the readability of complex documents such as health records. We propose to rank document readability, instead of assigning a grade level or a pre-defined difficulty category. Lastly, we examine the problem of finding targeted educational materials to facilitate patient comprehension of medical notes. We study methods to formulate effective queries from specialized and long clinical narratives. In addition, we propose a neural network-based method to identify medical concepts that are important to patients.
The three aspects of this work address the issues of the overabundance and technical complexity of medical language in health records. We demonstrate that our approaches are effective with various experiments and evaluation metrics.
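For context on what "assigning a grade level" typically means, the sketch below computes the standard Flesch-Kincaid Grade Level, 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The formula is the published one; the sentence splitter and the vowel-run syllable counter are crude assumptions chosen to keep the example self-contained, and none of this is taken from the dissertation itself.

```python
import re


def count_syllables(word):
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid Grade Level with naive sentence and word splitting."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


easy = "The cat sat."
hard = "Patient exhibits idiopathic thrombocytopenia."
print(flesch_kincaid_grade(easy) < flesch_kincaid_grade(hard))
```

Because the score depends only on sentence length and syllable counts, it cannot tell a familiar long word from an unfamiliar medical term, which is one motivation for ranking documents against each other instead.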
A matter of words: NLP for quality evaluation of Wikipedia medical articles
Automatic quality evaluation of Web information is a task with many fields of
applications and of great relevance, especially in critical domains like the
medical one. We start from the intuition that the content quality of medical
Web documents is affected by features related to the specific domain. These
include the usage of a specific vocabulary (Domain Informativeness), the adoption
of specific codes (like those used in the infoboxes of Wikipedia articles), and
the type of document (e.g., historical and technical ones). In this paper, we
propose to leverage specific domain features to improve the results of the
evaluation of Wikipedia medical articles. In particular, we evaluate the
articles adopting an "actionable" model, whose features are related to the
content of the articles, so that the model can also directly suggest strategies
for improving a given article quality. We rely on Natural Language Processing
(NLP) and dictionary-based techniques in order to extract the bio-medical
concepts in a text. We prove the effectiveness of our approach by classifying
the medical articles of the Wikipedia Medicine Portal, which have been
previously manually labeled by the Wiki Project team. The results of our
experiments confirm that, by considering domain-oriented features, it is
possible to obtain appreciable improvements with respect to existing solutions,
mainly for those articles that other approaches have less correctly classified.
Besides being interesting in their own right, the results call for further
research in the area of domain-specific features suitable for Web data quality
assessment.
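As a rough illustration of a dictionary-based vocabulary feature, the sketch below scores a text by the share of its tokens found in a term list. This is an assumed, simplified stand-in and not the paper's definition of Domain Informativeness; `MEDICAL_TERMS` is a hypothetical toy dictionary, whereas a real system would load a full biomedical vocabulary.

```python
import re

# Hypothetical toy dictionary, used here only for illustration.
MEDICAL_TERMS = {"hypertension", "insulin", "diabetes", "artery", "diagnosis"}


def domain_term_ratio(text, dictionary=MEDICAL_TERMS):
    """Fraction of tokens in the text that appear in the domain dictionary."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return hits / len(tokens)


print(domain_term_ratio("Insulin therapy is a common treatment for diabetes."))
```

A feature like this is "actionable" in the abstract's sense: a low value points directly at something an editor could change, namely the use of domain vocabulary.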
Assessing the Readability of Medical Documents: A Ranking Approach
BACKGROUND: The use of electronic health record (EHR) systems with patient engagement capabilities, including viewing, downloading, and transmitting health information, has recently grown tremendously. However, using these resources to engage patients in managing their own health remains challenging due to the complex and technical nature of the EHR narratives.
OBJECTIVE: Our objective was to develop a machine learning-based system to assess readability levels of complex documents such as EHR notes.
METHODS: We collected difficulty ratings of EHR notes and Wikipedia articles using crowdsourcing from 90 readers. We built a supervised model to assess readability based on relative orders of text difficulty using both surface text features and word embeddings. We evaluated system performance using the Kendall coefficient of concordance against human ratings.
RESULTS: Our system achieved significantly higher concordance (.734) with human annotators than did a baseline using the Flesch-Kincaid Grade Level, a widely adopted readability formula (.531). The improvement was also consistent across different disease topics. This method's concordance with an individual human user's ratings was also higher than the concordance between different human annotators (.658).
CONCLUSIONS: We explored methods to automatically assess the readability levels of clinical narratives. Our ranking-based system using simple textual features and easy-to-learn word embeddings outperformed a widely used readability formula. Our ranking-based method can predict relative difficulties of medical documents. It is not constrained to a predefined set of readability levels, a common design in many machine learning-based systems. Furthermore, the feature set does not rely on complex processing of the documents. One potential application of our readability ranking is personalization, allowing patients to better accommodate their own background knowledge.
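The Kendall coefficient of concordance (Kendall's W) named in the METHODS can be computed as in the sketch below, which uses the textbook formula W = 12S / (m²(n³ − n)) for m raters ranking n items, where S is the sum of squared deviations of the per-item rank sums from their mean. It assumes every rater supplies a complete ranking with no ties; how the study built rankings from crowdsourced ratings or handled ties is not stated in the abstract, so that part is not modeled.

```python
def kendalls_w(rankings):
    """Kendall's W for m complete, tie-free rankings of the same n items.

    rankings[j][i] is the rank (1..n) that rater j assigns to item i.
    """
    m = len(rankings)
    if m < 2:
        raise ValueError("need at least two raters")
    n = len(rankings[0])
    if n < 2:
        raise ValueError("need at least two items")
    for ranking in rankings:
        if sorted(ranking) != list(range(1, n + 1)):
            raise ValueError("each ranking must be a permutation of 1..n")

    # Sum of ranks each item received across all raters.
    rank_sums = [sum(ranking[i] for ranking in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((rank_sum - mean_rank_sum) ** 2 for rank_sum in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Two raters who agree on two of four positions (illustrative data only).
print(kendalls_w([[1, 2, 3, 4], [1, 3, 2, 4]]))
```

W ranges from 0 (no agreement among raters) to 1 (identical rankings), so the reported values of .734, .531, and .658 can be read on that scale.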