Electronic Health Record Summarization over Heterogeneous and Irregularly Sampled Clinical Data
The increasing adoption of electronic health records (EHRs) has led to an unprecedented amount of patient health information stored in an electronic format. The ability to comb through this information is imperative, both for patient care and computational modeling. A system that minimizes unnecessary EHR data, automatically distills longitudinal patient information, and highlights salient parts of a patient’s record is currently an unmet need. However, summarization of EHR data is not a trivial task, as reasoning over this data poses many challenges. EHR data elements are most often obtained at irregular intervals, as patients are more likely to receive medical care when they are ill than when they are healthy. The presence of narrative documentation adds another layer of complexity, as the notes are riddled with over-sampled text, often caused by frequent copy-and-pasting during the documentation process.
This dissertation synthesizes a set of challenges for automated EHR summarization identified in the literature and presents an array of methods for dealing with some of these challenges. We used hybrid data-driven and knowledge-based approaches to examine abundant redundancy in clinical narrative text, a data-driven approach to identify and mitigate biases in laboratory testing patterns with implications for using clinical data for research, and a probabilistic modeling approach to automatically summarize patient records and learn computational models of disease with heterogeneous data types. The dissertation also demonstrates two applications of the developed methods to important clinical questions: laboratory test overutilization and cohort selection from EHR data.
Domain-sensitive topic management in a modular conversational agent framework
Flexible nontask-oriented conversational agents require content for generating responses and mechanisms for choosing appropriate topics to drive interactions with users. Structured knowledge resources such as ontologies are a useful mechanism to represent conversational topics. In order to develop the topic-management mechanism, we addressed a number of research issues related to the development of the required infrastructure. First, we address the issue of heavy human involvement in the construction of knowledge resources by proposing a four-stage automatic process for building domain-specific ontologies. These ontologies are composed of a set of subtaxonomies obtained from WordNet, an electronic dictionary that arranges concepts in a hierarchical structure. The roots of these subtaxonomies are obtained from Wikipedia’s article links, or wikilinks, under the hypothesis that wikilinks convey a sense of relatedness between the article consulted and its destinations. With the knowledge structures defined, we explore the possibility of using semantic relatedness over these domain-specific ontologies as a means to propose conversational topics in a coherent manner. For this, we examine different automatic measures of semantic relatedness to determine which correlates with human judgements obtained from an automatically constructed dataset. We then examine the question of whether domain information influences the human perception of semantic relatedness in a way that automatic measures do not replicate. This study requires us to design and implement a process to build datasets of concept pairs like those used in the literature to evaluate automatic measures of semantic relatedness, but with domain information associated.
This study shows, with statistical significance, that existing measures of semantic relatedness do not take domain into consideration, and that including domain as a factor in this calculation can enhance the agreement of automatic measures with human assessments. Finally, this artificially constructed measure is integrated into the Toy’s dialogue manager to help select conversational topics in real time. This supplements our result that the use of semantic relatedness seems to produce more coherent and interesting topic transitions than existing mechanisms.