120 research outputs found

    Improving Syntactic Parsing of Clinical Text Using Domain Knowledge

    Syntactic parsing is one of the fundamental tasks of Natural Language Processing (NLP). However, few studies have explored syntactic parsing in the medical domain. This dissertation systematically investigated different methods to improve the performance of syntactic parsing of clinical text, including (1) constructing two clinical treebanks of discharge summaries and progress notes by developing annotation guidelines that handle missing elements in clinical sentences; (2) retraining four state-of-the-art parsers (the Stanford, Berkeley, Charniak, and Bikel parsers) on the clinical treebanks and comparing their performance to identify better parsing approaches; and (3) developing new methods that use semantic information to reduce the syntactic ambiguity caused by prepositional phrase (PP) attachment and coordination. Our evaluation showed that the clinical treebanks greatly improved the performance of existing parsers; the Berkeley parser achieved the best F1 score of 86.39% on the MiPACQ treebank. For PP attachment, our proposed methods improved attachment accuracy by 2.35% on the MiPACQ corpus and 1.77% on the i2b2 corpus. For coordination, our method achieved precisions of 94.9% and 90.3% on the MiPACQ and i2b2 corpora, respectively. To further demonstrate the effectiveness of the improved parsing approaches, we applied the outputs of our parsers to two external NLP tasks: semantic role labeling and temporal relation extraction. The experimental results showed that the performance of both tasks improved when using parse tree information from our optimized parsers, with gains in F-measure of 3.26% for semantic role labeling and 1.5% for temporal relation extraction.
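    The dissertation's actual disambiguation models are not described in the abstract, but a minimal sketch of the underlying idea, using the semantic class of a preposition's object to choose an attachment site, might look like the following (the lexicon, semantic types, and preference rules are illustrative placeholders, not the dissertation's resources):

```python
# Toy sketch of PP-attachment disambiguation with semantic classes.
# A full method would also consult the semantic types of the verb and
# head noun; here only the PP object's type drives the decision.

SEMANTIC_TYPE = {
    "chest": "BODY_PART",
    "antibiotics": "DRUG",
}

# (preposition, semantic type of PP object) -> preferred attachment site
ATTACHMENT_PREFERENCE = {
    ("in", "BODY_PART"): "noun",   # "pain in the chest" modifies the noun
    ("with", "DRUG"): "verb",      # "treated ... with antibiotics" modifies the verb
}

def resolve_pp_attachment(verb, head_noun, prep, pp_object):
    """Return 'verb' or 'noun' as the attachment site for the PP."""
    obj_type = SEMANTIC_TYPE.get(pp_object, "UNKNOWN")
    # Default to low (noun) attachment when no rule fires.
    return ATTACHMENT_PREFERENCE.get((prep, obj_type), "noun")

if __name__ == "__main__":
    print(resolve_pp_attachment("reported", "pain", "in", "chest"))             # noun
    print(resolve_pp_attachment("treated", "infection", "with", "antibiotics")) # verb
```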

    Clinical decision support: Knowledge representation and uncertainty management

    Doctoral Programme in Biomedical Engineering. Decision-making in clinical practice faces many challenges due to the inherent risks of being a health care professional. From medical error to undesired variations in clinical practice, the mitigation of these issues seems to be tightly connected to adherence to Clinical Practice Guidelines as evidence-based recommendations. The deployment of Clinical Practice Guidelines in computational systems for clinical decision support has the potential to positively impact health care. However, current approaches to Computer-Interpretable Guidelines evidence a set of issues that leave them wanting. These issues are related to the lack of expressiveness of their underlying models, the complexity of knowledge acquisition with their tools, the absence of support for the clinical decision-making process, and the style of communication of Clinical Decision Support Systems implementing Computer-Interpretable Guidelines. Such issues pose obstacles that prevent these systems from exhibiting properties like modularity, flexibility, adaptability, and interactivity. All these properties reflect the concept of living guidelines. The purpose of this doctoral thesis is, thus, to provide a framework that enables the expression of these properties. Modularity is conferred by the ontological definition of Computer-Interpretable Guidelines and the assistance in guideline acquisition provided by an editing tool, allowing for the management of multiple knowledge patterns that can be reused. Flexibility is provided by the representation primitives defined in the ontology, meaning that the model is adjustable to guidelines from different categories and specialities. As for adaptability, this property is conferred by mechanisms of Speculative Computation, which allow the Decision Support System not only to reason with incomplete information but also to adapt to changes of state, such as suddenly learning the missing information. The solution proposed for interactivity consists in embedding Computer-Interpretable Guideline advice directly into the daily life of health care professionals and providing a set of reminders and notifications that help them keep track of their tasks and responsibilities. Together, these solutions make up the CompGuide framework for the expression of Clinical Decision Support Systems based on Computer-Interpretable Guidelines. The work of the PhD candidate Tiago José Martins Oliveira is supported by a grant from FCT - Fundação para a Ciência e a Tecnologia (Portuguese Foundation for Science and Technology) with the reference SFRH/BD/85291/2012.
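    The abstract does not detail how Speculative Computation is realised in CompGuide; the sketch below only illustrates the general idea of reasoning with a default for a missing datum and revising the conclusion once the real value arrives (all names, defaults, and recommendations are hypothetical, not taken from the thesis):

```python
# Minimal sketch of speculative computation for guideline execution:
# compute a tentative recommendation under a default assumption, then
# recompute when the missing information becomes known.

DEFAULTS = {"allergic_to_penicillin": False}  # speculative assumption

def recommend(findings):
    """Return a treatment recommendation given (possibly incomplete) findings."""
    allergic = findings.get("allergic_to_penicillin",
                            DEFAULTS["allergic_to_penicillin"])
    return "prescribe_macrolide" if allergic else "prescribe_penicillin"

findings = {"diagnosis": "pneumonia"}        # allergy status still unknown
print("tentative:", recommend(findings))     # prescribe_penicillin (under default)

findings["allergic_to_penicillin"] = True    # the missing datum arrives
print("revised:", recommend(findings))       # prescribe_macrolide (conclusion revised)
```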

    Data Science and Knowledge Discovery

    Data Science (DS) is gaining significant importance in the decision process because it draws on a mix of areas, including Computer Science, Machine Learning, Mathematics and Statistics, domain/business knowledge, software development, and traditional research. In the business field, the application of DS allows the use of scientific methods, processes, algorithms, and systems to extract knowledge and insights from structured and unstructured data to support the decision process. After collecting the data, it is crucial to discover the knowledge it contains. In this step, Knowledge Discovery (KD) tasks are used to create knowledge from structured and unstructured sources (e.g., text, data, and images). The output needs to be in a readable and interpretable format, and it must represent knowledge in a manner that facilitates inferencing. KD is applied in several areas, such as education, health, accounting, energy, and public administration. This book includes fourteen excellent articles that discuss this trending topic and present innovative solutions, showing the importance of Data Science and Knowledge Discovery to researchers, managers, industry, society, and other communities. The chapters address several topics, such as data mining, deep learning, data visualization and analytics, semantic data, geospatial and spatio-temporal data, data augmentation, and text mining.

    Attention is more than prediction precision [Commentary on target article]

    A cornerstone of the target article is that, in a predictive coding framework, attention can be modelled by weighting prediction error with a measure of precision. We argue that this is not a complete explanation, especially in the light of ERP (event-related potential) data showing large evoked responses for frequently presented target stimuli, which are thus predicted.
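    As a rough illustration of the modelling idea the commentary responds to, precision acts as a multiplicative gain on the raw prediction error; the toy values below are arbitrary and only show the weighting itself:

```python
# Sketch of precision-weighted prediction error: in the predictive coding
# account discussed here, attention corresponds to a high precision term
# that scales up the influence of a given sensory prediction error.

def weighted_prediction_error(observation, prediction, precision):
    """Precision (inverse variance) scales the raw prediction error."""
    return precision * (observation - prediction)

# The same raw error gains influence when precision (attention) is high.
print(weighted_prediction_error(1.0, 0.2, precision=0.5))  # 0.4 (low attention)
print(weighted_prediction_error(1.0, 0.2, precision=4.0))  # 3.2 (high attention)
```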

    Temporal Information in Data Science: An Integrated Framework and its Applications

    Data science is a well-known buzzword that is in fact composed of two distinct keywords, i.e., data and science. Data itself is of great importance: each analysis task begins from a set of examples. Based on this consideration, the present work starts with the analysis of a real case scenario: the development of a data warehouse-based decision support system for an Italian contact center company. Then, relying on the information collected in the developed system, a set of machine learning-based analysis tasks were developed to answer specific business questions, such as employee work anomaly detection and automatic call classification. Although these initial applications rely on already available algorithms, as we shall see, some clever analysis workflows also had to be developed. Afterwards, continuously driven by real data and real-world applications, we turned to the question of how to handle temporal information within classical decision tree models. Our research led to the development of J48SS, a decision tree induction algorithm based on Quinlan's C4.5 learner, which is capable of dealing with temporal (e.g., sequential and time series) as well as atemporal (such as numerical and categorical) data during the same execution cycle. The decision tree has been applied to several real-world analysis tasks, proving its worth. A key characteristic of J48SS is its interpretability, an aspect that we specifically addressed through the study of an evolutionary decision tree pruning technique. Next, since a lot of work concerning the management of temporal information has already been done in the automated reasoning and formal verification fields, a natural direction in which to proceed was to investigate how such solutions may be combined with machine learning, following two main tracks. First, we show, through the development of an enriched decision tree capable of encoding temporal information by means of interval temporal logic formulas, how a machine learning algorithm can successfully exploit temporal logic to perform data analysis. Then, we focus on the opposite direction, i.e., employing machine learning techniques to generate temporal logic formulas, considering a natural language processing scenario. Finally, as a conclusive development, the architecture of a system is proposed in which formal methods and machine learning techniques are seamlessly combined to perform anomaly detection and predictive maintenance tasks. Such an integration represents an original, thrilling research direction that may open up new ways of dealing with complex, real-world problems.
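    J48SS's actual mechanics are not reproduced here, but the following toy sketch illustrates the general idea of mixing temporal and atemporal tests within a single tree: one node tests the distance between a time series and a reference pattern (shapelet-style), another tests an ordinary numeric attribute (all data, names, and thresholds are invented):

```python
# Toy illustration of a decision node mixing a temporal test (distance of
# a time series to a reference pattern) with an atemporal numeric test.
# This shows the concept only, not J48SS's induction algorithm.

def pattern_distance(series, pattern):
    """Smallest Euclidean distance of `pattern` over all windows of `series`."""
    best = float("inf")
    for i in range(len(series) - len(pattern) + 1):
        window = series[i:i + len(pattern)]
        dist = sum((a - b) ** 2 for a, b in zip(window, pattern)) ** 0.5
        best = min(best, dist)
    return best

def classify(example, pattern):
    # Temporal test first: does the series contain something close to the pattern?
    if pattern_distance(example["series"], pattern) < 1.0:
        return "anomalous"
    # Atemporal fallback test on a plain numeric attribute.
    return "anomalous" if example["calls_per_hour"] > 50 else "normal"

example = {"series": [0.1, 0.2, 3.0, 3.1, 0.2], "calls_per_hour": 20}
print(classify(example, pattern=[3.0, 3.0]))  # anomalous (temporal branch fires)
```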

    Geographic information extraction from texts

    A large volume of unstructured text containing valuable geographic information is available online. This information, provided implicitly or explicitly, is useful not only for scientific studies (e.g., spatial humanities) but also for many practical applications (e.g., geographic information retrieval). Although substantial progress has been achieved in geographic information extraction from texts, there are still unsolved challenges and issues, ranging from methods, systems, and data to applications and privacy. Therefore, this workshop will provide a timely opportunity to discuss recent advances, new ideas, and concepts, and to identify research gaps in geographic information extraction.

    Generic adaptation framework for unifying adaptive web-based systems

    The Generic Adaptation Framework (GAF) research project first and foremost creates a common formal framework for describing current and future adaptive hypermedia systems (AHS) and adaptive web-based systems in general. It provides a commonly agreed-upon taxonomy and a reference model that encompasses the most general architectures of the present and future, including conventional AHS and different types of personalization-enabling systems and applications, such as recommender systems (RS), personalized web search, semantic web-enabled applications used in personalized information delivery, adaptive e-Learning applications, and many more. At the same time, GAF tries to bring together two (seemingly non-intersecting) views on adaptation: the classical pre-authored type, with conventional domain and overlay user models, and data-driven adaptation, which includes a set of data mining, machine learning, and information retrieval tools. To bring these research fields together, we conducted a number of GAF compliance studies covering RS, AHS, and other applications combining adaptation, recommendation, and search. We also performed a number of case studies of real systems to prove the point and to carry out a detailed analysis and evaluation of the framework. Secondly, the project introduces a number of new ideas in the field of adaptive hypermedia, such as the Generic Adaptation Process (GAP), which aligns with a layered (data-oriented) architecture and serves as a reference adaptation process; this also helps to understand the compliance features mentioned earlier. Besides that, GAF deals with important and novel aspects of adaptation-enabling and -leveraging technologies, such as provenance and versioning. The existence of such a reference basis should stimulate AHS research and enable researchers to demonstrate ideas for new adaptation methods much more quickly than if they had to start from scratch. GAF will thus help bootstrap any adaptive web-based system research, design, analysis, and evaluation.
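    As a rough illustration of the classical pre-authored view that GAF generalises, a conventional overlay user model records per-concept knowledge over a domain model, and an adaptation rule consults it; the sketch below is a generic, hypothetical example and not part of GAF itself:

```python
# Illustrative overlay user model: user knowledge is stored as an overlay
# of the domain model, and a simple adaptation rule gates recommendations
# on prerequisite mastery. Concepts and thresholds are invented.

DOMAIN_CONCEPTS = {"html": [], "css": ["html"], "javascript": ["html"]}  # prerequisites

class OverlayUserModel:
    def __init__(self):
        self.knowledge = {c: 0.0 for c in DOMAIN_CONCEPTS}  # mastery in [0, 1]

    def update(self, concept, score):
        self.knowledge[concept] = max(self.knowledge[concept], score)

    def ready_for(self, concept):
        # Adaptation rule: recommend a concept only once all prerequisites
        # are mastered above a fixed threshold.
        return all(self.knowledge[p] >= 0.6 for p in DOMAIN_CONCEPTS[concept])

um = OverlayUserModel()
print(um.ready_for("css"))   # False: prerequisite "html" not yet mastered
um.update("html", 0.8)       # evidence of mastery arrives (e.g., a quiz score)
print(um.ready_for("css"))   # True: the rule now recommends "css"
```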