46 research outputs found

    Application of information extraction techniques to pharmacological domain: extracting drug-drug interactions

    A drug-drug interaction occurs when one drug influences the level or activity of another drug. The detection of drug interactions is an important research area in patient safety, since these interactions can become very dangerous and increase health care costs. Although there are different databases supporting health care professionals in the detection of drug interactions, this kind of resource is rarely complete. Drug interactions are frequently reported in journals of clinical pharmacology, making the medical literature the most effective source for their detection. However, the increasing volume of the literature overwhelms health care professionals trying to keep an up-to-date collection of all reported drug-drug interactions. The development of automatic methods for collecting, maintaining and interpreting this information is crucial for achieving a real improvement in their early detection. Information Extraction (IE) techniques can provide an interesting way of reducing the time spent by health care professionals on reviewing the literature. Nevertheless, no previous approach had addressed the extraction of drug-drug interactions from biomedical texts. In this thesis, we have conducted a detailed study of various IE techniques applied to the biomedical domain. Based on this study, we have proposed two different approaches for the extraction of drug-drug interactions from texts. The first is a hybrid approach, which combines shallow parsing and pattern matching to extract relations between drugs from biomedical texts. The second is based on supervised machine learning, in particular kernel methods. In addition, we have created and annotated the DrugDDI corpus, the first corpus annotated with drug-drug interactions, which allows us to evaluate and compare both approaches. To the best of our knowledge, the DrugDDI corpus is the only available corpus annotated for drug-drug interactions, and this thesis is the first work to address the problem of extracting drug-drug interactions from biomedical texts. We believe the DrugDDI corpus is an important contribution because it could encourage other research groups to investigate this problem. We have also defined three auxiliary processes that provide crucial information to the two approaches: (1) a text-analysis process based on the UMLS MetaMap Transfer tool (MMTx), which provides shallow syntactic and semantic information; (2) a process for drug name recognition and classification; and (3) a process for drug anaphora resolution. Finally, we have developed a pipeline prototype which integrates these auxiliary processes; the pipeline architecture allows us to easily combine them with either of the approaches proposed in this thesis, pattern matching or kernels. Several experiments were performed on the DrugDDI corpus. They show that while the first approach, based on pattern matching, achieves low performance, the kernel-based approach achieves performance comparable to that obtained on similar tasks such as the extraction of protein-protein interactions. This work has been partially supported by the Spanish research projects MAVIR consortium (S-0505/TIC-0267, www.mavir.net), a network of excellence funded by the Madrid Regional Government, and TIN2007-67407-C03-01 (BRAVO: Advanced Multimodal and Multilingual Question Answering).
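    The lexical-pattern side of the hybrid approach can be illustrated with a minimal sketch. The patterns, function names and drug names below are invented for illustration; they are not the pharmacist-defined patterns used in the thesis, which operate over MMTx-annotated text rather than raw strings.

```python
import re

# Hypothetical lexical patterns of the kind a pharmacist might define;
# each captures two drug mentions around an interaction trigger phrase.
PATTERNS = [
    re.compile(r"(?P<drug1>\w+) (?:may )?increases? the (?:effect|level|toxicity) of (?P<drug2>\w+)", re.I),
    re.compile(r"(?P<drug1>\w+) should not be (?:taken|administered) with (?P<drug2>\w+)", re.I),
    re.compile(r"co-?administration of (?P<drug1>\w+) (?:and|with) (?P<drug2>\w+)", re.I),
]

def extract_ddis(sentence: str) -> list[tuple[str, str]]:
    """Return the (drug1, drug2) pairs matched by any lexical pattern."""
    pairs = []
    for pattern in PATTERNS:
        for match in pattern.finditer(sentence):
            pairs.append((match.group("drug1"), match.group("drug2")))
    return pairs

print(extract_ddis("Ketoconazole increases the level of simvastatin."))
# -> [('Ketoconazole', 'simvastatin')]
```

    Surface patterns of this kind explain the precision/recall trade-off reported in the thesis: matches are usually genuine interactions, but any paraphrase outside the pattern set is missed.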

    A linguistic rule-based approach to extract drug-drug interactions from pharmacological documents

    Background: A drug-drug interaction (DDI) occurs when one drug influences the level or activity of another drug. The increasing volume of the scientific literature overwhelms health care professionals trying to keep up to date with all published studies on DDIs.
    Methods: This paper describes a hybrid linguistic approach to DDI extraction that combines shallow parsing and syntactic simplification with pattern matching. Appositions and coordinate structures are interpreted on the basis of the shallow syntactic parsing provided by the UMLS MetaMap Transfer tool (MMTx). Subsequently, complex and compound sentences are broken down into clauses, from which simple sentences are generated by a set of simplification rules. A pharmacist defined a set of domain-specific lexical patterns to capture the most common expressions of DDIs in texts. These lexical patterns are matched against the generated sentences in order to extract DDIs.
    Results: We have performed different experiments to analyze the performance of the different processes. The lexical patterns achieve reasonable precision (67.30%) but very low recall (14.07%). The inclusion of appositions and coordinate structures helps to improve recall (25.70%), although at lower precision (48.69%). The detection of clauses does not improve performance.
    Conclusions: Information Extraction (IE) techniques can provide an interesting way of reducing the time spent by health care professionals on reviewing the literature. Nevertheless, no previous approach had addressed the extraction of DDIs from texts. To the best of our knowledge, this work proposes the first integral solution for the automatic extraction of DDIs from biomedical texts.
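    To compare the two reported configurations on a single balanced measure, the standard F1 score (the harmonic mean of precision and recall, which the abstract itself does not report) can be computed from the figures above:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Figures reported above, expressed as fractions:
print(f"lexical patterns only:      F1 = {f1(0.6730, 0.1407):.3f}")  # 0.233
print(f"+ appositions/coordination: F1 = {f1(0.4869, 0.2570):.3f}")  # 0.336
```

    On this measure the apposition/coordination configuration comes out ahead despite its lower precision.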

    Anaphora resolution for Arabic machine translation: a case study of nafs

    PhD thesis. In the age of the internet, email, and social media there is an increasing need for processing online information, for example, to support education and business. This has led to the rapid development of natural language processing technologies such as computational linguistics, information retrieval, and data mining. As a branch of computational linguistics, anaphora resolution has attracted much interest. This is reflected in the large number of papers on the topic published in journals such as Computational Linguistics. Mitkov (2002) and Ji et al. (2005) have argued that the overall quality of anaphora resolution systems remains low, despite practical advances in the area, and that major challenges include dealing with real-world knowledge and accurate parsing. This thesis investigates the following research question: can an algorithm be found for the resolution of the anaphor nafs in Arabic text which is accurate to at least 90%, scales linearly with text size, and requires a minimum of knowledge resources? A resolution algorithm intended to satisfy these criteria is proposed. Testing on a corpus of contemporary Arabic shows that it does indeed satisfy the criteria. Funded by the Egyptian Government.
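    As an illustration only, a resolution procedure of the kind the research question describes, a single backward scan for the nearest agreeing antecedent (hence linear in text size and needing only minimal knowledge resources), might look as follows; the tokens, features, and agreement test are hypothetical and are not the algorithm proposed in the thesis.

```python
# Illustrative token representation; a real system would derive these
# features (POS, gender, number) from a morphological analyser.
def resolve(tokens: list[dict], anaphor_index: int) -> dict | None:
    """Return the nearest preceding noun that agrees with the anaphor
    in gender and number, or None; one backward scan, so linear time."""
    anaphor = tokens[anaphor_index]
    for candidate in reversed(tokens[:anaphor_index]):
        if (candidate["pos"] == "NOUN"
                and candidate["gender"] == anaphor["gender"]
                and candidate["number"] == anaphor["number"]):
            return candidate
    return None

# "al-mudarrisu ra'aa nafsahu" -- "the teacher saw himself"
tokens = [
    {"text": "al-mudarrisu", "pos": "NOUN", "gender": "m", "number": "sg"},
    {"text": "ra'aa",        "pos": "VERB", "gender": "m", "number": "sg"},
    {"text": "nafsahu",      "pos": "PRON", "gender": "m", "number": "sg"},
]
print(resolve(tokens, 2)["text"])  # -> al-mudarrisu
```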

    Human reasoning and cognitive science

    In the late summer of 1998, the authors, a cognitive scientist and a logician, started talking about the relevance of modern mathematical logic to the study of human reasoning, and we have been talking ever since. This book is an interim report of that conversation. It argues that results such as those on the Wason selection task, purportedly showing the irrelevance of formal logic to actual human reasoning, have been widely misinterpreted, mainly because the picture of logic current in psychology and cognitive science is completely mistaken. We aim to give the reader a more accurate picture of mathematical logic and, in doing so, hope to show that logic, properly conceived, is still a very helpful tool in cognitive science. The main thrust of the book is therefore constructive. We give a number of examples in which logical theorizing helps in understanding and modeling observed behavior in reasoning tasks, deviations of that behavior in a psychiatric disorder (autism), and even the roots of that behavior in the evolution of the brain.
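    For readers unfamiliar with the Wason selection task mentioned above, the sketch below spells out the standard material-conditional reading of the rule "if a card has a vowel on one side, it has an even number on the other". The card set and hidden-face options are the usual textbook presentation of the task, not an example taken from the book.

```python
def is_vowel(face: str) -> bool:
    return face in "AEIOU"

def is_even(face: str) -> bool:
    return face.isdigit() and int(face) % 2 == 0

def must_turn(visible: str, possible_hidden: list[str]) -> bool:
    """A card must be turned iff some hidden face could falsify the rule
    'if a vowel is on one side, an even number is on the other'."""
    for hidden in possible_hidden:
        letter, number = (visible, hidden) if visible.isalpha() else (hidden, visible)
        if is_vowel(letter) and not is_even(number):
            return True
    return False

cards = ["A", "K", "4", "7"]
hidden = {"A": ["2", "7"], "K": ["2", "7"], "4": ["A", "K"], "7": ["A", "K"]}
print([c for c in cards if must_turn(c, hidden[c])])  # -> ['A', '7']
```

    Under this reading only "A" and "7" can falsify the rule, whereas experimental subjects typically choose "A" and "4"; it is the interpretation of this gap that the book argues has been widely misread.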

    Incremental Coreference Resolution for German

    The main contributions of this thesis are as follows:
    1. We introduce a general model for coreference and explore its application to German.
      • The model features an incremental discourse processing algorithm which allows it to coherently address issues caused by underspecification of mentions, an especially pressing problem for certain German pronouns.
      • We introduce novel features relevant for the resolution of German pronouns. A subset of these features is made accessible through the incremental architecture of the discourse processing model.
      • In evaluation, we show that the coreference model combined with our features provides new state-of-the-art results for coreference and pronoun resolution for German.
    2. We elaborate on the evaluation of coreference and pronoun resolution.
      • We discuss evaluation from the viewpoint of prospective downstream applications that benefit from coreference resolution as a preprocessing component. Addressing the shortcomings of the general evaluation framework in this regard, we introduce an alternative framework, the Application Related Coreference Scores (ARCS).
      • The ARCS framework enables a thorough comparison of different system outputs and the quantification of their similarities and differences beyond the common coreference evaluation. We demonstrate how the framework is applied to state-of-the-art coreference systems. This provides a method to track specific differences in system outputs, which assists researchers in comparing their approaches to related work in detail.
    3. We explore semantics for pronoun resolution.
      • Within the introduced coreference model, we explore distributional approaches to estimating the compatibility of an antecedent candidate and the occurrence context of a pronoun. To this end, we compare a state-of-the-art word embedding approach with syntactic co-occurrence profiles.
      • In comparison to related work, we extend the notion of context and thereby increase the applicability of our approach. We find that a combination of both compatibility models, coupled with the coreference model, offers large potential for improving pronoun resolution performance.
    We make all our resources available, including a web demo of the system, at: http://pub.cl.uzh.ch/purl/coreference-resolutio
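    A minimal sketch of what an incremental architecture of this kind does: each new mention is attached to a compatible partial entity or opens a new one, as the text unfolds. The mentions and the agreement-based compatibility test below are illustrative stand-ins for the model's actual features, not the thesis's implementation.

```python
def compatible(mention: dict, entity: list[dict]) -> bool:
    """Toy agreement check against the entity's most recent mention;
    '?' marks an underspecified gender that matches anything."""
    last = entity[-1]
    return (mention["gender"] in (last["gender"], "?")
            and mention["number"] == last["number"])

def resolve_incrementally(mentions: list[dict]) -> list[list[dict]]:
    """Process mentions left to right: attach each one to a compatible
    entity, preferring recently opened entities, or start a new entity."""
    entities: list[list[dict]] = []
    for mention in mentions:
        for entity in reversed(entities):  # recency preference
            if compatible(mention, entity):
                entity.append(mention)
                break
        else:
            entities.append([mention])
    return entities

mentions = [
    {"text": "Peter",        "gender": "m", "number": "sg"},
    {"text": "die Lehrerin", "gender": "f", "number": "sg"},
    {"text": "er",           "gender": "m", "number": "sg"},
]
for entity in resolve_incrementally(mentions):
    print([m["text"] for m in entity])
# -> ['Peter', 'er']
# -> ['die Lehrerin']
```

    Attaching to partial chains as the text unfolds is what lets an underspecified pronoun inherit properties from an already established entity rather than being resolved in isolation.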

    A taxonomy of problems in Arabic-English translation: a systemic functional linguistics approach

    Philosophiae Doctor - PhD. Working with Arab students pursuing a degree in English Language and Translation at Taiz University, Republic of Yemen, has brought to the researcher's attention a number of errors or problems encountered in Arabic to English translation. This study aims to investigate the problems encountered by student translators (STs), novice translators (NTs) and more experienced translators (Ts) while translating from Arabic into English. The study starts from the assumption that Arabic and English belong to different language families and thus there is rarely word-for-word equivalence between the two. The present study is cross-sectional in nature and is based on empirical data collected from several categories of translators. Specifically, the data was collected from fourth-year students in the Department of English and Translation in the Faculty of Arts, Taiz University, as well as five NTs who previously graduated from this department and are currently working in a number of accredited translation offices in Taiz. The study also investigates the challenges faced by Ts. For this purpose, a novel, a tourist brochure, an editorial, and three academic abstracts, all translated by established publishing houses and translation centres in and outside Yemen, are examined. These texts are analyzed to determine to what extent the problems faced by STs and NTs recur in published translations produced by Ts. For its conceptual framework, the study adopts an eclectic approach that does not stick rigidly to a particular paradigm but rather draws upon multiple linguistic and translation theories. However, it is mainly based on Halliday's Systemic Functional Grammar (SFG), and the problems have been classified along his taxonomy of meaning metafunctions into ideational, interpersonal and textual. Extra-textual problems are also analyzed. Several SFG-based translation models, such as Hatim and Mason's (1990) sociometic model, House's (1977, 1997) translation quality assessment model, Hervey et al.'s (1992) register analysis model and Baker's (1990) equivalence model, are also employed in the study to help the researcher examine the problems encountered in Arabic-English translation within those four categories. In addition, Nord's functional model of translation, which is based on Skopos theory, is also taken into consideration, although to a minimal extent. In addition to the analysis of translations produced by various categories of translators, the study uses several triangulation research tools: questionnaires, Thinking Aloud Protocols (TAPs), retrospective interviews, and classroom observation. These tools are employed to assist the researcher in identifying the possible causes of the problems the STs, NTs, and Ts experience, from the perspective of the participants themselves. The current translation programme at Taiz University is also analyzed to determine to what extent it contributes to the poor performance of the student translators and would-be translators. The study concludes that STs, NTs and even Ts encounter several problems at the ideational, interpersonal and textual levels, as well as at the extra-textual stratum.
    The study attributes these problems to structural and cultural differences between the two languages, reliance on the dictionary rather than on the meaning in use of lexical items, differences between the cohesion and coherence systems of Arabic and English, neglect of the role of context in translation, and unfamiliarity with text typologies and genre conventions. In other words, participants follow a bottom-up approach in translation and stay close to the source text, translating it literally. This approach is very damaging because it ignores the fact that the three metafunctions might be realized differently in the two languages. Furthermore, the study concludes that the manner in which translation is taught at Taiz University, as well as the syllabus, contributes substantially to the lack of translation competence of the student translators and would-be translators. The programme is inadequate and needs urgent review and improvement. The present syllabus does not keep abreast of the latest theoretical and practical developments in the discipline of translation or in neighbouring disciplines such as contrastive linguistics, text analysis, discourse analysis, corpus linguistics and the like. As for methodology, the study concludes that it is the transmissionist (teacher-centred) teaching approach, rather than the transformational (learner-centred) one, that is commonly used in teaching translation. As a result, the read-and-translate approach dominates the scene and no tasks, activities, or projects are given to the STs. The study provides some recommendations which, if implemented, can help Yemeni and Arab universities improve the competence of student translators and thereby improve translation teaching at academic level. A major contribution of this study is the description and classification of translation problems in Arabic-English translation on the basis of meaning systems. Unlike traditional descriptive error analysis, which is widely used to analyze the translation product, SFG-based text analysis provides a systematic description of translation problems which allows a precise articulation of the nature of problems that would otherwise be explained simply as translations which "sound unnatural or awkward" (Kim 2008; Yallop 1999). As far as the researcher knows, no study in the Arab world has yet tackled translation problems from this perspective. Other studies have tackled deviant forms produced by students or translators using an error analysis technique rather than a holistic approach based on solid theoretical knowledge. In other words, while most other studies focus on specific 'errors' and error analysis and end at that, the present study does not only look at 'errors' as 'difference' (from contrastive analysis) but considers them from several perspectives. It is also more comprehensive in triangulating several sources of data and pooling them together for a more informed understanding.

    Intelligent Systems

    This book is dedicated to intelligent systems of broad-spectrum application, such as personal and social biosafety or the use of intelligent sensory micro-nanosystems such as the "e-nose", "e-tongue" and "e-eye". In addition, effective information acquisition, knowledge management and improved knowledge transfer in any medium, as well as the modeling of information content using meta- and hyper-heuristics and semantic reasoning, all benefit from the systems covered in this book. Intelligent systems can also be applied in education, for example in generating intelligent distributed eLearning architectures, and in a large number of technical fields, such as industrial design, manufacturing and utilization, e.g., in precision agriculture, cartography, electric power distribution systems, intelligent building management systems, and drilling operations. Furthermore, decision making using fuzzy logic models, computational recognition of comprehension uncertainty and the joint synthesis of goals and means of the intelligent behavior of biosystems, as well as diagnostic and human support in the healthcare environment, have also been made easier.

    Specialised Languages and Multimedia. Linguistic and Cross-cultural Issues

    This book collects academic works focusing on scientific and technical discourse and on the ways in which this type of discourse appears in or is shaped by multimedia products. The originality of this book lies in the variety of approaches used and of the specialised languages investigated in relation to multimodal and multimedia genres. Contributions focus in particular on: new multimodal or multimedia forms of specialised discourse (in institutional, academic, technical, scientific, social or popular settings); linguistic features of specialised discourse in multimodal or multimedia genres; the popularisation of specialised knowledge in multimodal or multimedia genres; the impact of multimodality and multimediality on the construction of scientific and technical discourse; the impact of multimodality/multimediality on the practice and teaching of language; the impact of multimodality/multimediality on the practice and teaching of translation; new multimedia modes of knowledge dissemination; and the translation/adaptation of scientific discourse in multimedia products. This volume contributes to the theory and practice of multimodal studies and translation, with a specific focus on specialised discourse.