256 research outputs found

    Applications of big knowledge summarization

    Advanced technologies have resulted in the generation of large amounts of data (Big Data). The Big Knowledge derived from Big Data could be beyond humans' ability of comprehension, which will limit the effective and innovative use of Big Knowledge repositories. Biomedical ontologies, which play important roles in biomedical information systems, constitute one kind of Big Knowledge repository. Biomedical ontologies typically consist of domain knowledge assertions expressed by the semantic connections between tens of thousands of concepts. Without some high-level visual representation of Big Knowledge in biomedical ontologies, humans cannot grasp the big picture of those ontologies. Such Big Knowledge orientation is required for the proper maintenance of ontologies and their effective use. This dissertation addresses the Big Knowledge challenge - how to enable humans to use Big Knowledge correctly and effectively (referred to as the Big Knowledge to Use (BK2U) problem) - with a focus on biomedical ontologies. In previous work, Abstraction Networks (AbNs) have been demonstrated to be successful for the summarization, visualization and quality assurance (QA) of biomedical ontologies. Based on the previous research, this dissertation introduces new AbNs of various granularities for Big Knowledge summarization and extends the applications of AbNs. This dissertation consists of three main parts. The first part introduces two advanced AbNs. One is the weighted aggregate partial-area taxonomy with a parameter to flexibly control the summarization granularity. The second is the Ingredient Abstraction Network (IAbN) for the National Drug File - Reference Terminology (NDF-RT) Chemical Ingredients hierarchy. The previously developed AbNs, designed for hierarchies with outgoing relationships, are not applicable to this hierarchy because NDF-RT's Chemical Ingredients hierarchy has no outgoing relationships. The second part describes applications of the two advanced AbNs. 
A study utilizing the weighted aggregate partial-area taxonomy for the identification of major topics in SNOMED CT's Specimen hierarchy is reported. A multi-layer interactive visualization system of required granularity for ontology comprehension, based on the weighted aggregate partial-area taxonomy, is demonstrated on the Neoplasm subhierarchy of the National Cancer Institute thesaurus (NCIt). The IAbN is applied for drug-drug interaction (DDI) discovery. The third part reports eight family-based QA studies on NCIt's Neoplasm, Gene, and Biological Process hierarchies, SNOMED CT's Infectious disease hierarchy, the Chemical Entities of Biological Interest ontology, and the Chemical Ingredients hierarchy in NDF-RT. There is no one-size-fits-all QA method, and it is impossible to find a QA method for each individual ontology. Hence, family-based QA is an effective way, i.e., one QA technique could be applicable to a whole family of structurally similar ontologies. The results of these studies demonstrate that complex concepts and uncommonly modeled concepts are more likely to have errors. Furthermore, the three studies on overlapping concepts in partial-area taxonomies reported in this dissertation, combined with three previous studies, prove the success of overlapping concepts as a QA methodology for a whole family of 76 similar ontologies in BioPortal

    Ontology Enrichment from Free-text Clinical Documents: A Comparison of Alternative Approaches

    While the biomedical informatics community widely acknowledges the utility of domain ontologies, there remain many barriers to their effective use. One important requirement of domain ontologies is that they achieve a high degree of coverage of the domain concepts and concept relationships. However, the development of these ontologies is typically a manual, time-consuming, and often error-prone process. Limited resources result in missing concepts and relationships, as well as difficulty in updating the ontology as domain knowledge changes. Methodologies developed in the fields of Natural Language Processing (NLP), Information Extraction (IE), Information Retrieval (IR), and Machine Learning (ML) provide techniques for automating the enrichment of ontology from free-text documents. In this dissertation, I extended these methodologies into biomedical ontology development. First, I reviewed existing methodologies and systems developed in the fields of NLP, IR, and IE, and discussed how existing methods can benefit the development of biomedical ontologies. This previously unconducted review was published in the Journal of Biomedical Informatics. Second, I compared the effectiveness of three methods from two different approaches, the symbolic (the Hearst method) and the statistical (the Church and Lin methods), using clinical free-text documents. Third, I developed a methodological framework for Ontology Learning (OL) evaluation and comparison. This framework permits evaluation of the two types of OL approaches that include three OL methods. The significance of this work is as follows: 1) The results from the comparative study showed the potential of these methods for biomedical ontology enrichment. For the two targeted domains (NCIT and RadLex), the Hearst method revealed an average of 21% and 11% new concept acceptance rates, respectively. The Lin method produced a 74% acceptance rate for NCIT; the Church method, 53%. 
As a result of this study (published in the Journal of Methods of Information in Medicine), many suggested candidates have been incorporated into the NCIT; 2) The evaluation framework is flexible and general enough that it can analyze the performance of ontology enrichment methods for many domains, thus expediting the process of automation and minimizing the likelihood that key concepts and relationships would be missed as domain knowledge evolves
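
The symbolic approach named in this abstract, the Hearst method, proposes hypernym relations from lexical patterns such as "X such as A, B and C". The following is a minimal sketch of that single pattern only, not the dissertation's implementation: the function name `hearst_such_as`, the regular expressions, and the sample sentence are all illustrative assumptions, and the hypernym is naively taken to be the single word before "such as".

```python
import re

# Hypothetical, simplified matcher for one Hearst pattern: "<hypernym> such as <list>".
# The hypernym is approximated by the single word preceding "such as".
PATTERN = re.compile(r"(\w+),? such as ([^.;]+)")
# List items are separated by commas and/or the conjunctions "and" / "or".
SPLIT = re.compile(r",\s*(?:and|or)\s+|\s+(?:and|or)\s+|,\s*")


def hearst_such_as(text):
    """Return (hyponym, hypernym) candidate pairs found in free text."""
    pairs = []
    for match in PATTERN.finditer(text.lower()):
        hypernym = match.group(1)
        for item in SPLIT.split(match.group(2)):
            item = item.strip()
            if item:
                pairs.append((item, hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "The patient had tumors such as glioma, meningioma and lymphoma."
    for hyponym, hypernym in hearst_such_as(sentence):
        print(f"{hyponym} IS-A {hypernym}")
```

Candidates produced this way would still need expert review before being added to an ontology, which is what the acceptance rates reported in the abstract measure.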

    Ontological representation of tumor-node-metastasis classification and an ontology-driven classifier: a study on colorectal cancer

    Integrated master's dissertation in Biomedical Engineering (specialization in Medical Informatics). The most important staging system for cancer is the TNM Classification of Malignant Tumors (TNM). The staging procedure compiles several clinical and pathological parameters based on the Extent of Disease (EOD). The objectives of this work are to present the Tumor-Nodes-Metastasis Ontology (TNM-O), a framework for the representation of the TNM system; to implement the TNM Colon and Rectum ontology, a modular ontology that represents the TNM classification for colorectal tumors based on this framework; to develop an ontology-driven classifier application with the TNM-O as its knowledge base; and to show the feasibility of this approach on real data. TNM-O and the TNM Colon and Rectum Ontology (TNMCRO) use the Foundational Model of Anatomy (FMA) for representing anatomical entities and BioTopLite2 (BTL2) as a domain top-level ontology. The classification rules of the TNM classification for colorectal tumors were represented as described in the literature. The automatic classifier for pathological data uses these ontologies as its knowledge base. It was developed in Java, using the OWL API (the application programming interface for the Web Ontology Language) as the bridge between the application level and the knowledge base. In this study, two datasets with real data were evaluated. The first dataset contained 382 entries that were classified by the regional lymph nodes. This study compared automatic classification with the expert one and obtained an accuracy of 55%. However, the classifier flagged inconsistencies and errors made during the manual tumor documentation that caused the misclassification. The second dataset contained 292 records carefully classified by a pathologist. In this dataset, automatic classification was optimal for all types of assessment. 
Therefore, this study proved that an ontology-driven automatic classifier enhances the consistency in tumor documentation and provides accurate instance classification during pathological assessment of tumors. The most widely accepted system worldwide for classifying malignant tumors is the Tumor-Node-Metastasis Classification of Malignant Tumors (TNM). The classification procedure comprises several pathological parameters based on the Extent of Disease (EOD). The objectives of this work are to present the TNM-O ontology, a tool used to represent the TNM classification system; to implement the Colon and Rectum ontology (TNM-CR), a modular ontology that represents the TNM classification rules for cancers of the colon and rectum; to develop an application whose knowledge base is the TNM-O ontology; and to test the feasibility of this approach with real data. The TNM ontology represents all the definitions and rules present in the TNM classification. This ontology is the central point of a system built on a modular architecture. Each module consists of an ontology that represents the classification rules for a particular tumor. These ontologies can be imported into the central ontology, and all of them use the Foundational Model of Anatomy (FMA) to represent anatomical concepts and BioTopLite 2 as the domain ontology. The classification application uses the TNM ontology as its knowledge base. It was programmed in Java using the OWL API as the bridge between the application and the knowledge base. In this study, two datasets with real data were evaluated. The first contained 382 records that were classified by the regional lymph nodes. Comparing the automatic classification with the manual one yielded an accuracy of 55%. However, the application pointed out inconsistencies and errors made in the tumor documentation that caused this result. 
The second dataset consisted of 292 records produced and manually classified by a pathologist from text documents. The automatic classification showed optimal results for all types of classification. This study showed that the developed application improves the consistency and efficiency of the data in tumor documentation and provides accurate automatic classification during the tumor diagnosis process
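
The evaluation described in this abstract compares automatic classifications against expert ones and reports an accuracy, while also surfacing the records where the two disagree. A minimal sketch of that comparison might look like the following; it is not the thesis's Java/OWL API classifier, and the function name `compare_classifications`, the record identifiers, and the labels are hypothetical placeholders.

```python
def compare_classifications(automatic, expert):
    """Compare two label mappings keyed by record id.

    Returns (accuracy, disagreements), where disagreements lists
    (record_id, automatic_label, expert_label) for every mismatch.
    """
    shared = sorted(set(automatic) & set(expert))
    if not shared:
        raise ValueError("no records in common")
    disagreements = [
        (rid, automatic[rid], expert[rid])
        for rid in shared
        if automatic[rid] != expert[rid]
    ]
    accuracy = 1 - len(disagreements) / len(shared)
    return accuracy, disagreements


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study's datasets.
    automatic = {"r1": "N0", "r2": "N1", "r3": "N2", "r4": "N0"}
    expert = {"r1": "N0", "r2": "N2", "r3": "N2", "r4": "N1"}
    accuracy, disagreements = compare_classifications(automatic, expert)
    print(f"accuracy: {accuracy:.0%}")
    for rid, auto_label, expert_label in disagreements:
        print(f"{rid}: automatic={auto_label}, expert={expert_label}")
```

Listing the mismatches rather than only the score reflects the abstract's point that a low agreement rate can expose documentation errors instead of classifier errors.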

    Medical Informatics

    Information technology has been revolutionizing the everyday life of the common man, while medical science has been making rapid strides in understanding disease mechanisms, developing diagnostic techniques and effecting successful treatment regimens, even for those cases which would have been classified as having a poor prognosis a decade earlier. The confluence of information technology and biomedicine has brought into its ambit additional dimensions of computerized databases for patient conditions, revolutionizing the way health care and patient information is recorded, processed, interpreted and utilized for improving the quality of life. This book consists of seven chapters dealing with three primary issues: medical information acquisition from a patient's and health care professional's perspective, translational approaches from a researcher's point of view, and finally the application potential as required by clinicians and physicians. The book covers modern issues in Information Technology, Bioinformatics Methods and Clinical Applications. The chapters describe the basic process of acquisition of information in a health system, recent technological developments in biomedicine and the realistic evaluation of medical informatics

    Assembling models of embryo development: Image analysis and the construction of digital atlases

    Digital atlases of animal development provide a quantitative description of morphogenesis, opening the path toward process modeling. Prototypic atlases offer a data integration framework in which to gather information from cohorts of individuals with phenotypic variability. Relevant information for further theoretical reconstruction includes measurements in time and space for cell behaviors and gene expression. The latter, as well as data integration in a prototypic model, relies on image processing strategies. Developing the tools to integrate and analyze biological multidimensional data is highly relevant for assessing chemical toxicity or performing preclinical drug testing. This article surveys some of the most prominent efforts to assemble these prototypes, categorizes them according to salient criteria and discusses the key questions in the field and the future challenges toward the reconstruction of multiscale dynamics in model organisms

    Connecting GOMMA with STROMA: an approach for semantic ontology mapping in the biomedical domain

    This thesis establishes a connection between GOMMA and STROMA, both of which are ontology-processing tools. Consequently, a new workflow for denoting a set of correspondences with five semantic relation types has been implemented. Such a rich denotation is scarcely discussed within the literature. The evaluation of the denotation shows that trivial correspondences are easy to recognize (tF > 90). The challenge is the denotation of non-trivial types (30 < ntF < 70). A prerequisite of the implemented workflow is the extraction of semantic relations between concepts. These relations represent additional background knowledge for the enrichment tool STROMA and are integrated into the repository SemRep, which is accessed by this tool. Thus, STROMA is able to calculate a semantic type more precisely. UMLS was chosen as a biomedical knowledge source because it subsumes many different ontologies of this domain and thus represents a rich resource. Nevertheless, only a small set of relations met the requirements that are imposed on SemRep relations. Further studies may analyze whether there is an appropriate way to integrate the missing relations as well. The connection of GOMMA with STROMA allows the semantic enrichment of a biomedical mapping. As a consequence, this thesis sheds light on two subjects of research. First, STROMA had been tested with general ontologies, which model common-sense knowledge. Within this thesis, STROMA was applied to domain ontologies. Studies have shown that, overall, STROMA was able to treat such ontologies as well. However, some strategies for the enrichment process are based on assumptions that are misleading in the biomedical domain. Consequently, further strategies are suggested in this thesis which might improve the type denotation. These strategies may lead to an optimization of STROMA for biomedical data sets. A more thorough analysis will review their scope, also beyond the biomedical domain. 
Second, the established connection may lead to deeper investigations about advantages of semantic enrichment in the biomedical domain as an enriched mapping is returned. Despite heterogeneity of source and target ontology, such a mapping results in an improved interoperability at a finer level of granularity. The utilization of semantically rich correspondences in the biomedical domain is a worthwhile focus for future research

    Lipoprotein ontology: a formal representation of Lipoproteins

    Lipoproteins serve as a mode of transport for the uptake, storage and metabolism of lipids. Dysregulation in lipoprotein metabolism, known as dyslipidaemia, is strongly correlated with various diseases such as cardiovascular disease. Lipoprotein Ontology provides a formal representation of lipoprotein concepts and relationships that can be used to support the intelligent retrieval of information, facilitate collaboration between research groups, and provide the basis for the development of tools for the diagnosis and treatment of dyslipidaemia

    Assessment of brain cancer atlas maps with multimodal imaging features.

    BACKGROUND: Glioblastoma Multiforme (GBM) is a fast-growing and highly aggressive brain tumor that invades the nearby brain tissue and presents secondary nodular lesions across the whole brain but generally does not spread to distant organs. Without treatment, GBM can result in death in about 6 months. The challenges are known to depend on multiple factors: brain localization, resistance to conventional therapy, disrupted tumor blood supply inhibiting effective drug delivery, complications from peritumoral edema, intracranial hypertension, seizures, and neurotoxicity. MAIN TEXT: Imaging techniques are routinely used to obtain accurate detections of lesions that localize brain tumors. Magnetic resonance imaging (MRI) in particular delivers multimodal images both before and after the administration of contrast, which displays enhancement and describes physiological features such as hemodynamic processes. This review considers one possible extension of the use of radiomics in GBM studies, one that recalibrates the analysis of targeted segmentations to the whole organ scale. After identifying critical areas of research, the focus is on illustrating the potential utility of an integrated approach with multimodal imaging, radiomic data processing and brain atlases as the main components. The templates associated with the outcome of straightforward analyses represent promising inference tools able to inform spatio-temporally on GBM evolution while also being generalizable to other cancers. CONCLUSIONS: The focus on novel inference strategies applicable to complex cancer systems and based on building radiomic models from multimodal imaging data can be well supported by machine learning and other computational tools potentially able to translate suitably processed information into more accurate patient stratifications and evaluations of treatment efficacy

    Information Systems and Healthcare XXXIV: Clinical Knowledge Management Systems—Literature Review and Research Issues for Information Systems

    Knowledge Management (KM) has emerged as a possible solution to many of the challenges facing U.S. and international healthcare systems. These challenges include concerns regarding the safety and quality of patient care, critical inefficiency, disparate technologies and information standards, rapidly rising costs and clinical information overload. In this paper, we focus on clinical knowledge management systems (CKMS) research. The objectives of the paper are to evaluate the current state of knowledge management systems diffusion in the clinical setting, assess the present status and focus of CKMS research efforts, and identify research gaps and opportunities for future work across the medical informatics and information systems disciplines. The study analyzes the literature along two dimensions: (1) the knowledge management processes of creation, capture, transfer, and application, and (2) the clinical processes of diagnosis, treatment, monitoring and prognosis. The study reveals that the vast majority of CKMS research has been conducted by the medical and health informatics communities. Information systems (IS) researchers have played a limited role in past CKMS research. Overall, the results indicate that there is considerable potential for IS researchers to contribute their expertise to the improvement of clinical process through technology-based KM approaches