165 research outputs found

    Applications of big knowledge summarization

    Advanced technologies have resulted in the generation of large amounts of data (Big Data). The Big Knowledge derived from Big Data can exceed human comprehension, which limits the effective and innovative use of Big Knowledge repositories. Biomedical ontologies, which play important roles in biomedical information systems, constitute one kind of Big Knowledge repository. Biomedical ontologies typically consist of domain knowledge assertions expressed by the semantic connections between tens of thousands of concepts. Without some high-level visual representation of Big Knowledge in biomedical ontologies, humans cannot grasp the big picture of those ontologies. Such Big Knowledge orientation is required for the proper maintenance of ontologies and their effective use. This dissertation addresses the Big Knowledge challenge of how to enable humans to use Big Knowledge correctly and effectively, referred to as the Big Knowledge to Use (BK2U) problem, with a focus on biomedical ontologies. In previous work, Abstraction Networks (AbNs) have been shown to be successful for the summarization, visualization and quality assurance (QA) of biomedical ontologies. Building on that research, this dissertation introduces new AbNs of various granularities for Big Knowledge summarization and extends the applications of AbNs. This dissertation consists of three main parts. The first part introduces two advanced AbNs. One is the weighted aggregate partial-area taxonomy, which has a parameter to flexibly control the summarization granularity. The other is the Ingredient Abstraction Network (IAbN) for the National Drug File - Reference Terminology (NDF-RT) Chemical Ingredients hierarchy. The previously developed AbNs for hierarchies with outgoing relationships are not applicable to this hierarchy, since it has no outgoing relationships. The second part describes applications of the two advanced AbNs. A study utilizing the weighted aggregate partial-area taxonomy for the identification of major topics in SNOMED CT's Specimen hierarchy is reported. A multi-layer interactive visualization system of required granularity for ontology comprehension, based on the weighted aggregate partial-area taxonomy, is demonstrated on the Neoplasm subhierarchy of the National Cancer Institute thesaurus (NCIt). The IAbN is applied for drug-drug interaction (DDI) discovery. The third part reports eight family-based QA studies on NCIt's Neoplasm, Gene, and Biological Process hierarchies, SNOMED CT's Infectious disease hierarchy, the Chemical Entities of Biological Interest ontology, and the Chemical Ingredients hierarchy in NDF-RT. There is no one-size-fits-all QA method and it is impossible to find a QA method for each individual ontology. Hence, family-based QA, in which one QA technique could be applicable to a whole family of structurally similar ontologies, is an effective approach. The results of these studies demonstrate that complex concepts and uncommonly modeled concepts are more likely to have errors. Furthermore, the three studies on overlapping concepts in partial-area taxonomies reported in this dissertation, combined with three previous studies, prove the success of overlapping concepts as a QA methodology for a whole family of 76 similar ontologies in BioPortal.

    Exploring the Development and Maintenance Practices in the Gene Ontology

    The Gene Ontology (GO) is one of the most widely used and successful bio-ontologies in biomedicine and molecular biology. What is special about GO as a knowledge organization (KO) system is its collaborative development and maintenance practices, which involve diverse communities in collectively developing the Ontology and controlling its quality. Guided by Activity Theory and a theoretical Information Quality Assessment Framework, this study conducts qualitative content analysis of GO’s curation discussions. The study found that GO has developed various tools and mechanisms to gain expert feedback and engage various communities in developing and maintaining the Ontology in an efficient and less expensive way. The findings of this study can inform KO system designers, curators, and ontologists in establishing functional requirements and quality assurance infrastructure for bio-ontologies and in formulating best practices for ontology development.

    Saliva Ontology: An ontology-based framework for a Salivaomics Knowledge Base

    Background: The Salivaomics Knowledge Base (SKB) is designed to serve as a computational infrastructure that can permit global exploration and utilization of data and information relevant to salivaomics. SKB is created by aligning (1) the saliva biomarker discovery and validation resources at UCLA with (2) the ontology resources developed by the OBO (Open Biomedical Ontologies) Foundry, including a new Saliva Ontology (SALO). Results: We define the Saliva Ontology (SALO; http://www.skb.ucla.edu/SALO/) as a consensus-based controlled vocabulary of terms and relations dedicated to the salivaomics domain and to saliva-related diagnostics following the principles of the OBO (Open Biomedical Ontologies) Foundry. Conclusions: The Saliva Ontology is an ongoing exploratory initiative. The ontology will be used to facilitate salivaomics data retrieval and integration across multiple fields of research together with data analysis and data mining. The ontology will be tested through its ability to serve the annotation ('tagging') of a representative corpus of salivaomics research literature that is to be incorporated into the SKB.

    A Framework for Tracing the Flavouring Information to Accelerate Halal Certification

    The halal industry is a new sector in the manufacturing industry in Malaysia and is a fast-growing global business. In Malaysia, JAKIM is the body responsible for approving halal certification. However, the process of issuing the halal certificate is time consuming. Motivated by this delay, this study conducted a case study to examine issues in halal certification. The reason for the delay is the difficulty of determining the halal status of flavouring when its halal certificate is absent at the time auditors process the documentation for a certification application. In addition, the inconsistent use of terms among food producers and auditors makes it difficult to trace the halal status of flavouring. The case study also found that there is no framework that can help to trace the halal status of a flavouring ingredient systematically. Thus, the study contributes a framework for tracing flavouring information to accelerate halal certification.

    Identification of OBO nonalignments and its implications for OBO enrichment

    Motivation: Existing projects that focus on the semiautomatic addition of links between existing terms in the Open Biomedical Ontologies can take advantage of reasoners that can make new inferences between terms that are based on the added formal definitions and that reflect nonalignments between the linked terms. However, these projects require that these definitions be necessary and sufficient, a strong requirement that often does not hold. If such definitions cannot be added, the reasoners cannot point to the nonalignments through the suggestion of new inferences

    Chemical information matters: an e-Research perspective on information and data sharing in the chemical sciences

    Recently, a number of organisations have called for open access to scientific information and especially to the data obtained from publicly funded research, among which the Royal Society report and the European Commission press release are particularly notable. It has long been accepted that building research on the foundations laid by other scientists is both effective and efficient. Regrettably, some disciplines, chemistry being one, have been slow to recognise the value of sharing and have thus been reluctant to curate their data and information in preparation for exchanging it. The very significant increases in both the volume and the complexity of the datasets produced have encouraged the expansion of e-Research, and stimulated the development of methodologies for managing, organising, and analysing "big data". We review the evolution of cheminformatics, the amalgam of chemistry, computer science, and information technology, and assess the wider e-Science and e-Research perspective. Chemical information does matter, as do matters of communicating data and collaborating with data. For chemistry, unique identifiers, structure representations, and property descriptors are essential to the activities of sharing and exchange. Open science entails the sharing of more than mere facts: for example, the publication of negative outcomes can facilitate better understanding of which synthetic routes to choose, an aspiration of the Dial-a-Molecule Grand Challenge. The protagonists of open notebook science go even further and exchange their thoughts and plans. We consider the concepts of preservation, curation, provenance, discovery, and access in the context of the research lifecycle, and then focus on the role of metadata, particularly the ontologies on which the emerging chemical Semantic Web will depend. Among our conclusions, we present our choice of the "grand challenges" for the preservation and sharing of chemical information.

    Investigation into the use of NMR-based bioinformatics in determining the composition and quality of immune supplements in Australia

    The outbreak of the SARS-CoV-2 virus has brought prominence to the concept of immune health for individuals. A common means of attempting to support immune health is incorporating immune supplements into everyday life. While immune supplements generally contain well-documented traditional herbs, knowledge about the quality and safety of these commercial products is minimal. In Australia, the Therapeutic Goods Administration (TGA) regulates and enforces advertising, labelling and compositional consistency of immune supplements; however, minimal pre-market assessment overlooks the potential harm and adulteration regularly cited in the literature. A multifaceted approach to these products’ overall safety and quality is essential in safeguarding human health. Following TGA guidelines, seventeen immune supplements were investigated for their labelling compliance with the Therapeutic Goods Order No. 92 for non-prescription medicines. Although systemic labelling non-compliance was observed throughout the products, this was not associated with their potential to cause harm. Thus, stringency in this area is not necessarily applicable to protecting consumers. More focus should be put on high throughput pharmacovigilance methods that examine immune supplements' compositional integrity and consistency. For this study, the composition of immune supplements was analysed via nuclear magnetic resonance (NMR) spectroscopy using metabolomics. NMR provides detailed ‘snapshots’ of the chemical profile of immune supplements that can be interpreted via multivariate statistics to indicate the consistency of products across numerous batches. Therefore, this thesis aims to provide an overview of the quality and safety of Australian immune supplements. At the same time, it recognises the place of metabolomics in regulatory environments as a high throughput mechanism of quality assurance.

    Outlier concepts auditing methodology for a large family of biomedical ontologies

    Background: Summarization networks are compact summaries of ontologies. The “Big Picture” view offered by summarization networks makes it possible to identify sets of concepts that are more likely to have errors than control concepts. For ontologies that have outgoing lateral relationships, we have developed the partial-area taxonomy summarization network. Prior research has identified one kind of outlier concepts: concepts of small partial-areas within partial-area taxonomies. Previously we have shown that the small partial-area technique works successfully for four ontologies (or their hierarchies). Methods: To improve the Quality Assurance (QA) scalability, a family-based QA framework, where one QA technique is potentially applicable to a whole family of ontologies with similar structural features, was developed. The 373 ontologies hosted at the NCBO BioPortal in 2015 were classified into a collection of families based on structural features. A meta-ontology represents this family collection, including one family of ontologies having outgoing lateral relationships. The process of updating the current meta-ontology is described. To conclude that one QA technique is applicable for at least half of the members of a family F, this technique should be demonstrated as successful for six out of six ontologies in F. We describe a hypothesis setting the condition required for a technique to be successful for a given ontology. The process of a study to demonstrate such success is described. This paper intends to prove the scalability of the small partial-area technique. Results: We first updated the meta-ontology classifying 566 BioPortal ontologies. There were 371 ontologies in the family with outgoing lateral relationships. We demonstrated the success of the small partial-area technique for two ontology hierarchies which belong to this family, SNOMED CT’s Specimen hierarchy and NCIt’s Gene hierarchy. Together with the four previous ontologies from the same family, we fulfilled the “six out of six” condition required to show the scalability for the whole family. Conclusions: We have shown that the small partial-area technique can be potentially successful for the family of ontologies with outgoing lateral relationships in BioPortal, thus improving the scalability of this QA technique.