97 research outputs found

    GOChase-II: correcting semantic inconsistencies from Gene Ontology-based annotations for gene products

    Background The Gene Ontology (GO) provides a controlled vocabulary for describing genes and gene products. Despite the undoubted importance of GO, several drawbacks associated with GO and GO-based annotations have been reported. We identified three types of semantic inconsistencies in GO-based annotations: semantically redundant, biological-domain inconsistent, and taxonomy inconsistent annotations. Methods To determine the semantic inconsistencies in GO annotations, we used the hierarchical structure of the GO graph and the tree structure of the NCBI taxonomy. Twenty-seven biological databases were collected to find semantically inconsistent annotations. Results The distributions and possible causes of the semantic inconsistencies were investigated using the twenty-seven biological databases with GO-based annotations. We found that some annotation evidence codes were associated with the inconsistencies. The numbers of gene products and species in a database, which are related to the complexity of database management, also correlate with the inconsistencies. Consequently, numerous annotation errors arise and are propagated throughout biological databases and GO-based high-level analyses. GOChase-II was developed to detect and correct both syntactic and semantic errors in GO-based annotations. Conclusions We identified some inconsistencies in GO-based annotations and provide software, GOChase-II, for correcting these semantic inconsistencies in addition to the syntactic errors previously corrected by GOChase-I
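
    One way to picture a "semantically redundant" annotation is a gene product annotated both with a term and with one of that term's ancestors in the ontology graph. The sketch below is only an illustration under that assumption; it is not the GOChase-II implementation, and the term identifiers and the `TOY_PARENTS` graph are hypothetical.

```python
from typing import Dict, Set


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every proper ancestor of `term`, given child -> parents edges."""
    seen: Set[str] = set()
    stack = list(parents.get(term, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def redundant_annotations(terms: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Return annotated terms that are already implied by a more specific annotated term."""
    implied: Set[str] = set()
    for term in terms:
        implied |= ancestors(term, parents)
    return terms & implied


# Hypothetical toy graph (not real GO identifiers): leaf -> mid -> root, other -> root.
TOY_PARENTS: Dict[str, Set[str]] = {
    "T:leaf": {"T:mid"},
    "T:mid": {"T:root"},
    "T:other": {"T:root"},
}

if __name__ == "__main__":
    annotated = {"T:leaf", "T:mid", "T:other"}
    # "T:mid" is an ancestor of "T:leaf", so annotating both is redundant.
    print(sorted(redundant_annotations(annotated, TOY_PARENTS)))
```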

    Clinical MetaData ontology: a simple classification scheme for data elements of clinical data based on semantics

    Background The increasing use of common data elements (CDEs) in numerous research projects and clinical applications has made it imperative to create an effective classification scheme for the efficient management of these data elements. We applied high-level integrative modeling of entire clinical documents from real-world practice to create the Clinical MetaData Ontology (CMDO) for the appropriate classification and integration of CDEs that are in practical use in current clinical documents. Methods CMDO was developed using the General Formal Ontology method with a manual iterative process comprising five steps: (1) defining the scope of CMDO by conceptualizing its first-level terms based on an analysis of clinical-practice procedures, (2) identifying CMDO concepts for representing clinical data of general CDEs by examining how and what clinical data are generated with flows of clinical care practices, (3) assigning hierarchical relationships for CMDO concepts, (4) developing CMDO properties (e.g., synonyms, preferred terms, and definitions) for each CMDO concept, and (5) evaluating the utility of CMDO. Results We created CMDO comprising 189 concepts under the 4 first-level classes of Description, Event, Finding, and Procedure. CMDO has 256 definitions that cover the 189 CMDO concepts, with 459 synonyms for 139 (74.0%) of the concepts. All of the CDEs extracted from 6 HL7 templates, 25 clinical documents of 5 teaching hospitals, and 1 personal health record specification were successfully annotated by 41 (21.9%), 89 (47.6%), and 13 (7.0%) of the CMDO concepts, respectively. We created a CMDO Browser to facilitate navigation of the CMDO concept hierarchy and a CMDO-enabled CDE Browser for displaying the relationships between CMDO concepts and the CDEs extracted from the clinical documents that are used in current practice. Conclusions CMDO is an ontology and classification scheme for CDEs used in clinical documents. 
    Given the increasing use of CDEs in many studies and real-world clinical documentation, CMDO will be a useful tool for integrating numerous CDEs from different research projects and clinical documents. The CMDO Browser and CMDO-enabled CDE Browser make it easy to search, share, and reuse CDEs, and to integrate and manage CDEs from different studies and clinical documents effectively. This research was supported by a grant from the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI18C2386). KHIDI did not participate in the study design, data collection and analysis, or writing of the manuscript

    A comparison of food and nutrient intake between instant noodle consumers and non-instant noodle consumers in Korean adults

    Instant noodles are widely consumed in Asian countries. The Korean population consumed the largest quantity of instant noodles in the world in 2008. However, few studies have investigated the relationship between instant noodles and nutritional status in Koreans. The objective of this study was to examine the association between instant noodle consumption and food and nutrient intake in Korean adults. We used dietary data of 6,440 subjects aged 20 years and older who participated in the Korean National Health and Nutrition Examination Survey III. The average age of the instant noodle consumers (INC) was 36.2 and that of the non-instant noodle consumers (non-INC) was 44.9; men consumed more instant noodles than women (P < 0.001). With the exception of cereals and grain products, legumes, seaweeds, eggs, and milk and dairy products, INC consumed significantly fewer potatoes and starches, sugars, seeds and nuts, vegetables, mushrooms, fruits, seasonings, beverages, meats, fishes, and oils and fats compared with those in the non-INC group. The INC group showed significantly higher nutrient intake of energy, fat, sodium, thiamine, and riboflavin; however, the INC group showed a significantly lower intake of protein, calcium, phosphorus, iron, potassium, vitamin A, niacin, and vitamin C compared with those in the non-INC group. This study revealed that consuming instant noodles may lead to excessive intake of energy, fats, and sodium but may also cause increased intake of thiamine and riboflavin. Therefore, nutritional education helping adults to choose a balanced meal while consuming instant noodles should be implemented. Additionally, instant noodle manufacturers should consider nutritional aspects when developing new products

    Development and Verification of Time-Series Deep Learning for Drug-Induced Liver Injury Detection in Patients Taking Angiotensin II Receptor Blockers: A Multicenter Distributed Research Network Approach

    Objectives The objective of this study was to develop and validate a multicenter-based, multi-model, time-series deep learning model for predicting drug-induced liver injury (DILI) in patients taking angiotensin receptor blockers (ARBs). The study leveraged a national-level multicenter approach, utilizing electronic health records (EHRs) from six hospitals in Korea. Methods A retrospective cohort analysis was conducted using EHRs from six hospitals in Korea, comprising a total of 10,852 patients whose data were converted to the Common Data Model. The study assessed the incidence rate of DILI among patients taking ARBs and compared it to a control group. Temporal patterns of important variables were analyzed using an interpretable time-series model. Results The overall incidence rate of DILI among patients taking ARBs was found to be 1.09%. The incidence rates varied for each specific ARB drug and institution, with valsartan having the highest rate (1.24%) and olmesartan having the lowest rate (0.83%). The DILI prediction models showed varying performance, measured by the average area under the receiver operating characteristic curve, with telmisartan (0.93), losartan (0.92), and irbesartan (0.90) exhibiting higher classification performance. The aggregated attention scores from the models highlighted the importance of variables such as hematocrit, albumin, prothrombin time, and lymphocytes in predicting DILI. Conclusions Implementing a multicenter-based time-series classification model provided evidence that could be valuable to clinicians regarding temporal patterns associated with DILI in ARB users. This information supports informed decisions regarding appropriate drug use and treatment strategies
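
    The area under the receiver operating characteristic curve reported above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it that way by pairwise comparison; the labels and scores are made-up toy values, not data from the study, and the function name `auroc` is an illustrative choice.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy values for illustration only: 1 = event, 0 = no event.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(auroc(toy_labels, toy_scores))
```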

    Stratifying non-small cell lung cancer patients using an inverse of the treatment decision rules: validation using electronic health records with application to an administrative database

    To validate a stratification method using an inverse of treatment decision rules that can classify non-small cell lung cancer (NSCLC) patients in real-world treatment records. (1) To validate the index classifier against the TNM 7th edition, we analyzed electronic health records of NSCLC patients diagnosed from 2011 to 2015 in a tertiary referral hospital in Seoul, Korea. Predictive accuracy, stage-specific sensitivity, specificity, positive predictive value, negative predictive value, F1 score, and c-statistic were measured. (2) To apply the index classifier in an administrative database, we analyzed NSCLC patients in the Korean National Health Insurance Database, 2002–2013. Differential survival rates among the classes were examined with the log-rank test, and class-specific survival rates were compared with the reference survival rates. (1) In the validation study (N = 1375), the overall accuracy was 93.8% (95% CI: 92.5–95.0%). The stage-specific c-statistic was highest for stage I (0.97, 95% CI: 0.96–0.98) and lowest for stage III (0.82, 95% CI: 0.77–0.87). (2) In the application study (N = 71,593), the index classifier tended to differentiate survival probabilities among classes. Compared to the reference TNM survival rates, the index classification underestimated the survival probability for stages IA, IIIB, and IV, and overestimated it for stages IIA and IIB. The inverse of the treatment decision rules has the potential to supplement a routinely collected database with information encoded in the treatment decision rules to classify NSCLC patients. It requires further validation and replication in multiple clinical settings
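
    The stage-specific measures listed above (sensitivity, specificity, positive and negative predictive value, F1 score) can each be derived from a one-vs-rest confusion table for a given stage. The sketch below shows that arithmetic under the assumption that reference and predicted stages are available as paired labels; the function name `stage_metrics` and the toy labels are hypothetical and not taken from the study.

```python
from typing import Dict, Sequence


def stage_metrics(truth: Sequence[str], predicted: Sequence[str], stage: str) -> Dict[str, float]:
    """One-vs-rest metrics for `stage`, comparing reference labels with predicted labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == stage and p == stage for t, p in pairs)
    fp = sum(t != stage and p == stage for t, p in pairs)
    fn = sum(t == stage and p != stage for t, p in pairs)
    tn = sum(t != stage and p != stage for t, p in pairs)

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN rather than 0.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


if __name__ == "__main__":
    # Toy labels for illustration only.
    reference = ["I", "I", "II", "III", "I", "II"]
    classified = ["I", "II", "II", "III", "I", "I"]
    for name, value in stage_metrics(reference, classified, "I").items():
        print(f"{name}: {value:.3f}")
```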

    High extinction ratio D-shaped fiber polarizers coated by a double graphene/PMMA stack

    We demonstrate theoretically and experimentally a high extinction ratio and compact size TE-pass polarizer made by a D-shaped fiber coated with a double graphene/PMMA stack. The light propagating in the core of the fiber can be efficiently coupled into the graphene sheet thanks to the giant enhancement of the modal evanescent field associated with the high refractive index graphene/PMMA cladding. The strong interaction between the light and graphene produces a large attenuation difference between modes with orthogonal polarizations, resulting in an improved extinction ratio and a reduced insertion loss due to the device compactness. A double graphene/PMMA stack coated polarizer with an extinction ratio of up to 36 dB and an insertion loss of 5 dB has been achieved when the device length is only 2.5 mm. The double graphene/PMMA stack has proved to be significantly better than single graphene/PMMA stack and bilayer graphene/PMMA structures, providing a polarizer with maximum extinction ratio of 44 dB for a length of 4 mm. The achieved results indicate that the proposed high extinction ratio polarizer is a promising candidate for novel in-fiber graphene-based devices

    A method to decipher pleiotropy by detecting underlying heterogeneity driven by hidden subgroups applied to autoimmune and neuropsychiatric diseases

    There is growing evidence of shared risk alleles between complex traits (pleiotropy), including autoimmune and neuropsychiatric diseases. This might be due to sharing between all individuals (whole-group pleiotropy), or a subset of individuals within a genetically heterogeneous cohort (subgroup heterogeneity). BUHMBOX is a well-powered statistic distinguishing between these two situations using genotype data. We observed a shared genetic basis between 11 autoimmune diseases and type 1 diabetes (T1D, p0.2, 6,670 T1D cases and 7,279 RA cases). Genetic sharing between seronegative and seropositive RA (p<10^−9) had significant evidence of subgroup heterogeneity, suggesting a subgroup of seropositive-like cases within seronegative cases (pBUHMBOX=0.008, 2,406 seronegative RA cases). We also observed a shared genetic basis between major depressive disorder (MDD) and schizophrenia (p<10^−4) that was not explained by subgroup heterogeneity (pBUHMBOX=0.28 in 9,238 MDD cases)