38 research outputs found
BioCreative III interactive task: an overview
The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested.
GWAS Meta-Analysis of Suicide Attempt: Identification of 12 Genome-Wide Significant Loci and Implication of Genetic Risks for Specific Health Factors
OBJECTIVE: Suicidal behavior is heritable and is a major cause of death worldwide. Two large-scale genome-wide association studies (GWASs) recently discovered and cross-validated genome-wide significant (GWS) loci for suicide attempt (SA). The present study leveraged the genetic cohorts from both studies to conduct the largest GWAS meta-analysis of SA to date. Multi-ancestry and admixture-specific meta-analyses were conducted within groups of significant African, East Asian, and European ancestry admixtures.
METHODS: This study comprised 22 cohorts, including 43,871 SA cases and 915,025 ancestry-matched controls. Analytical methods across multi-ancestry and individual ancestry admixtures included inverse variance-weighted fixed-effects meta-analyses, followed by gene, gene-set, tissue-set, and drug-target enrichment, as well as summary-data-based Mendelian randomization with brain expression quantitative trait loci data, phenome-wide genetic correlation, and genetic causal proportion analyses.
RESULTS: Multi-ancestry and European ancestry admixture GWAS meta-analyses identified 12 risk loci at p values <5×10⁻⁸.
CONCLUSIONS: This multi-ancestry analysis of suicide attempt identified several loci contributing to risk and establishes significant shared genetic covariation with clinical phenotypes. These findings provide insight into genetic factors associated with suicide attempt across ancestry admixture populations, in veteran and civilian populations, and in attempt versus death.
GWAS Meta-Analysis of Suicide Attempt: Identification of 12 Genome-Wide Significant Loci and Implication of Genetic Risks for Specific Health Factors
Objective: Suicidal behavior is heritable and is a major cause of death worldwide. Two large-scale genome-wide association studies (GWASs) recently discovered and cross-validated genome-wide significant (GWS) loci for suicide attempt (SA). The present study leveraged the genetic cohorts from both studies to conduct the largest GWAS meta-analysis of SA to date. Multi-ancestry and admixture-specific meta-analyses were conducted within groups of significant African, East Asian, and European ancestry admixtures.
Methods: This study comprised 22 cohorts, including 43,871 SA cases and 915,025 ancestry-matched controls. Analytical methods across multi-ancestry and individual ancestry admixtures included inverse variance-weighted fixed-effects meta-analyses, followed by gene, gene-set, tissue-set, and drug-target enrichment, as well as summary-data-based Mendelian randomization with brain expression quantitative trait loci data, phenome-wide genetic correlation, and genetic causal proportion analyses.
Results: Multi-ancestry and European ancestry admixture GWAS meta-analyses identified 12 risk loci at p values <5×10⁻⁸. These loci were mostly intergenic and implicated DRD2, SLC6A9, FURIN, NLGN1, SOX5, PDE4B, and CACNG2. The multi-ancestry SNP-based heritability estimate of SA was 5.7% on the liability scale (SE=0.003, p=5.7×10⁻⁸⁰). Significant brain tissue gene expression and drug set enrichment were observed. There was shared genetic variation of SA with attention deficit hyperactivity disorder, smoking, and risk tolerance after conditioning SA on both major depressive disorder and posttraumatic stress disorder. Genetic causal proportion analyses implicated shared genetic risk for specific health factors.
Conclusions: This multi-ancestry analysis of suicide attempt identified several loci contributing to risk and establishes significant shared genetic covariation with clinical phenotypes. These findings provide insight into genetic factors associated with suicide attempt across ancestry admixture populations, in veteran and civilian populations, and in attempt versus death.
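The inverse variance-weighted fixed-effects meta-analysis named in the Methods can be illustrated with a minimal sketch. It is not the study's pipeline: the function name, the two-cohort input, and the effect sizes are hypothetical values chosen for illustration, and only the 5×10⁻⁸ significance threshold comes from the abstract.

```python
import math

# Genome-wide significance threshold quoted in the abstract.
GWS_THRESHOLD = 5e-8


def ivw_fixed_effects(betas, ses):
    """Pool per-cohort effect estimates using inverse variance weights.

    betas: per-cohort effect estimates (e.g. log odds ratios)
    ses:   matching standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal in length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical example: one variant measured in two equally precise cohorts.
beta, se, z, p = ivw_fixed_effects([0.10, 0.10], [0.02, 0.02])
print(f"beta={beta:.3f} se={se:.4f} z={z:.2f} p={p:.2e}")
print("genome-wide significant:", p < GWS_THRESHOLD)
```

Because the weights are inverse variances, larger and more precise cohorts dominate the pooled estimate, and the pooled standard error shrinks as cohorts are added.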
Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology.
Dissecting the Shared Genetic Architecture of Suicide Attempt, Psychiatric Disorders, and Known Risk Factors
Background: Suicide is a leading cause of death worldwide, and nonfatal suicide attempts, which occur far more frequently, are a major source of disability and social and economic burden. Both have substantial genetic etiology, which is partially shared and partially distinct from that of related psychiatric disorders. Methods: We conducted a genome-wide association study (GWAS) of 29,782 suicide attempt (SA) cases and 519,961 controls in the International Suicide Genetics Consortium (ISGC). The GWAS of SA was conditioned on psychiatric disorders using GWAS summary statistics via multitrait-based conditional and joint analysis, to remove genetic effects on SA mediated by psychiatric disorders. We investigated the shared and divergent genetic architectures of SA, psychiatric disorders, and other known risk factors. Results: Two loci reached genome-wide significance for SA: the major histocompatibility complex and an intergenic locus on chromosome 7, the latter of which remained associated with SA after conditioning on psychiatric disorders and replicated in an independent cohort from the Million Veteran Program. This locus has been implicated in risk-taking behavior, smoking, and insomnia. SA showed strong genetic correlation with psychiatric disorders, particularly major depression, and also with smoking, pain, risk-taking behavior, sleep disturbances, lower educational attainment, reproductive traits, lower socioeconomic status, and poorer general health. After conditioning on psychiatric disorders, the genetic correlations between SA and psychiatric disorders decreased, whereas those with nonpsychiatric traits remained largely unchanged. Conclusions: Our results identify a risk locus that contributes more strongly to SA than other phenotypes and suggest a shared underlying biology between SA and known risk factors that is not mediated by psychiatric disorders.
ATAD3 gene cluster deletions cause cerebellar dysfunction associated with altered mitochondrial DNA and cholesterol metabolism
Although mitochondrial disorders are clinically heterogeneous, they frequently involve the central nervous system and are among the most common neurogenetic disorders. Identifying the causal genes has benefited enormously from advances in high-throughput sequencing technologies; however, once the defect is known, researchers face the challenge of deciphering the underlying disease mechanism. Here we characterize large biallelic deletions in the region encoding the ATAD3C, ATAD3B and ATAD3A genes. Although high homology complicates genomic analysis of the ATAD3 defects, they can be identified by targeted analysis of standard single nucleotide polymorphism array and whole exome sequencing data. We report deletions that generate chimeric ATAD3B/ATAD3A fusion genes in individuals from four unrelated families with fatal congenital pontocerebellar hypoplasia, whereas a case with genomic rearrangements affecting the ATAD3C/ATAD3B genes on one allele and ATAD3B/ATAD3A genes on the other displays later-onset encephalopathy with cerebellar atrophy, ataxia and dystonia. Fibroblasts from affected individuals display mitochondrial DNA abnormalities, associated with multiple indicators of altered cholesterol metabolism. Moreover, drug-induced perturbations of cholesterol homeostasis cause mitochondrial DNA disorganization in control cells, while mitochondrial DNA aggregation in the genetic cholesterol trafficking disorder Niemann-Pick type C disease further corroborates the interdependence of mitochondrial DNA organization and cholesterol. These data demonstrate the integration of mitochondria in cellular cholesterol homeostasis, in which ATAD3 plays a critical role. The dual problem of perturbed cholesterol metabolism and mitochondrial dysfunction could be widespread in neurological and neurodegenerative diseases.
Airway sizes and proportions in children quantified by a video-bronchoscopic technique
Background: A quantitative understanding of airway sizes and proportions and a reference point for comparisons are important to a bronchoscopist. The aims of this study were to measure large airway areas, and define proportions and predictors of airway size in children. Methods: A validated videobronchoscope technique was used to measure in-vivo airway cross-sectional areas (cricoid, right (RMS) and left (LMS) main stem and major lobar bronchi) of 125 children. Airway proportions were calculated as ratios of airway areas to cricoid areas and to endotracheal tube (ETT) areas. Mann-Whitney U tests, t-tests, and one-way ANOVA were used for comparisons, and standard univariate and backwards, stepwise multivariate regression analyses were used to define airway size predictors. Results: Airway size increased progressively with increasing age but proportions remained constant. The LMS was 21% smaller than the RMS. Gender differences in airway size were not significant in any age group or airway site. Cricoid area related best to body length (BL): cricoid area (mm²) = 26.782 + 0.254 × BL (cm), while the RMS and LMS areas related best to weight (Wt): RMS area (mm²) = 23.938 + 0.394 × Wt (kg) and LMS area (mm²) = 20.055 + 0.263 × Wt (kg), respectively. Airway-to-cricoid ratios were larger than airway-to-ETT ratios (p=0.0001). Conclusions: The cricoid and large airways progressively increase in size but maintain constant proportional relationships to the cricoid across childhood. The cricoid area correlates with body length while the RMS and LMS are best predicted by weight. These data provide a basis for quantitative comparisons of airway lesions.
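The three regression equations reported in the Results can be evaluated directly, as in the sketch below. The coefficients are taken from the abstract. The function names and the example child (100 cm, 16 kg) are hypothetical, and the sketch assumes the inputs fall within the range of children studied, which the abstract does not state.

```python
def cricoid_area_mm2(body_length_cm):
    """Predicted cricoid area (mm^2) from body length (cm)."""
    return 26.782 + 0.254 * body_length_cm


def rms_area_mm2(weight_kg):
    """Predicted right main stem bronchus area (mm^2) from weight (kg)."""
    return 23.938 + 0.394 * weight_kg


def lms_area_mm2(weight_kg):
    """Predicted left main stem bronchus area (mm^2) from weight (kg)."""
    return 20.055 + 0.263 * weight_kg


# Hypothetical child used only to illustrate the arithmetic.
length_cm, weight_kg = 100.0, 16.0
cricoid = cricoid_area_mm2(length_cm)
rms = rms_area_mm2(weight_kg)
lms = lms_area_mm2(weight_kg)

print(f"cricoid: {cricoid:.3f} mm^2")
print(f"RMS:     {rms:.3f} mm^2")
print(f"LMS:     {lms:.3f} mm^2")
# Airway-to-cricoid ratios, the proportions the abstract reports as constant.
print(f"RMS/cricoid: {rms / cricoid:.2f}, LMS/cricoid: {lms / cricoid:.2f}")
```

For this example the arithmetic gives 26.782 + 25.4 = 52.182 mm² for the cricoid, 23.938 + 6.304 = 30.242 mm² for the RMS, and 20.055 + 4.208 = 24.263 mm² for the LMS.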
BioCreative III interactive task: an overview
Background:
The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested.
Results:
A User Advisory Group (UAG) actively participated in the IAT design and assessment. The task focused on gene normalization (identifying gene mentions in the article and linking these genes to standard database identifiers), gene ranking based on the overall importance of each gene mentioned in the article, and gene-oriented document retrieval (identifying full text papers relevant to a selected gene). Six systems participated and all processed and displayed the same set of articles. The articles were selected based on content known to be problematic for curation, such as ambiguity of gene names, coverage of multiple genes and species, or introduction of a new gene name. Members of the UAG curated three articles for training and assessment purposes, and each member was assigned a system to review. A questionnaire related to the interface usability and task performance (as measured by precision and recall) was answered after systems were used to curate articles. Although the limited number of articles analyzed and users involved in the IAT experiment precluded rigorous quantitative analysis of the results, a qualitative analysis provided valuable insight into some of the problems encountered by users when using the systems. The overall assessment indicates that the system usability features appealed to most users, but the system performance was suboptimal (mainly due to low accuracy in gene normalization). Some of the issues included failure of species identification and gene name ambiguity in the gene normalization task, leading to an extensive list of gene identifiers to review, which, in some cases, did not contain the relevant genes. The document retrieval suffered from the same shortfalls. The UAG favored achieving high performance (measured by precision and recall), but strongly recommended the addition of features that facilitate the identification of the correct gene and its identifier, such as contextual information to assist in disambiguation.
Discussion: The IAT was an informative exercise that advanced the dialog between curators and developers and increased the appreciation of challenges faced by each group. A major conclusion was that the intended users should be actively involved in every phase of software development, and this will be strongly encouraged in future tasks. The IAT provides the first steps toward the definition of metrics and functional requirements that are necessary for designing a formal evaluation of interactive curation systems in the BioCreative IV challenge.
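Task performance in the IAT was measured by precision and recall. A minimal sketch of how these two measures could be computed for gene normalization on a single article follows. The function name and the gene identifiers are made up for illustration, and treating predictions as an unordered set of identifiers is an assumption, since the abstract does not describe the scoring procedure.

```python
def precision_recall(predicted_ids, gold_ids):
    """Compare system-proposed gene identifiers with curator-approved ones.

    precision = correct predictions / all predictions
    recall    = correct predictions / all gold-standard identifiers
    An empty denominator yields 0.0 rather than raising an error.
    """
    predicted, gold = set(predicted_ids), set(gold_ids)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical identifiers: the system proposes four genes for an article,
# two of which match the three genes a curator annotated.
system_output = ["GENE:1", "GENE:2", "GENE:3", "GENE:4"]
curated = ["GENE:2", "GENE:4", "GENE:5"]

precision, recall = precision_recall(system_output, curated)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A long candidate list that misses the relevant gene, the situation the users reported, lowers both numbers: the extra identifiers reduce precision and the missing gene reduces recall.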