151 research outputs found

    Law of Genome Evolution Direction : Coding Information Quantity Grows

    The problem of the directionality of genome evolution is studied. Based on the analysis of the C-value paradox and the evolution of genome size, we propose that the function-coding information quantity of a genome always grows in the course of evolution through sequence duplication, expansion of code, and gene transfer from outside. The function-coding information quantity of a genome consists of two parts: p-coding information quantity, which encodes functional proteins, and n-coding information quantity, which encodes functional elements other than amino acid sequences. Evidence for this evolutionary law of function-coding information quantity is listed. The need for function is the driving force behind the expansion of coding information quantity, and this expansion is the way a species achieves functional innovation and extension. The increase in the coding information quantity of a genome is therefore a measure of newly acquired function, and it determines the directionality of genome evolution. Comment: 16 pages

    A probabilistic framework to predict protein function from interaction data integrated with semantic knowledge

    Background: The functional characterization of newly discovered proteins has been a challenge in the post-genomic era. Protein-protein interactions provide insights for functional analysis because the function of unknown proteins can be postulated on the basis of their interaction evidence with known proteins. Protein-protein interaction data sets have been enriched by high-throughput experimental methods. However, functional analysis using interaction data is limited in accuracy by experimentally generated false positives and by interactions that lack functional linkage. Results: Protein-protein interaction data can be integrated with the functional knowledge in the Gene Ontology (GO) database. We apply similarity measures to assess the functional similarity between interacting proteins. We present a probabilistic framework for predicting the functions of unknown proteins based on this functional similarity. We use leave-one-out cross-validation to compare performance. The experimental results demonstrate that our algorithm performs better than competing methods in terms of prediction accuracy. In particular, it handles the high false positive rates of current interaction data well. Conclusion: Experimentally determined protein-protein interactions are too error-prone to reliably uncover the functional associations among proteins. The performance of function prediction for uncharacterized proteins can be enhanced by integrating the multiple available data sources.

    The radial arrangement of the human chromosome 7 in the lymphocyte cell nucleus is associated with chromosomal band gene density

    This is the author's accepted manuscript. The final published article is available from the link below. Copyright @ Springer-Verlag 2008. In the nuclei of human lymphocytes, chromosome territories are distributed according to the average gene density of each chromosome. However, chromosomes are very heterogeneous in size and base composition, and can contain both very gene-dense and very gene-poor regions. Thus, a precise analysis of chromosome organisation in the nuclei should consider also the distribution of DNA belonging to the chromosomal bands in each chromosome. To improve our understanding of the chromatin organisation, we localised chromosome 7 DNA regions, endowed with different gene densities, in the nuclei of human lymphocytes. Our results showed that this chromosome in cell nuclei is arranged radially with the gene-dense/GC-richest regions exposed towards the nuclear interior and the gene-poorest/GC-poorest ones located at the nuclear periphery. Moreover, we found that chromatin fibres from the 7p22.3 and the 7q22.1 bands are not confined to the territory of the bulk of this chromosome, protruding towards the inner part of the nucleus. Overall, our work demonstrates the radial arrangement of the territory of chromosome 7 in the lymphocyte nucleus and confirms that human genes occupy specific radial positions, presumably to enhance intra- and inter-chromosomal interaction among loci displaying a similar expression pattern, and/or similar replication timing.

    Curation of complex, context-dependent immunological data

    BACKGROUND: The Immune Epitope Database and Analysis Resource (IEDB) is dedicated to capturing, housing and analyzing complex immune epitope related data. DESCRIPTION: To identify and extract relevant data from the scientific literature in an efficient and accurate manner, novel processes were developed for manual and semi-automated annotation. CONCLUSION: Formalized curation strategies enable the processing of a large volume of context-dependent data, which are now available to the scientific community in an accessible and transparent format. The experiences described herein are applicable to other databases housing complex biological data and requiring a high level of curation expertise.

    Large publishing consortia produce higher citation impact research but co-author contributions are hard to evaluate

    This paper introduces a simple agglomerative clustering method to identify large publishing consortia with at least 20 authors and 80% shared authorship between articles. Based on Scopus journal articles 1996-2018, under these criteria, nearly all (88%) of the large consortia published research with citation impact above the world average, with the exceptions being mainly the newer consortia for which average citation counts are unreliable. On average, consortium research had almost double (1.95) the world average citation impact on the log scale used (Mean Normalised Log Citation Score). At least partial alphabetical author ordering was the norm in most consortia. The 250 largest consortia were for nuclear physics and astronomy around expensive equipment, and for predominantly health-related issues in genomics, medicine, public health, microbiology and neuropsychology. For the health-related issues, except for the first and last few authors, authorship seems to primarily indicate contributions to the shared project infrastructure necessary to gather the raw data. It is impossible for research evaluators to identify the contributions of individual authors in the huge alphabetical consortia of physics and astronomy, and problematic for the middle and end authors of health-related consortia. For small scale evaluations, authorship contribution statements could be used, when available.

    Chromosome-level assembly of the water buffalo genome surpasses human and goat genomes in sequence contiguity

    Rapid innovation in sequencing technologies and improvement in assembly algorithms have enabled the creation of highly contiguous mammalian genomes. Here we report a chromosome-level assembly of the water buffalo (Bubalus bubalis) genome using single-molecule sequencing and chromatin conformation capture data. PacBio Sequel reads, with a mean length of 11.5 kb, helped to resolve repetitive elements and generate sequence contiguity. All five B. bubalis sub-metacentric chromosomes were correctly scaffolded with centromeres spanned. Although the index animal was partly inbred, 58% of the genome was haplotype-phased by FALCON-Unzip. This new reference genome improves the contig N50 of the previous short-read-based buffalo assembly more than a thousand-fold and contains only 383 gaps. It surpasses the human and goat references in sequence contiguity and facilitates the annotation of hard-to-assemble gene clusters such as the major histocompatibility complex (MHC).

    Genetic data: The new challenge of personalized medicine, insights for rheumatoid arthritis patients

    Rapid advances in genotyping technology, analytical methods, and the establishment of large cohorts for population genetic studies have resulted in a large new body of information about the genetic basis of human rheumatoid arthritis (RA). Improved understanding of the root pathogenesis of the disease holds the promise of improved diagnostic and prognostic tools based upon this information. In this review, we summarize the nature of new genetic findings in human RA, including susceptibility loci and gene-gene and gene-environment interactions, as well as genetic loci associated with sub-groups of patients and those associated with response to therapy. Possible uses of these data are discussed, such as prediction of disease risk as well as personalized therapy and prediction of therapeutic response and risk of adverse events. While these applications are largely not refined to the point of clinical utility in RA, it seems likely that multi-parameter datasets including genetic, clinical, and biomarker data will be employed in the future care of RA patients.

    CpG island hypermethylation-associated silencing of non-coding RNAs transcribed from ultraconserved regions in human cancer

    Although only 1.5% of the human genome appears to code for proteins, much effort in cancer research has been devoted to this minimal fraction of our DNA. However, the last few years have witnessed the realization that a large class of non-coding RNAs (ncRNAs), named microRNAs, contribute to cancer development and progression by acting as oncogenes or tumor suppressor genes. Recent studies have also shown that epigenetic silencing of microRNAs with tumor suppressor features by CpG island hypermethylation is a common hallmark of human tumors. Thus, we wondered whether there were other ncRNAs undergoing aberrant DNA methylation-associated silencing in transformed cells. We focused on the transcribed-ultraconserved regions (T-UCRs), a subset of DNA sequences that are absolutely conserved between orthologous regions of the human, rat and mouse genomes and that are located in both intra- and intergenic regions. We used a pharmacological and genomic approach to reveal the possible existence of an aberrant epigenetic silencing pattern of T-UCRs by treating cancer cells with a DNA-demethylating agent followed by hybridization to an expression microarray containing these sequences. We observed that DNA hypomethylation induces release of T-UCR silencing in cancer cells. Among the T-UCRs that were reactivated upon drug treatment, Uc.160+, Uc283+A and Uc.346+ were found to undergo specific CpG island hypermethylation-associated silencing in cancer cells compared with normal tissues. The analysis of a large set of primary human tumors (n=283) demonstrated that hypermethylation of the described T-UCR CpG islands was a common event among the various tumor types. Our finding that, in addition to microRNAs, another class of ncRNAs (T-UCRs) undergoes DNA methylation-associated inactivation in transformed cells supports a model in which epigenetic and genetic alterations in coding and non-coding sequences cooperate in human tumorigenesis.