64 research outputs found

    The interplay between evolution, regulation and tissue specificity in the Human Hereditary Diseasome

    Background: Human disease genes can be distinguished from essential (embryonically lethal) and non-disease genes using gene attributes. Such attributes include gene age, tissue specificity of expression, regulatory capacity, sequence length, rate of sequence variation and capacity for interaction. The resulting information has been used to inform data mining approaches seeking to identify novel disease genes. Given the dynamic nature of this field and the rapid rise in relevant information, we have chosen to take a single integrated mining approach to explore relationships among gene attributes and thereby characterise evolutionary trends associated with disease genes. Results: All-against-all cross comparison of 2,522 disease gene attributes revealed significant relationships between the age, disease association and expression pattern of genes and the tissues within which they are expressed. We found that the over-representation of disease genes among old genes holds for tissue-specific genes, but the correlation between age and disease association vanished when conditioning on tissue specificity. Of the 32 tissues studied, the genes expressed in pancreas are on average older than the genes expressed in any other tissue, while the testis expressed the lowest proportion of old genes. Following a focussed analysis on the impact of regulatory apparatus on evolution of disease genes, we show that regulators, comprising transcription factors and post-translationally modified proteins, are over-represented among ancient disease genes. In addition, we show that the proportion of regulator genes is affected by gene age among disease genes and by tissue specificity among non-disease genes. Finally, using data on 55,606 true positive gene interactions, we find that old disease genes interact with other old disease genes and that interacting new genes interact with genes originating from higher phylostrata. Conclusion: This study supports the non-random nature of the human diseasome. We have identified a variety of distinct features and correlations to other molecular attributes that can be used to distinguish the set of disease-causing genes. This was achieved by harnessing the power of mining large-scale datasets from OMIM and other databases. Ultimately such knowledge may contribute to the identification of novel human disease genes and an enhanced understanding of human biology.

    Needles in the EST Haystack: Large-Scale Identification and Analysis of Excretory-Secretory (ES) Proteins in Parasitic Nematodes Using Expressed Sequence Tags (ESTs)

    Excretory-secretory (ES) proteins are an important class of proteins in many organisms, spanning from bacteria to human beings, and are potential drug targets for several diseases. In this study, we first developed a software platform, EST2Secretome, composed of carefully selected computational tools to identify and analyse ES proteins from expressed sequence tags (ESTs). By employing EST2Secretome, we analysed 4,710 ES proteins derived from 0.5 million ESTs for 39 economically important and disease-causing parasites from the phylum Nematoda. Several known and novel ES proteins that were either parasite- or nematode-specific were discovered, focussing on those that are either absent from or very divergent from similar molecules in their animal or plant hosts. In addition, we found many nematode-specific protein families of domains “transthyretin-like” and “chromadorea ALT,” considered vaccine candidates for filariasis in humans. We report numerous C. elegans homologues with loss-of-function RNAi phenotypes essential for parasite survival and therefore potential targets for parasite intervention. Overall, by developing freely available software to analyse large-scale EST data, we enabled researchers working on parasites for neglected tropical diseases to select specific genes and/or proteins to carry out directed functional assays for demystifying the molecular complexities of host–parasite interactions in a cell.

    In silico analysis of expressed sequence tags from Trichostrongylus vitrinus (Nematoda): comparison of the automated ESTExplorer workflow platform with conventional database searches

    Background: The analysis of expressed sequence tags (ESTs) offers a rapid and cost-effective approach to elucidate the transcriptome of an organism, but requires several computational methods for assembly and annotation. Researchers frequently analyse each step manually, which is laborious and time consuming. We have recently developed ESTExplorer, a semi-automated computational workflow system, in order to achieve the rapid analysis of EST datasets. In this study, we evaluated EST data analysis for the parasitic nematode Trichostrongylus vitrinus (order Strongylida) using ESTExplorer, compared with database matching alone. Results: We functionally annotated 1776 ESTs obtained via suppressive-subtractive hybridisation from T. vitrinus, an important parasitic trichostrongylid of small ruminants. Cluster and comparative genomic analyses of the transcripts using ESTExplorer indicated that 290 (41%) sequences had homologues in Caenorhabditis elegans, 329 (42%) in parasitic nematodes, 202 (28%) in organisms other than nematodes, and 218 (31%) had no significant match to any sequence in the current databases. Of the C. elegans homologues, 90 were associated with 'non-wildtype' double-stranded RNA interference (RNAi) phenotypes, including embryonic lethality, maternal sterility, sterile progeny, larval arrest and slow growth. We could functionally classify 267 (38%) sequences using the Gene Ontologies (GO) and establish pathway associations for 230 (33%) sequences using the Kyoto Encyclopedia of Genes and Genomes (KEGG). Further examination of this EST dataset revealed a number of signalling molecules, proteases, protease inhibitors, enzymes, ion channels and immune-related genes. In addition, we identified 40 putative secreted proteins that could represent potential candidates for developing novel anthelmintics or vaccines. We further compared the automated EST sequence annotations, using ESTExplorer, with database search results for individual T. vitrinus ESTs. ESTExplorer reliably and rapidly annotated 301 ESTs, with pathway and GO information, eliminating 60 low quality hits from database searches. Conclusion: We evaluated the efficacy of ESTExplorer in analysing EST data, and demonstrate that computational tools can be used to accelerate the process of gene discovery in EST sequencing projects. The present study has elucidated sets of relatively conserved and potentially novel genes for biological investigation, and the annotated EST set provides further insight into the molecular biology of T. vitrinus, towards the identification of novel drug targets.

    A transcriptomic analysis of the adult stage of the bovine lungworm, Dictyocaulus viviparus

    Background: Lungworms of the genus Dictyocaulus (family Dictyocaulidae) are parasitic nematodes of major economic importance. They cause pathological effects and clinical disease in various ruminant hosts, particularly in young animals. Dictyocaulus viviparus, called the bovine lungworm, is a major pathogen of cattle, with severe infections being fatal. In this study, we provide first insights into the transcriptome of the adult stage of D. viviparus through the analysis of expressed sequence tags (ESTs). Results: Using our EST analysis pipeline, we estimate that the present dataset of 4436 ESTs is derived from 2258 genes based on cluster and comparative genomic analyses of the ESTs. Of the 2258 representative ESTs, 1159 (51.3%) had homologues in the free-living nematode C. elegans, 1174 (51.9%) in parasitic nematodes, 827 (36.6%) in organisms other than nematodes, and 863 (38%) had no significant match to any sequence in the current databases. Of the C. elegans homologues, 569 had observed 'non-wildtype' RNAi phenotypes, including embryonic lethality, maternal sterility, sterility in progeny, larval arrest and slow growth. We could functionally classify 776 (35%) sequences using the Gene Ontologies (GO) and establish pathway associations for 696 (31%) sequences in the Kyoto Encyclopedia of Genes and Genomes (KEGG). In addition, we predicted 85 secreted proteins which could represent potential candidates for developing novel anthelmintics or vaccines. Conclusion: The bioinformatic analysis of EST data for D. viviparus has elucidated sets of relatively conserved and potentially novel genes. The genes discovered in this study should assist research toward a better understanding of the basic molecular biology of D. viviparus, which could lead, in the longer term, to novel intervention strategies. The characterization of the D. viviparus transcriptome also provides a foundation for whole genome sequence analysis and future comparative transcriptomic analyses.

    High-throughput functional annotation and data mining with the Blast2GO suite

    Functional genomics technologies have been widely adopted in the biological research of both model and non-model species. An efficient functional annotation of DNA or protein sequences is a major requirement for the successful application of these approaches, as functional information on gene products is often the key to the interpretation of experimental results. Therefore, there is an increasing need for bioinformatics resources which are able to cope with large amounts of sequence data, produce valuable annotation results and are easily accessible to laboratories where functional genomics projects are being undertaken. We present the Blast2GO suite as an integrated and biologist-oriented solution for the high-throughput and automatic functional annotation of DNA or protein sequences based on the Gene Ontology vocabulary. The most outstanding Blast2GO features are: (i) the combination of various annotation strategies and tools controlling type and intensity of annotation, (ii) the numerous graphical features such as the interactive GO-graph visualization for gene-set function profiling or descriptive charts, (iii) the general sequence management features and (iv) high-throughput capabilities. We used the Blast2GO framework to carry out a detailed analysis of annotation behaviour through homology transfer and its impact in functional genomics research. Our aim is to offer biologists useful information to take into account when addressing the task of functionally characterizing their sequence data.

    Analysis of the F2LR3 (PAR4) single nucleotide polymorphism (rs773902) in an Indigenous Australian population

    The F2RL3 gene encoding protease activated receptor 4 (PAR4) contains a functional single nucleotide variant, rs773902. The resulting PAR4 variants, Thr120 and Ala120, are known to affect platelet reactivity to thrombin differently. Significant population differences in the frequency of the allele indicate it may be an important determinant in the ethnic differences that exist in thrombosis and hemostasis, and for patient outcomes to PAR antagonist anti-platelet therapies. Here we determined the frequency of rs773902 in an Indigenous Australian group comprising 467 individuals from the Tiwi Islands. These people experience high rates of renal disease that may be related to platelet and PAR4 function and are potential recipients of PAR-antagonist treatments. The rs773902 minor allele frequency (Thr120) in the Tiwi Islanders was 0.32, which is similar to European and Asian groups and substantially lower than Melanesians and some African groups. Logistic regression and allele distortion testing revealed no significant associations between the variant and several markers of renal function, as well as blood glucose and blood pressure. These findings suggest that rs773902 is not an important determinant for renal disease in this Indigenous Australian group. However, the relationships between rs773902 genotype and platelet and drug responsiveness in the Tiwi, and the allele frequency in other Indigenous Australian groups, should be evaluated.

    Eukaryotic Evolutionary Transitions Are Associated with Extreme Codon Bias in Functionally-Related Proteins

    Codon bias in the genome of an organism influences its phenome by changing the speed and efficiency of mRNA translation and hence protein abundance. We hypothesized that differences in codon bias, either between-species differences in orthologous genes, or within-species differences between genes, may play an evolutionary role. To explore this hypothesis, we compared the genome-wide codon bias in six species that occupy vital positions in the Eukaryotic Tree of Life. We acquired the entire protein coding sequences for these organisms, computed the codon bias for all genes in each organism and explored the output for relationships between codon bias and protein function, both within and between lineages. We discovered five notable coordinated patterns, with extreme codon bias most pronounced in traits considered highly characteristic of a given lineage. Firstly, the Homo sapiens genome had stronger codon bias for DNA-binding transcription factors than the Saccharomyces cerevisiae genome, whereas the opposite was true for ribosomal proteins – perhaps underscoring transcriptional regulation in the origin of complexity. Secondly, both mammalian species examined possessed extreme codon bias in genes relating to hair – a tissue unique to mammals. Thirdly, Arabidopsis thaliana showed extreme codon bias in genes implicated in cell wall formation and chloroplast function – which are unique to plants. Fourthly, Gallus gallus possessed strong codon bias in a subset of genes encoding mitochondrial proteins – perhaps reflecting the enhanced bioenergetic efficiency in birds that co-evolved with flight. And lastly, the G. gallus genome had extreme codon bias for the Ciliary Neurotrophic Factor – which may help to explain their spontaneous recovery from deafness. We propose that extreme codon bias in groups of genes that encode functionally related proteins has a pathway-level energetic explanation.

    Genome-wide patterns of promoter sharing and co-expression in bovine skeletal muscle

    Background: Gene regulation by transcription factors (TFs) is species, tissue and time specific. To better understand how the genetic code controls gene expression in bovine muscle, we associated gene expression data from developing Longissimus thoracis et lumborum skeletal muscle with bovine promoter sequence information. Results: We created a highly conserved genome-wide promoter landscape comprising 87,408 interactions relating 333 TFs with their 9,242 predicted target genes (TGs). We discovered that the complete set of predicted TGs share an average of 2.75 predicted TF binding sites (TFBSs) and that the average co-expression between a TF and its predicted TGs is higher than the average co-expression between the same TF and all genes. Conversely, pairs of TFs sharing predicted TGs showed a co-expression correlation higher than pairs of TFs not sharing TGs. Finally, we exploited the co-occurrence of predicted TFBSs in the context of muscle-derived functionally coherent modules including cell cycle, mitochondria, immune system, fat metabolism, muscle/glycolysis, and ribosome. Our findings enabled us to reverse engineer a regulatory network of core processes, and correctly identified the involvement of E2F1, GATA2 and NFKB1 in the regulation of cell cycle, fat, and muscle/glycolysis, respectively. Conclusion: The pivotal implication of our research is two-fold: (1) there exists a robust genome-wide expression signal between TFs and their predicted TGs in cattle muscle consistent with the extent of promoter sharing; and (2) this signal can be exploited to recover the cellular mechanisms underpinning transcription regulation of muscle structure and development in bovine. Our study represents the first genome-wide report linking tissue-specific co-expression to co-regulation in a non-model vertebrate.

    Targeting DNA Damage Response and Replication Stress in Pancreatic Cancer

    Background and aims: Continuing recalcitrance to therapy cements pancreatic cancer (PC) as the most lethal malignancy, which is set to become the second leading cause of cancer death in our society. The study aim was to investigate the association between DNA damage response (DDR), replication stress and novel therapeutic response in PC to develop a biomarker-driven therapeutic strategy targeting DDR and replication stress in PC. Methods: We interrogated the transcriptome, genome, proteome and functional characteristics of 61 novel PC patient-derived cell lines to define novel therapeutic strategies targeting DDR and replication stress. Validation was done in patient-derived xenografts and human PC organoids. Results: Patient-derived cell lines faithfully recapitulate the epithelial component of pancreatic tumors including previously described molecular subtypes. Biomarkers of DDR deficiency, including a novel signature of homologous recombination deficiency, co-segregate with response to platinum (P < 0.001) and PARP inhibitor therapy (P < 0.001) in vitro and in vivo. We generated a novel signature of replication stress which predicts response to ATR (P < 0.018) and WEE1 inhibitor (P < 0.029) treatment in both cell lines and human PC organoids. Replication stress was enriched in the squamous subtype of PC (P < 0.001) but not associated with DDR deficiency. Conclusions: Replication stress and DDR deficiency are independent of each other, creating opportunities for therapy in DDR-proficient PC, and post-platinum therapy.