
    TarO : a target optimisation system for structural biology

    This work was funded by the UK Biotechnology and Biological Sciences Research Council (BBSRC) Structural Proteomics of Rational Targets (SPoRT) initiative (Grant BBS/B/14434). Funding to pay the Open Access publication charges for this article was provided by BBSRC. TarO (http://www.compbio.dundee.ac.uk/taro) offers a single point of reference for key bioinformatics analyses relevant to selecting proteins or domains for study by structural biology techniques. The protein sequence is analysed by 17 algorithms and compared to 8 databases. TarO gathers putative homologues, including orthologues, and then obtains predictions of properties for these sequences including crystallisation propensity, protein disorder and post-translational modifications. Analyses are run on a high-performance computing cluster, the results integrated, stored in a database and accessed through a web-based user interface. Output is in tabulated format and in the form of an annotated multiple sequence alignment (MSA) that may be edited interactively in the program Jalview. TarO also simplifies the gathering of additional annotations via the Distributed Annotation System, both from the MSA in Jalview and through links to Dasty2. Routes to other information gateways are included, for example to relevant pages from UniProt, COG and the Conserved Domains Database. Open access to TarO is available from a guest account with private accounts for academic use available on request. Future development of TarO will include further analysis steps and integration with the Protein Information Management System (PIMS), a sister project in the BBSRC Structural Proteomics of Rational Targets initiative.

    Differences in transcription between free-living and CO_2-activated third-stage larvae of Haemonchus contortus

    Background: The disease caused by Haemonchus contortus, a blood-feeding nematode of small ruminants, is of major economic importance worldwide. The infective third-stage larva (L3) of this gastric nematode is enclosed in a cuticle (sheath) and, once ingested with herbage by the host, undergoes an exsheathment process that marks the transition from the free-living (L3) to the parasitic (xL3) stage. This study explored changes in gene transcription associated with this transition and predicted, based on comparative analysis, functional roles for key transcripts in the metabolic pathways linked to larval development. Results: Totals of 101,305 (L3) and 105,553 (xL3) expressed sequence tags (ESTs) were determined using 454 sequencing technology, and then assembled and annotated; the most abundant transcripts encoded transthyretin-like, calcium-binding EF-hand, NAD(P)-binding and nucleotide-binding proteins as well as homologues of Ancylostoma-secreted proteins (ASPs). Using an in silico-subtractive analysis, 560 and 685 sequences were shown to be uniquely represented in the L3 and xL3 stages, respectively; the transcripts encoded ribosomal proteins, collagens and elongation factors (in L3), and mainly peptidases and other enzymes of amino acid catabolism (in xL3). Caenorhabditis elegans orthologues of transcripts that were uniquely transcribed in each L3 and xL3 were predicted to interact with a total of 535 other genes, all of which were involved in embryonic development. Conclusion: The present study indicated that some key transcriptional alterations taking place during the transition from the L3 to the xL3 stage of H. contortus involve genes predicted to be linked to the development of neuronal tissue (L3 and xL3), formation of the cuticle (L3) and digestion of host haemoglobin (xL3). 
Future efforts using next-generation sequencing and bioinformatic technologies should provide the efficiency and depth of coverage required for the determination of the complete transcriptomes of different developmental stages and/or tissues of H. contortus as well as the genome of this important parasitic nematode. Such advances should lead to a significantly improved understanding of the molecular biology of H. contortus and, from an applied perspective, to novel methods of intervention.

    In vitro identification and in silico utilization of interspecies sequence similarities using GeneChip® technology

    BACKGROUND: Genomic approaches in large animal models (canine, ovine, etc.) are challenging due to insufficient genomic information for these species and the lack of corresponding microarray platforms. To address this problem, we speculated that conserved interspecies genetic sequences can be experimentally detected by cross-species hybridization. The probe redundancy of the Affymetrix platform offers flexibility in selecting individual probes with high sequence similarity between related species for gene expression analysis. RESULTS: Gene expression profiles of 40 canine samples were generated using the human HG-U133A GeneChip (U133A). Due to interspecies genetic differences, only 14 ± 2% of canine transcripts were detected by U133A probe sets, whereas profiling of 40 human samples detected 49 ± 6% of human transcripts. However, when these probe sets were deconstructed into individual probes and the performance of each probe was examined, we found that 47% of human probes were able to find their targets in canine tissues and generate a detectable hybridization signal. Therefore, we restricted gene expression analysis to these probes and observed a 60% increase in the number of identified canine transcripts. These results were validated by comparing the transcripts identified by our restricted analysis of cross-species hybridization with the transcripts identified by hybridization of total canine lung mRNA to the new Affymetrix Canine GeneChip®. CONCLUSION: The experimental identification of probes with a detectable hybridization signal, and the restriction of gene expression analysis to those probes, drastically increases transcript detection in canine-human hybridization, suggesting the possibility of broad utilization of cross-hybridization between related species using GeneChip technology.
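
    The probe-level restriction described in this abstract can be sketched roughly as follows. This is an illustrative sketch only: the abstract does not give the detection criterion, so the background cutoff, the `min_fraction` parameter and the probe IDs below are all assumptions rather than the authors' procedure.

```python
def detectable_probes(intensities, background, min_fraction=0.5):
    """Return IDs of probes whose signal exceeds background in enough samples.

    intensities  : dict mapping probe ID -> list of per-sample intensities
    background   : intensity cutoff for calling a probe "detected" (assumed)
    min_fraction : fraction of samples that must pass the cutoff (assumed)
    """
    kept = []
    for probe_id, values in intensities.items():
        detected = sum(v > background for v in values)
        if values and detected / len(values) >= min_fraction:
            kept.append(probe_id)
    return kept


# Hypothetical intensities for three probes of one human probe set,
# measured on three canine samples.
intensities = {
    "p1": [120, 150, 90],
    "p2": [20, 30, 25],
    "p3": [200, 40, 180],
}
kept = detectable_probes(intensities, background=100)
```

    Downstream expression estimates would then be computed from the retained probes only, instead of from the full probe set.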

    Wide-Scale Analysis of Human Functional Transcription Factor Binding Reveals a Strong Bias towards the Transcription Start Site

    We introduce a novel method to screen the promoters of a set of genes with shared biological function against a precompiled library of motifs, and to find those motifs that are statistically over-represented in the gene set. The gene sets were obtained from the functional Gene Ontology (GO) classification; for each set and motif we optimized the sequence similarity score threshold independently for every location window (measured with respect to the TSS), taking into account the location-dependent nucleotide heterogeneity along the promoters of the target genes. We performed a high-throughput analysis, searching the promoters (from 200 bp downstream to 1000 bp upstream of the TSS) of more than 8000 human and 23,000 mouse genes, for 134 functional Gene Ontology classes and for 412 known DNA motifs. When combined with binding site and location conservation between human and mouse, the method identifies with high probability functional binding sites that regulate groups of biologically related genes. We found many location-sensitive functional binding events and showed that they clustered close to the TSS. Our method and findings were put to several experimental tests. By allowing a "flexible" threshold and combining our functional class and location-specific search method with conservation between human and mouse, we are able to reliably identify functional TF binding sites. This is an essential step towards constructing regulatory networks and elucidating the design principles that govern transcriptional regulation of expression. The promoter region proximal to the TSS appears to be of central importance for regulation of transcription in human and mouse, just as it is in bacteria and yeast.
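
    One common way to quantify whether a motif is over-represented in a gene set is a hypergeometric tail probability. The abstract does not name the statistic the authors used, so the sketch below is a stand-in under that assumption, and the function name and parameters are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Probability of seeing at least k motif-bearing genes in a gene set.

    N : total number of genes screened
    K : genes (out of N) whose promoter contains the motif
    n : size of the gene set (for example, one GO class)
    k : genes in the set whose promoter contains the motif
    """
    upper = min(K, n)
    total = comb(N, n)
    # Sum the hypergeometric probability mass from k up to the maximum possible.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy numbers: 10 genes in total, 3 carry the motif, the gene set has 4 genes
# and all 3 motif-bearing genes fall inside it.
p_value = hypergeom_sf(3, N=10, K=3, n=4)
```

    A small tail probability suggests enrichment; in a screen over many gene sets, motifs and location windows, as described in the abstract, such values would also need correcting for multiple testing.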

    Polysaccharide utilization loci and nutritional specialization in a dominant group of butyrate-producing human colonic Firmicutes

    Acknowledgements: The Rowett Institute of Nutrition and Health (University of Aberdeen) receives financial support from the Scottish Government Rural and Environmental Sciences and Analytical Services (RESAS). POS is a PhD student supported by the Scottish Government (RESAS) and the Science Foundation Ireland, through a centre award to the APC Microbiome Institute, Cork, Ireland.
    Data Summary: The high-quality draft genomes generated in this work were deposited at the European Nucleotide Archive under the following accession numbers:
    1. Eubacterium rectale T1-815; CVRQ01000001–CVRQ01000090: http://www.ebi.ac.uk/ena/data/view/PRJEB9320
    2. Roseburia faecis M72/1; CVRR01000001–CVRR01000101: http://www.ebi.ac.uk/ena/data/view/PRJEB9321
    3. Roseburia inulinivorans L1-83; CVRS01000001–CVRS01000151: http://www.ebi.ac.uk/ena/data/view/PRJEB9322

    Bio::Homology::InterologWalk - A Perl module to build putative protein-protein interaction networks through interolog mapping

    Background: Protein-protein interaction (PPI) data are widely used to generate network models that aim to describe the relationships between proteins in biological systems. The fidelity and completeness of such networks are primarily limited by the paucity of protein interaction information and by the restriction of most of these data to just a few widely studied experimental organisms. In order to extend the utility of existing PPIs, computational methods can be used that exploit functional conservation between orthologous proteins across taxa to predict putative PPIs or 'interologs'. To date most interolog prediction efforts have been restricted to specific biological domains with fixed underlying data sources and there are no software tools available that provide a generalised framework for 'on-the-fly' interolog prediction. Results: We introduce Bio::Homology::InterologWalk, a Perl module to retrieve, prioritise and visualise putative protein-protein interactions through an orthology-walk method. The module uses orthology and experimental interaction data to generate putative PPIs and optionally collates meta-data into an Interaction Prioritisation Index that can be used to help prioritise interologs for further analysis. We show the application of our interolog prediction method to the genomic interactome of the fruit fly, Drosophila melanogaster. We analyse the resulting interaction networks and show that the method proposes new interactome members and interactions that are candidates for future experimental investigation. Conclusions: Our interolog prediction tool employs the Ensembl Perl API and PSICQUIC enabled protein interaction data sources to generate up to date interologs 'on-the-fly'. This represents a significant advance on previous methods for interolog prediction as it allows the use of the latest orthology and protein interaction data for all of the genomes in Ensembl. The module outputs simple text files, making it easy to customise the results by post-processing, allowing the putative PPI datasets to be easily integrated into existing analysis workflows. The Bio::Homology::InterologWalk module, sample scripts and full documentation are freely available from the Comprehensive Perl Archive Network (CPAN) under the GNU Public license.
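
    The orthology-walk idea summarised in this abstract can be illustrated with a short Python sketch. It is not the module's API (the module itself is written in Perl and queries Ensembl and PSICQUIC services); the in-memory dictionaries, the function name and the gene IDs below are hypothetical stand-ins for those data sources.

```python
def interolog_walk(genes, orthologs_fwd, interactions, orthologs_back):
    """Project interactions observed in a reference species onto query genes.

    genes          : iterable of query-species gene IDs
    orthologs_fwd  : dict, query gene -> set of orthologues in the reference species
    interactions   : dict, reference gene -> set of experimentally observed partners
    orthologs_back : dict, reference gene -> set of orthologues in the query species
    Returns a set of (query gene, putative partner, orthologue, observed partner).
    """
    putative = set()
    for gene in genes:
        # Step 1: walk forward to orthologues in the reference species.
        for ortho in orthologs_fwd.get(gene, ()):
            # Step 2: collect known interactors of each orthologue.
            for partner in interactions.get(ortho, ()):
                # Step 3: walk back to the query species.
                for back in orthologs_back.get(partner, ()):
                    putative.add((gene, back, ortho, partner))
    return putative


# Hypothetical toy data: one fly gene, one human orthologue, one known interaction.
orthologs_fwd = {"fly_A": {"hs_A"}}
interactions = {"hs_A": {"hs_B"}}
orthologs_back = {"hs_B": {"fly_B"}}
putative = interolog_walk(["fly_A"], orthologs_fwd, interactions, orthologs_back)
```

    Keeping the intermediate orthologue and observed partner in each result tuple preserves the evidence trail, which is the kind of metadata a prioritisation score could draw on.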

    A hidden reservoir of integrative elements is the major source of recently acquired foreign genes and ORFans in archaeal and bacterial genomes

    A large-scale survey of potential recently acquired integrative elements in 119 archaeal and bacterial genomes reveals that many recently acquired genes have originated from integrative elements.

    Building and analyzing protein interactome networks by cross-species comparisons

    Background: A genomic catalogue of protein-protein interactions is a rich source of information, particularly for exploring the relationships between proteins. Numerous systems-wide and small-scale experiments have been conducted to identify interactions; however, our knowledge of all interactions for any one species is incomplete, and alternative means to expand these network maps are needed. We therefore took a comparative biology approach to predict protein-protein interactions across five species (human, mouse, fly, worm, and yeast) and developed InterologFinder for research biologists to easily navigate these data. We also developed a confidence score for interactions based on available experimental evidence and conservation across species. Results: The connectivity of the resultant networks was determined to have scale-free distribution, small-world properties, and increased local modularity, indicating that the added interactions do not disrupt our current understanding of protein network structures. We show examples of how these improved interactomes can be used to analyze a genome-scale dataset (RNAi screen) and to assign new function to proteins. Predicted interactions within this dataset were tested by co-immunoprecipitation, resulting in a high rate of validation, suggesting the high quality of the networks produced. Conclusions: Protein-protein interactions were predicted in five species, based on orthology. An InteroScore, a score accounting for homology, number of orthologues with evidence of interactions, and number of unique observations of interactions, is given to each known and predicted interaction. Our website (http://www.interologfinder.org) provides research biologists intuitive access to these data.