    Coordinated RNA-Seq and peptidomics identify neuropeptides and G-protein coupled receptors (GPCRs) in the large pine weevil Hylobius abietis, a major forestry pest

    Hylobius abietis (Linnaeus), or large pine weevil (Coleoptera, Curculionidae), is a pest of European coniferous forests. To gain understanding of the functional physiology of this species, we have assembled a de novo transcriptome of H. abietis from sequence data obtained by Next Generation Sequencing. In particular, we have identified genes encoding neuropeptides, peptide hormones and their putative G-protein coupled receptors (GPCRs) to gain insights into neuropeptide-modulated processes. The transcriptome was assembled de novo from pooled paired-end sequence reads obtained from RNA from whole adults, gut and central nervous system tissue samples. Data analysis was performed on the transcripts obtained from the assembly, including annotation, gene ontology and functional assignment as well as transcriptome completeness assessment and KEGG pathway analysis. Pipelines were created using bioinformatics tools and techniques for prediction and identification of neuropeptides and neuropeptide receptors. Peptidomic analysis was also carried out using a combination of MALDI-TOF and Q-Exactive Orbitrap mass spectrometry to confirm the identified neuropeptides. In total, 41 putative neuropeptide families were identified in H. abietis, including Adipokinetic hormone (AKH), CAPA and DH31. Neuropeptide F, which has not yet been identified in the model beetle T. castaneum, was identified. Additionally, 24 putative neuropeptide and 9 leucine-rich repeat containing G protein coupled receptor-encoding transcripts were determined using both alignment and non-alignment methods. This information, submitted to the NCBI sequence read archive repository (SRA accession: SRP133355), can now be used to inform understanding of neuropeptide-modulated physiology and behaviour in H. abietis, and to develop specific neuropeptide-based tools for H. abietis control.

    SpBase: the sea urchin genome database and web site

    SpBase is a system of databases focused on the genomic information from sea urchins and related echinoderms. It is exposed to the public through a web site served with open source software (http://spbase.org/). The enterprise was undertaken to provide an easily used collection of information to directly support experimental work on these useful research models in cell and developmental biology. The information served from the databases emerges from the draft genomic sequence of the purple sea urchin, Strongylocentrotus purpuratus, and includes sequence data and genomic resource descriptions for other members of the echinoderm clade, which in total span 540 million years of evolutionary time. This version of the system contains two assemblies of the purple sea urchin genome, associated expressed sequences, gene annotations and accessory resources. Search mechanisms for the sequences and the gene annotations are provided. Because the system is maintained along with the Sea Urchin Genome resource, a database of sequenced clones is also provided.

    Gene3D: comprehensive structural and functional annotation of genomes

    Gene3D provides comprehensive structural and functional annotation of most available protein sequences, including the UniProt, RefSeq and Integr8 resources. The main structural annotation is generated through scanning these sequences against the CATH structural domain database profile-HMM library. CATH is a database of manually derived PDB-based structural domains, placed within a hierarchy reflecting topology, homology and conservation and is able to infer more ancient and divergent homology relationships than sequence-based approaches. This data is supplemented with Pfam-A, other non-domain structural predictions (i.e. coiled coils) and experimental data from UniProt. In order to enhance the investigations possible with this data, we have also incorporated a variety of protein annotation resources, including protein–protein interaction data, GO functional assignments, KEGG pathways, FUNCAT functional descriptions and links to microarray expression data. All of this data can be accessed through a newly re-designed website that has a focus on flexibility and clarity, with searches that can be restricted to a single genome or across the entire sequence database. Currently Gene3D contains over 3.5 million domain assignments for nearly 5 million proteins including 527 completed genomes. This is available at: http://gene3d.biochem.ucl.ac.uk

    Finding the Core-Genes of Chloroplasts

    Due to recent advances in sequencing techniques, the number of available genomes is rising steadily, making large-scale genomic comparisons between sets of closely related species possible. An interesting question is which functional genes are common to a collection of species or, conversely, what is specific to a given species when compared with others belonging to the same genus, family, etc. Investigating this problem means finding both the core and pan genomes of a collection of species, i.e., the genes common to all the species vs. the set of all genes in all species under consideration. However, obtaining trustworthy core and pan genomes is not an easy task: it involves a large amount of computation and requires a rigorous methodology. Surprisingly, as far as we know, the methodology for finding core and pan genomes has not been deeply investigated. This research work tries to fill this gap by focusing only on chloroplast genomes, whose reasonable sizes allow a deep study. To this end, a collection of 99 chloroplasts is considered in this article. Two methodologies have been investigated, based respectively on sequence similarities and on gene names taken from annotation tools. The obtained results will finally be evaluated in terms of biological relevance.
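
The core/pan definition in this abstract can be illustrated with a minimal sketch. It assumes each genome has already been reduced to a set of gene names (the annotation-based approach mentioned above). The function name `core_and_pan` and the toy species and gene lists are hypothetical and are not taken from the article.

```python
def core_and_pan(gene_sets):
    """Return (core, pan) for a mapping of species name -> set of gene names.

    core: genes present in every species (set intersection)
    pan:  genes present in at least one species (set union)
    """
    sets = list(gene_sets.values())
    if not sets:
        return set(), set()
    core = set.intersection(*sets)
    pan = set.union(*sets)
    return core, pan


# Hypothetical toy data: three species with a few gene names each.
genomes = {
    "species_A": {"rbcL", "psbA", "matK"},
    "species_B": {"rbcL", "psbA", "ndhF"},
    "species_C": {"rbcL", "psbA", "matK", "accD"},
}

core, pan = core_and_pan(genomes)
print(sorted(core))
print(sorted(pan))
```

In practice the hard part is deciding when two genes from different genomes count as "the same" gene, which is what the two methodologies compared in the abstract address; the set operations themselves are the easy final step.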

    Integrative omics analysis of Pseudomonas aeruginosa virus PA5oct highlights the molecular complexity of jumbo phages

    Pseudomonas virus vB_PaeM_PA5oct is proposed as a model jumbo bacteriophage to investigate phage-bacteria interactions and is a candidate for phage therapy applications. Combining hybrid sequencing, RNA-Seq and mass spectrometry allowed us to accurately annotate its 286,783 bp genome with 461 coding regions including four non-coding RNAs (ncRNAs) and 93 virion-associated proteins. PA5oct relies on the host RNA polymerase for the infection cycle and RNA-Seq revealed a gradual take-over of the total cell transcriptome from 21% in early infection to 93% in late infection. PA5oct is not organized into strictly contiguous regions of temporal transcription, but some genomic regions transcribed in early, middle and late phases of infection can be discriminated. Interestingly, we observe regions showing limited transcription activity throughout the infection cycle. We show that PA5oct upregulates specific bacterial operons during infection including operons pncA-pncB1-nadE involved in NAD biosynthesis, psl for exopolysaccharide biosynthesis and nap for periplasmic nitrate reductase production. We also observe a downregulation of T4P gene products suggesting mechanisms of superinfection exclusion. We used the proteome of PA5oct to position our isolate amongst other phages using a gene-sharing network. This integrative omics study illustrates the molecular diversity of jumbo viruses and raises new questions towards cellular regulation and phage-encoded hijacking mechanisms

    Bioinformatics tools @ NBBNet: online infrastructure for the management and analysis of biological data

    The use of informatics tools for the management and analysis of nucleic acid and protein sequences has improved the throughput of wet-lab research in turning biological data into functional biological information. The field of computational biological information management and analysis is generally known as bioinformatics. We discuss some tools and processes which have been developed or integrated into a data management and information presentation pipeline by the Malaysian National Biotechnology and Bioinformatics Network. Central to this is the Bioinformatics Tools @ NBBnet online infrastructure system, which utilizes grid computing technology. In addition, we discuss the deployment of niche databases and database shells for research applying specific datasets, such as a particular protein function, protein family or genome.

    A new reference genome assembly for the microcrustacean Daphnia pulex

    Comparing genomes of closely related genotypes from populations with distinct demographic histories can help reveal the impact of effective population size on genome evolution. For this purpose, we present a high quality genome assembly of Daphnia pulex (PA42), and compare this with the first sequenced genome of this species (TCO), which was derived from an isolate from a population with >90% reduction in nucleotide diversity. PA42 has numerous similarities to TCO at the gene level, with an average amino acid sequence identity of 98.8% and >60% of orthologous proteins identical. Nonetheless, there is a highly elevated number of genes in the TCO genome annotation, with ~7000 excess genes appearing to be false positives. This view is supported by the high GC content, lack of introns, and short length of these suspicious gene annotations. Consistent with the view that reduced effective population size can facilitate the accumulation of slightly deleterious genomic features, we observe more proliferation of transposable elements (TEs) and a higher frequency of gained introns in the TCO genome.

    ProtSweep, 2Dsweep and DomainSweep: protein analysis suite at DKFZ

    The wealth of transcript information that has been made publicly available in recent years has led to large pools of individual web sites offering access to bioinformatics software. However, finding out which services exist, what they can or cannot do, how to use them and how to feed results from one service to the next one in the right format can be very time and resource consuming, especially for non-experts