Coordinated RNA-Seq and peptidomics identify neuropeptides and G-protein coupled receptors (GPCRs) in the large pine weevil Hylobius abietis, a major forestry pest
Hylobius abietis (Linnaeus), or large pine weevil (Coleoptera, Curculionidae), is a pest of European coniferous forests. In order to gain understanding of the functional physiology of this species, we have assembled a de novo transcriptome of H. abietis from sequence data obtained by Next Generation Sequencing. In particular, we have identified genes encoding neuropeptides, peptide hormones and their putative G-protein coupled receptors (GPCRs) to gain insights into neuropeptide-modulated processes. The transcriptome was assembled de novo from pooled paired-end sequence reads obtained from RNA from whole adults, gut and central nervous system tissue samples. Data analysis was performed on the transcripts obtained from the assembly, including annotation, gene ontology and functional assignment as well as transcriptome completeness assessment and KEGG pathway analysis. Pipelines were created using bioinformatics tools and techniques for prediction and identification of neuropeptides and neuropeptide receptors. Peptidomic analysis was also carried out using a combination of MALDI-TOF and Q-Exactive Orbitrap mass spectrometry to confirm the identified neuropeptides. 41 putative neuropeptide families were identified in H. abietis, including Adipokinetic hormone (AKH), CAPA and DH31. Neuropeptide F, which has not yet been identified in the model beetle T. castaneum, was identified. Additionally, 24 putative neuropeptide and 9 leucine-rich repeat containing G protein coupled receptor-encoding transcripts were determined using both alignment and non-alignment methods. This information, submitted to the NCBI sequence read archive repository (SRA accession: SRP133355), can now be used to inform understanding of neuropeptide-modulated physiology and behaviour in H. abietis, and to develop specific neuropeptide-based tools for H. abietis control
SpBase: the sea urchin genome database and web site
SpBase is a system of databases focused on the genomic information from sea urchins and related echinoderms. It is exposed to the public through a web site served with open source software (http://spbase.org/). The enterprise was undertaken to provide an easily used collection of information to directly support experimental work on these useful research models in cell and developmental biology. The information served from the databases emerges from the draft genomic sequence of the purple sea urchin, Strongylocentrotus purpuratus and includes sequence data and genomic resource descriptions for other members of the echinoderm clade which in total span 540 million years of evolutionary time. This version of the system contains two assemblies of the purple sea urchin genome, associated expressed sequences, gene annotations and accessory resources. Search mechanisms for the sequences and the gene annotations are provided. Because the system is maintained along with the Sea Urchin Genome resource, a database of sequenced clones is also provided
Gene3D: comprehensive structural and functional annotation of genomes
Gene3D provides comprehensive structural and functional annotation of most available protein sequences, including the UniProt, RefSeq and Integr8 resources. The main structural annotation is generated through scanning these sequences against the CATH structural domain database profile-HMM library. CATH is a database of manually derived PDB-based structural domains, placed within a hierarchy reflecting topology, homology and conservation and is able to infer more ancient and divergent homology relationships than sequence-based approaches. This data is supplemented with Pfam-A, other non-domain structural predictions (i.e. coiled coils) and experimental data from UniProt. In order to enhance the investigations possible with this data, we have also incorporated a variety of protein annotation resources, including protein–protein interaction data, GO functional assignments, KEGG pathways, FUNCAT functional descriptions and links to microarray expression data. All of this data can be accessed through a newly re-designed website that has a focus on flexibility and clarity, with searches that can be restricted to a single genome or across the entire sequence database. Currently Gene3D contains over 3.5 million domain assignments for nearly 5 million proteins including 527 completed genomes. This is available at: http://gene3d.biochem.ucl.ac.uk
Finding the Core-Genes of Chloroplasts
Due to the recent evolution of sequencing techniques, the number of available
genomes is rising steadily, making large-scale genomic comparisons between
sets of closely related species possible. An interesting question is which
genes, and hence which functions, are common to a collection of species or,
conversely, what is specific to a given species when compared with others
belonging to the same genus, family, etc. Investigating this problem means
finding both the core and pan genomes of a collection of species, i.e., the
genes common to all the species vs. the set of all genes in all species under
consideration. However, obtaining trustworthy core and pan genomes is not an
easy task: it requires a large amount of computation and a rigorous
methodology. Surprisingly, as far as we know, the methodology for finding core
and pan genomes has not been investigated in depth. This research work tries
to fill this gap by focusing only on chloroplast genomes, whose reasonable
sizes allow a deep study. To achieve this goal, a collection of 99 chloroplasts
is considered in this article. Two methodologies have been investigated, based
respectively on sequence similarities and on gene names taken from annotation
tools. The obtained results will finally be evaluated in terms of biological
relevance
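The core/pan distinction in this abstract can be illustrated with plain set operations. The following is only a minimal sketch, assuming each genome has already been reduced to a set of gene names (the second of the two approaches mentioned); the function name `core_and_pan` and the three toy species with their gene lists are hypothetical and are not taken from the article.

```python
def core_and_pan(genomes):
    """Return (core, pan) for a mapping of species name -> iterable of gene names.

    core: genes present in every species (set intersection)
    pan:  genes present in at least one species (set union)
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set(), set()
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


# Hypothetical toy input: illustrative gene lists, not real annotation data.
genomes = {
    "species_A": {"rbcL", "psbA", "atpB", "matK"},
    "species_B": {"rbcL", "psbA", "atpB", "ndhF"},
    "species_C": {"rbcL", "psbA", "ycf1"},
}

core, pan = core_and_pan(genomes)
print(sorted(core))  # genes shared by all three toy species
print(len(pan))      # number of distinct genes across the toy species
```

In practice the difficult part is deciding when two genes in different genomes count as "the same", which is why the abstract contrasts name-based and similarity-based approaches; the sketch sidesteps that by trusting the names.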
Clades of huge phages from across Earth's ecosystems.
Bacteriophages typically have small genomes and depend on their bacterial hosts for replication. Here we sequenced DNA from diverse ecosystems and found hundreds of phage genomes with lengths of more than 200 kilobases (kb), including a genome of 735 kb, which is, to our knowledge, the largest phage genome to be described to date. Thirty-five genomes were manually curated to completion (circular and no gaps). Expanded genetic repertoires include diverse and previously undescribed CRISPR-Cas systems, transfer RNAs (tRNAs), tRNA synthetases, tRNA-modification enzymes, translation-initiation and elongation factors, and ribosomal proteins. The CRISPR-Cas systems of phages have the capacity to silence host transcription factors and translational genes, potentially as part of a larger interaction network that intercepts translation to redirect biosynthesis to phage-encoded functions. In addition, some phages may repurpose bacterial CRISPR-Cas systems to eliminate competing phages. We phylogenetically define the major clades of huge phages from human and other animal microbiomes, as well as from oceans, lakes, sediments, soils and the built environment. We conclude that the large gene inventories of huge phages reflect a conserved biological strategy, and that the phages are distributed across a broad bacterial host range and across Earth's ecosystems
Integrative omics analysis of Pseudomonas aeruginosa virus PA5oct highlights the molecular complexity of jumbo phages
Pseudomonas virus vB_PaeM_PA5oct is proposed as a model jumbo bacteriophage to investigate phage-bacteria interactions and is a candidate for phage therapy applications. Combining hybrid sequencing, RNA-Seq and mass spectrometry allowed us to accurately annotate its 286,783 bp genome with 461 coding regions including four non-coding RNAs (ncRNAs) and 93 virion-associated proteins. PA5oct relies on the host RNA polymerase for the infection cycle, and RNA-Seq revealed a gradual take-over of the total cell transcriptome from 21% in early infection to 93% in late infection. PA5oct is not organized into strictly contiguous regions of temporal transcription, but some genomic regions transcribed in early, middle and late phases of infection can be discriminated. Interestingly, we observe regions showing limited transcription activity throughout the infection cycle. We show that PA5oct upregulates specific bacterial operons during infection, including operons pncA-pncB1-nadE involved in NAD biosynthesis, psl for exopolysaccharide biosynthesis and nap for periplasmic nitrate reductase production. We also observe a downregulation of T4P gene products, suggesting mechanisms of superinfection exclusion. We used the proteome of PA5oct to position our isolate amongst other phages using a gene-sharing network. This integrative omics study illustrates the molecular diversity of jumbo viruses and raises new questions regarding cellular regulation and phage-encoded hijacking mechanisms
Bioinformatics tools @ NBBNet: online infrastructure for the management and analysis of biological data
The use of informatics tools for the management and analysis of nucleic acid and protein sequences has improved the throughput of wet-lab research in turning biological data into functional biological information. The field of computational biological information management and analysis is generally known as bioinformatics. We discuss some tools and processes which have been developed or integrated into a data management and information presentation pipeline by the Malaysian National Biotechnology and Bioinformatics Network. Central to this is the Bioinformatics Tools @ NBBnet online infrastructure system. This infrastructure system utilizes grid computing technology. In addition, the deployment of niche databases and database shells for research applying specific datasets, such as a particular protein function, protein family or genome, is discussed
A new reference genome assembly for the microcrustacean Daphnia pulex
Comparing genomes of closely related genotypes from populations with distinct demographic histories can help reveal the impact of effective population size on genome evolution. For this purpose, we present a high quality genome assembly of Daphnia pulex (PA42), and compare this with the first sequenced genome of this species (TCO), which was derived from an isolate from a population with >90% reduction in nucleotide diversity. PA42 has numerous similarities to TCO at the gene level, with an average amino acid sequence identity of 98.8% and >60% of orthologous proteins identical. Nonetheless, there is a highly elevated number of genes in the TCO genome annotation, with ~7000 excess genes appearing to be false positives. This view is supported by the high GC content, lack of introns, and short length of these suspicious gene annotations. Consistent with the view that reduced effective population size can facilitate the accumulation of slightly deleterious genomic features, we observe more proliferation of transposable elements (TEs) and a higher frequency of gained introns in the TCO genome
ProtSweep, 2Dsweep and DomainSweep: protein analysis suite at DKFZ
The wealth of transcript information that has been made publicly available in recent years has led to large pools of individual web sites offering access to bioinformatics software. However, finding out which services exist, what they can or cannot do, how to use them and how to feed results from one service to the next one in the right format can be very time and resource consuming, especially for non-experts