53 research outputs found
Development of a suite of bioinformatics tools for the analysis and prediction of membrane protein structure
This thesis describes the development of a novel approach for prediction of the three-dimensional structure of transmembrane regions of membrane proteins directly from amino acid sequence and basic transmembrane region topology.
The development rationale employed involved a knowledge-based approach. Based on determined membrane protein structures, 20x20 association matrices were generated to summarise the distance associations between amino acid side chains on different alpha helical transmembrane regions of membrane proteins. Using these association matrices, combined with a knowledge-based scale for propensity for residue orientation in transmembrane segments (kPROT) (Pilpel et al., 1999), the software predicts the optimal orientations and associations of transmembrane regions and generates a 3D structural model of a given membrane protein, based on the amino acid sequence composition of its transmembrane regions. During the development, several structural and biostatistical analyses of determined membrane protein structures were undertaken with the aim of ensuring a consistent and reliable association matrix upon which to base the predictions.
Evaluation of the model structures obtained for the protein sequences of a dataset of 17 membrane proteins of determined structure, based on cross-validated leave-one-out testing, revealed generally high accuracy of prediction, with over 80% of associations between transmembrane regions being correctly predicted. These results provide a promising basis for future development and refinement of the algorithm, and to this end, work is underway using evolutionary computing approaches. As it stands, the approach gives scope for significant immediate benefit to researchers as a valuable starting point in the prediction of structure for membrane proteins of hitherto unknown structure.
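To illustrate the idea of a 20x20 association matrix described above, the following is a minimal sketch, not the thesis software. It assumes each residue is represented by a one-letter code and a single side-chain coordinate, and it uses a hypothetical 8 Å distance cutoff; neither the cutoff nor the function name `association_matrix` comes from the thesis.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def association_matrix(helix_a, helix_b, cutoff=8.0):
    """Count residue-type pairs on two different helices that lie within
    `cutoff` angstroms of each other (cutoff is an illustrative assumption).

    Each helix is a list of (one_letter_code, (x, y, z)) tuples.
    Returns a symmetric 20x20 list of lists of counts.
    """
    counts = [[0] * 20 for _ in range(20)]
    for (aa1, p1), (aa2, p2) in product(helix_a, helix_b):
        dist = sum((a - b) ** 2 for a, b in zip(p1, p2)) ** 0.5
        if dist <= cutoff:
            i, j = INDEX[aa1], INDEX[aa2]
            counts[i][j] += 1
            if i != j:
                # Mirror the count so the matrix stays symmetric.
                counts[j][i] += 1
    return counts


# Toy coordinates, invented for illustration only.
helix_1 = [("L", (0.0, 0.0, 0.0)), ("A", (0.0, 0.0, 20.0))]
helix_2 = [("F", (5.0, 0.0, 0.0)), ("L", (0.0, 3.0, 0.0))]
matrix = association_matrix(helix_1, helix_2)
```

In a real setting the counts would be accumulated over many determined structures and normalised before being used for scoring; how the thesis does this is not stated in the abstract.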
The genome sequence of Pseudoplusia includens single nucleopolyhedrovirus and an analysis of p26 gene evolution in the baculoviruses
Background: Pseudoplusia includens single nucleopolyhedrovirus (PsinSNPV-IE) is a baculovirus recently identified in our laboratory, with high pathogenicity to the soybean looper, Chrysodeixis includens (Lepidoptera: Noctuidae) (Walker, 1858). In Brazil, the C. includens caterpillar is an emerging pest and has caused significant losses in soybean and cotton crops. The PsinSNPV genome was determined and the phylogeny of the p26 gene within the family Baculoviridae was investigated. Results: The complete genome of PsinSNPV was sequenced (Roche 454 GS FLX – Titanium platform), annotated and compared with other Alphabaculoviruses, displaying a genome apparently different from other baculoviruses so far sequenced. The circular double stranded DNA genome is 139,132 bp in length, with a GC content of 39.3 %, and contains 141 open reading frames (ORFs). PsinSNPV possesses the 37 conserved baculovirus core genes, 102 genes found in other baculoviruses and 2 unique ORFs. Two baculovirus repeat ORFs (bro) homologs, bro-a (Psin33) and bro-b (Psin69), were identified and compared with Chrysodeixis chalcites nucleopolyhedrovirus (ChchNPV) and Trichoplusia ni single nucleopolyhedrovirus (TnSNPV) bro genes and showed high similarity, suggesting that these genes may be derived from an ancestor common to these viruses. The homologous repeats (hrs) are absent from the PsinSNPV genome, which is also the case in ChchNPV and TnSNPV. Two p26 gene homologs (p26a and p26b) were found in the PsinSNPV genome. P26 is thought to be required for optimal virion occlusion in the occlusion bodies (OBs), but its function is not well characterized. The P26 phylogenetic tree suggests that this gene was obtained from three independent acquisition events within the Baculoviridae family. The presence of a signal peptide only in the PsinSNPV p26a/ORF-20 homolog indicates distinct function between the two P26 proteins. Conclusions: PsinSNPV has a genomic sequence apparently different from other baculoviruses sequenced so far. The complete genome sequence of PsinSNPV will provide a valuable resource, contributing to studies on its molecular biology and functional genomics, and will promote the development of this virus as an effective bioinsecticide.
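As a small illustration of the genome summary statistics quoted above, the sketch below shows how a GC percentage can be computed from a sequence string and checks that the reported ORF categories add up to the reported total. The function name `gc_content` and the toy sequence are assumptions for illustration; this is not the annotation pipeline used in the study.

```python
def gc_content(sequence):
    """Return the percentage of G and C bases in a DNA sequence string."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Toy sequence, invented for illustration (not PsinSNPV data).
toy_gc = gc_content("ATGC")

# ORF categories as reported in the abstract: core, shared, unique.
orf_total = 37 + 102 + 2
```

Here `orf_total` evaluates to 141, matching the number of ORFs stated in the abstract.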
Analysis of the Transcriptome in Aspergillus tamarii During Enzymatic Degradation of Sugarcane Bagasse
The production of bioethanol from non-food agricultural residues represents an alternative energy source to fossil fuels for incorporation into the world's economy. Within the context of bioconversion of plant biomass into renewable energy using improved enzymatic cocktails, Illumina RNA-seq transcriptome profiling was conducted on a strain of Aspergillus tamarii, efficient in biomass polysaccharide degradation, in order to identify genes encoding proteins involved in plant biomass saccharification. Enzyme production and gene expression were compared following growth in liquid and semi-solid culture with steam-exploded sugarcane bagasse (SB) (1% w/v) and glucose (1% w/v) employed as contrasting sole carbon sources. Enzyme production following growth in liquid minimum medium supplemented with SB resulted in 0.626 and 0.711 UI.mL−1 xylanases after 24 and 48 h incubation, respectively. Transcriptome profiling revealed expression of over 7120 genes, with groups of genes modulated according to solid or semi-solid culture, as well as according to carbon source. Gene ontology analysis of genes expressed following SB hydrolysis revealed enrichment in xyloglucan metabolic process and xylan, pectin and glucan catabolic process, indicating up-regulation of genes involved in xylanase secretion. According to carbohydrate-active enzyme (CAZy) classification, 209 CAZyme-encoding genes were identified with significant differential expression on liquid or semi-solid SB, in comparison to equivalent growth on glucose as carbon source. Up-regulated CAZyme-encoding genes related to cellulases (CelA, CelB, CelC, CelD) and hemicellulases (XynG1, XynG2, XynF1, XylA, AxeA, arabinofuranosidase) showed up to a 10-fold log2FoldChange in expression levels. Five genes from the AA9 (GH61) family, related to lytic polysaccharide monooxygenase (LPMO), were also identified with significant expression up-regulation.
The transcription factor gene XlnR, involved in induction of hemicellulases, showed up-regulation on liquid and semi-solid SB culture. Similarly, the gene ClrA, responsible for regulation of cellulases, showed increased expression on liquid SB culture. Over 150 potential transporter genes were also identified with increased expression on liquid and semi-solid SB culture. This first comprehensive analysis of the transcriptome of A. tamarii contributes to our understanding of genes and regulatory systems involved in cellulose and hemicellulose degradation in this fungus, offering potential for application in improved enzymatic cocktail development for plant biomass degradation in biorefinery applications.
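For readers unfamiliar with the log2FoldChange scale mentioned above, the following sketch shows the basic arithmetic. It assumes a simple ratio of normalised expression values with a pseudocount of 1 to avoid division by zero; differential expression tools typically estimate this quantity with statistical models, so this is an illustration of the scale rather than the method used in the study. The function name and the example values are hypothetical.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """Naive log2 ratio of two expression values.

    The pseudocount is an illustrative assumption, not taken from the study.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


# Invented values: 1023 normalised counts on bagasse vs 0 on glucose.
example_lfc = log2_fold_change(1023, 0)

# A log2 fold change of 10 corresponds to a 2**10 = 1024-fold linear change.
linear_fold = 2 ** 10
```

On this scale, each unit is a doubling, so a value of 10 indicates expression roughly a thousand times higher than in the reference condition.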
Quantitative prey species detection in predator guts across multiple trophic levels by mapping unassembled shotgun reads
Quantifying species trophic interaction strengths is crucial for understanding community dynamics and has significant implications for pest management and species conservation. DNA-based methods to identify species interactions have revolutionized these efforts, but a significant limitation is the poor ability to quantify the strength of trophic interactions, that is, the biomass or number of prey consumed. We present an improved pipeline, called Lazaro, to map unassembled shotgun reads to a comprehensive arthropod mitogenome database and show that the number of prey reads detected is quantitatively predicted from the prey biomass consumed, even for indirect predation. Two feeding bioassays were performed: starved coccinellid larvae consuming different numbers of aphids (Prey Quantity bioassay), and starved coccinellid larvae consuming a chrysopid larva that had consumed aphids (Direct and Indirect Predation bioassay). Prey taxonomic assignment against a mitochondrial genome database had high accuracy (99.8% positive predictive value) and the number of prey reads was directly related to the number of prey consumed and inversely related to the elapsed time since consumption with high significance (r² = .932, p = 4.92E-6). Aphids were detected up to 6 h after direct predation plus 3 h after indirect predation (9 h in total) and detection was related to the predator-specific decay rates. Lazaro enabled quantitative predictions of prey consumption across multiple trophic levels with high taxonomic resolution while eliminating all false positives, except for a few confirmed contaminants, and may be valuable for characterizing prey consumed by field-sampled predators. Moreover, Lazaro is readily applicable for species diversity determination from any degraded environmental DNA.
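The positive predictive value quoted above is the fraction of reads assigned to a prey taxon that were assigned correctly. A minimal sketch of that calculation follows; the function name and the read counts are hypothetical and are not taken from the Lazaro study.

```python
def positive_predictive_value(true_positives, false_positives):
    """Return TP / (TP + FP) for a set of taxonomic read assignments."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no positive assignments to evaluate")
    return true_positives / total


# Invented counts for illustration: 998 correct and 2 incorrect assignments.
ppv = positive_predictive_value(998, 2)
```

With these invented counts the result is 0.998, that is, 99.8%.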
Release of Insecta mitogenome sequences in GenBank since 1995 (2017 covers only the first three months of the year).
<p>This information was obtained by searching <i>insecta[organism] AND mitochondrion[ti] AND genome[ti]</i> in the Nucleotide database and filtering the results by release date by using the "Release date" search criterion in the left navigation panel.</p>
Coverage for <i>C</i>. <i>sanguinea</i> mitogenomes at the indicated consensus positions for the four methods.
Identity (%) among the mitogenomes generated by the different methods.
Methods used to obtain and annotate the coccinellid mitogenomes.
Comparison of the mitogenome sequences of <i>H</i>. <i>axyridis</i> obtained in this work with <i>H</i>. <i>axyridis</i> (KR108208) [38] using standard parameters in MAFFT v7.017 [34].
<p>The KR108208 has 16,387 bp, 13 PCGs, 20 tRNAs (missing <i>trn</i>I and <i>trn</i>Q) and 2 rRNAs. The mitogenomes sequenced here are displayed with highest similarity on top. From the top to the bottom are: the mitogenome obtained by method 2; KR108208; followed by methods 4, 1 and 3, respectively. The annotated mitogenomes are in color, with arrows representing the transcriptional sense of the gene, and small arrows representing tRNA genes. Grey bars represent similar nucleotides and vertical black bars represent dissimilar nucleotides in the target sequence compared to the consensus sequence, and horizontal black lines represent gaps in the target sequence.</p>
- …