876 research outputs found
RSLpred: an integrative system for predicting subcellular localization of rice proteins combining compositional and evolutionary information
The attainment of a complete map-based sequence for rice (Oryza sativa) is clearly a major milestone for the research community. Identifying the localization of encoded proteins is the key to understanding their functional characteristics and facilitating their purification. Our proposed method, RSLpred, is an effort in this direction for genome-scale subcellular prediction of encoded rice proteins. First, support vector machine (SVM)-based modules were developed using traditional amino acid, dipeptide (i+1) and four-part amino acid composition, achieving overall accuracies of 81.43, 80.88 and 81.10%, respectively. Second, a similarity search-based module was developed using the Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST) and achieved 68.35% accuracy. Another module, developed using evolutionary information of a protein sequence extracted from the position-specific scoring matrix (PSSM), achieved an accuracy of 87.10%. In this study, a large number of modules were developed using various encoding schemes such as higher-order dipeptide composition, N- and C-terminal, split amino acid composition and hybrid information. To benchmark RSLpred, it was tested on an independent set of rice proteins, where it outperformed widely used prediction methods such as TargetP, Wolf-PSORT, PA-SUB, Plant-Ploc and ESLpred. To assist the plant research community, an online web tool, 'RSLpred', has been developed for subcellular prediction of query rice proteins, which is freely accessible at http://www.imtech.res.in/raghava/rslpred
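The composition-based encodings named in the abstract above can be illustrated with a minimal sketch. This is not the RSLpred implementation: the function names, the handling of non-standard residues and the percentage scaling are assumptions made for illustration. It shows how a protein sequence could be turned into a 20-value amino acid composition vector and a 400-value dipeptide composition vector (with `gap=1` corresponding to the i+1 dipeptide, and larger gaps to higher-order dipeptides) before being passed to a classifier such as an SVM.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order that defines the feature layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return 20 percentages, one per standard residue (assumed encoding)."""
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    counts = Counter(residues)
    return [100.0 * counts[aa] / len(residues) for aa in AMINO_ACIDS]


def dipeptide_composition(seq, gap=1):
    """Return 400 percentages for residue pairs (i, i+gap).

    Pairs containing a non-standard residue are skipped (an assumption).
    """
    seq = seq.upper()
    pairs = [seq[i] + seq[i + gap] for i in range(len(seq) - gap)]
    pairs = [p for p in pairs if p[0] in AMINO_ACIDS and p[1] in AMINO_ACIDS]
    if not pairs:
        raise ValueError("sequence too short for the requested gap")
    counts = Counter(pairs)
    return [100.0 * counts[a + b] / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]
```

A split or N-/C-terminal composition could be obtained under the same assumptions by applying `amino_acid_composition` to slices of the sequence and concatenating the resulting vectors.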
Computational Approaches To Improving The Reconstruction Of Metabolic Pathway
Metabolic pathway reconstruction is the essence of systems biology where in silico modeling
and prediction of the cell's function is based on the interaction of the cell's components
represented as a network of reactions. The reconstructed model and the associated database
of information about the organism's genes and their functional roles facilitate a variety of
analysis and simulation techniques that can enrich our understanding. However, there are
unresolved issues for genome-scale metabolic network reconstruction, such as our incomplete
knowledge of the cell's networks for metabolism, transport, and regulation; the completeness,
accuracy, and specificity of the annotation of genomes; and our ability to fully utilise the
available information from -omics (genomics, proteomics, metabolomics, etc) for the reconstruction
of the networks. These issues result in incomplete metabolic models, which limit
our ability to perform analysis of and to make predictions about the cell that are based on
the network model.
This dissertation discusses the state-of-the-art of metabolic pathway reconstruction and highlights
the outstanding issues. In particular, we consider a number of case studies using
genomes of fungi relevant to industrial applications, such as biofuels, to demonstrate the
performance of existing techniques and illustrate the issues. Our case studies focus on the
cell's central metabolism, and the utilisation and transport of sugars as a carbon source,
since these are essential concerns for industrial applications.
A significant deficiency in the existing state-of-the-art for the reconstruction of metabolic
pathways is the ability to associate genes and proteins to the transport reactions that move
specific compounds across the membranes of the cell. The dissertation reviews the
state-of-the-art of prediction methods for transmembrane transport proteins by developing a scheme
to describe and compare existing methods, and applying the existing techniques to the
fungal genome of A. niger CBS 513.88. This reveals the split between those methods that
use the Transporter Classification (TC) as their target for prediction, and those that use
the type of chemical substrates being transported as their target. Despite this difficulty in
comparing approaches, it is clear that the state-of-the-art cannot predict specific substrates
being transported, and hence cannot associate genes and proteins to the transport reactions.
The dissertation presents TransATH, which stands for Transporters via ATH (Annotation
Transfer by Homology), a system which automates Saier's protocol and includes the computation
of subcellular localization and improves the computation of transmembrane segments.
The choice of thresholds for the parameters of TransATH is investigated to determine optimal
performance as defined by a gold standard set of transporters and non-transporters from
S. cerevisiae. The dissertation demonstrates TransATH on the fungal genome of A. niger
CBS 513.88 and evaluates the correctness of TransATH using the curated information in
AspGD (the Aspergillus Database). A website for TransATH is available for use
Exploring plant tolerance to biotic and abiotic stresses
Plants are exposed to many stress factors, such as drought, high salinity or pathogens, which reduce the yield of cultivated plants or affect the quality of the harvested products. Arabidopsis thaliana was used as a model plant to study the responses of plants to different sources of stress. With Agrobacterium T-DNA mediated promoter tagging, a novel di-/tripeptide transporter gene, AtPTR3, was identified as a wound-induced gene. This gene was found to be induced by mechanical wounding, high salt concentrations, bacterial infection and senescence, and also in response to several plant hormones and signalling compounds, such as salicylic acid, jasmonic acid, ethylene and abscisic acid. Atptr3 mutants of two Arabidopsis ecotypes, C24 and Col-0, were impaired in germination on media containing a high salt concentration, which indicates that AtPTR3 is involved in seed germination under salt stress. Wounding caused local expression of the AtPTR3 gene, whereas inoculation with the plant pathogenic bacterium Erwinia carotovora subsp. carotovora caused both local and systemic expression of the gene. Atptr3 mutants showed increased susceptibility to infection caused by the bacterial phytopathogens E. carotovora and Pseudomonas syringae pv. tomato, and the P. syringae type III secretion system was shown to be involved in suppression of AtPTR3 expression in inoculated plants. Moreover, the Atptr3 mutation was found to reduce the expression of PR1, the marker gene for systemic acquired resistance, and the mutants accumulated reactive oxygen species (ROS) following treatment of the plants with ROS-generating substances. Overall, the results and observations suggest that AtPTR3 is a novel and versatile stress-responsive gene needed for defence reactions against many stresses. In a second part of the study, the yeast (Saccharomyces cerevisiae) trehalose-6-phosphate synthase gene (ScTPS1) was utilized to improve the drought tolerance of Arabidopsis. 
This gene codes for the first enzyme in the trehalose biosynthesis pathway of yeast, and its expression in plants leads to improved drought tolerance but also to growth aberrations. In this study, the ScTps1 protein was expressed in Arabidopsis using constructs containing a chloroplast-targeting transit peptide sequence that facilitated the import of ScTps1 into the chloroplast. The drought tolerance and growth phenotypes of Arabidopsis transgenics transformed with ScTPS1 with or without the transit peptide were characterized. The plants with cytosolic localization of the ScTps1 protein showed an aberrant root phenotype, but the plants with the chloroplast-targeted ScTps1 protein showed no aberration in root morphology. Even though both transgenic lines showed enhanced drought tolerance, the relative water content of the lines was found to be similar to that of the wild-type control. Moreover, both transgenic lines showed slightly better water-holding capacity, or reduced water loss over time, compared to wild-type plants. The overall results indicated that the growth aberrations caused by cytosolic localization of ScTps1 could be uncoupled from the enhanced drought tolerance in the transgenic plants when ScTps1 was targeted to the chloroplast
Genetics, biosynthesis, mode of action and resistance mechanisms of the circular bacteriocin garvicin ML
Bacteriocins are ribosomally synthesized antimicrobial peptides, produced by many lactic acid bacteria, which show high promise as antimicrobial agents for use in both food industry and for medical applications. In this work, we have studied the bacteriocin garvicin ML (GarML), which is a head-to-tail ligated circular bacteriocin that has a broad spectrum of activity and is active against a range of pathogenic bacteria. This class of bacteriocins is furthermore attracting interest due to their favourable characteristics for potential industrial use, i.e. high pH and thermal stability in addition to resistance to many proteases. However, there are many aspects of circular bacteriocin biology that are still not known, and in this work, we have attempted to shed light on the processes which govern the biosynthesis, mode of action and resistance to this bacteriocin.
Circular bacteriocins are synthesized with a leader sequence, and maturation of these peptides is thought to occur through three steps: cleavage of the leader sequence, head-to-tail circularization and export out of the cell. However, neither the mechanisms involved nor the enzymes responsible have yet been characterized. Furthermore, the sequence of events and the potential coupling of these processes are unknown. In papers I and II we have sequenced the producer strain of GarML, which allowed identification and characterization of the gene cluster involved in biosynthesis of and immunity to GarML. The gene cluster was shown to share several traits, both in genetic organization and in the putative functions of the encoded proteins, with other circular bacteriocin gene clusters. Functional analysis combined with mass spectrometry of deletion mutants of the GarML operons revealed new insights into the biosynthesis of GarML, which may thus apply to circular bacteriocins in general. Firstly, we have provided evidence for leader sequence cleavage occurring without subsequent circularization in two knock-out mutants (ΔgarBCDE and garX∷pCG47), which demonstrates not only that these processes are independent, but also that leader sequence cleavage precedes circularization in time (paper II). Furthermore, the evidence suggests that leader sequence cleavage is not performed by any of the proteins encoded by the GarML gene cluster, i.e. garX, garBCDE or garFGH, because we still observe cleavage in their absence (paper II). Two of the operons, namely garX and garBCDE, were implicated in biosynthesis of GarML, specifically in the circularization reaction, as well as in providing immunity towards GarML, while the third operon (garFGH) was demonstrated to be non-essential.
For circular bacteriocins it has been and remains a controversial issue whether these peptides require a target receptor or docking molecule like the class Ia lantibiotics and IIa pediocin-like bacteriocins for antimicrobial activity, or whether the peptides interact unspecifically with the target cell membrane to create pores. A few circular bacteriocins have been demonstrated to act on liposomes and/or lipid bilayers, which may indicate that a target receptor is not required, at least at high bacteriocin concentrations. In paper III we however provide evidence for a maltose ABC transporter being implicated in sensitivity to GarML in L. lactis. The deletion of this complex led to 6-11-fold lowered sensitivity to GarML, whereas complementation restored high-level sensitivity to the bacteriocin. However, consistent with other circular bacteriocins, we observe receptor-independent killing at higher concentrations of GarML. These results therefore suggest that this class of bacteriocins may indeed require a specific interaction with a target receptor/mediator for antimicrobial activity at low concentrations.
Resistance mechanisms to bacteriocins, both developed and innate, are poorly understood for many classes of bacteriocins. Gaining insight into these processes is essential in order to be able to minimize resistance, which is an important prerequisite for the potential use of bacteriocins in many applications. In this work, we have demonstrated examples of both adaptive and inherent resistance to GarML. In paper III, we have shown that L. lactis can develop resistance to GarML by loss of the maltose ABC transporter, which occurs at relatively low frequencies (from 10^-7 to 10^-8) compared to the adaptive response to class Ia lantibiotics and class IIa pediocin-like bacteriocins. However, no resistance development occurs at high bacteriocin concentrations (>250 BU mL^-1), which indicates that killing is receptor-independent above this level (paper III). In paper IV, we have however provided evidence for an inherent resistance mechanism against GarML, which is conserved in a lineage of L. lactis ssp. cremoris strains. This mechanism appears to be specific for GarML, as it does not affect sensitivity towards other bacteriocins targeting lactococci, even including another circular bacteriocin (paper IV). Thus, we have evidence for a new, specific and inherent mechanism of resistance to GarML in this lineage of L. lactis ssp. cremoris strains, which contributes to the understanding of how dissemination of resistance factors leads to intraspecies variations in sensitivity to bacteriocins.
Bacteriocins are ribosomally synthesized antimicrobial peptides that are produced by, among others, many lactic acid bacteria, and that have great potential as antimicrobial compounds for use in the food industry and in medical applications. In this work we have studied the bacteriocin garvicin ML (GarML), a peptide with a circular peptide chain that has a broad spectrum of activity and is active against many pathogenic bacteria. 
This class of bacteriocins is considered interesting because its members have properties that make them well suited to potential industrial purposes, including high pH and temperature stability as well as resistance to a number of proteases. Nevertheless, several aspects of circular bacteriocins are not sufficiently understood, and in this work we have sought to investigate precisely the processes that determine the biosynthesis, mode of action and resistance mechanisms for this bacteriocin.
Circular bacteriocins are synthesized with a leader sequence, and maturation of the peptides is thought to comprise three steps: cleavage of the leader sequence, circularization by ligation of the N- and C-termini, and export out of the cell. However, the mechanisms involved and the enzymes responsible are not known. In addition, the order of these steps, and the possible couplings between them, are still unknown.
In papers I and II we sequenced the producer strain of GarML, which in turn allowed identification and characterization of the group of genes, consisting of four operons, that is involved in biosynthesis of and immunity to GarML. This group of genes was shown to have much in common, both in organization and in the putative functions of the proteins these genes encode, with the corresponding genes for other circular bacteriocins. Functional analysis combined with mass spectrometry gave new insight into the biosynthesis of GarML, which may thus also apply to circular bacteriocins in general. First and foremost, we have shown that cleavage of the leader sequence occurs without circularization in two knock-out mutants (ΔgarBCDE and garX∷pCG47), which demonstrates that these two processes are independent, and also that cleavage precedes circularization in time (paper II). Furthermore, the results show that cleavage of the leader sequence is not performed by any of the proteins encoded in the GarML operons, i.e. garX, garBCDE or garFGH, because cleavage is observed even in their absence (paper II). Two of the operons in the group, garX and garBCDE, were shown to be involved in the biosynthesis of GarML, specifically in the circularization reaction, and at the same time to confer immunity to GarML, while a third operon (garFGH) was shown to be non-essential.
As for circular bacteriocins, it is controversial whether these peptides need a target receptor or a docking molecule for antimicrobial activity, like the class Ia lantibiotics and class IIa pediocin-like bacteriocins, or whether they interact non-specifically with the cell membrane to form pores. In some cases circular bacteriocins have been shown to act on lipid bilayers and/or liposomes, which may indicate that a target molecule is not necessary, at least at high bacteriocin concentrations. In paper III, by contrast, we show that a maltose ABC transporter contributes to sensitivity to GarML in L. lactis. Deletion of this complex gave 6- to 11-fold lower sensitivity to GarML, while complementation restored high sensitivity to the bacteriocin. Nevertheless, receptor-independent killing was observed at very high bacteriocin concentrations. These results thus indicate that, at low bacteriocin concentrations, a specific interaction with a target molecule may be necessary for antimicrobial activity for this class of bacteriocins as well.
Resistance mechanisms against bacteriocins, both acquired and inherent, are not well understood for many classes of bacteriocins. Gaining insight into these processes is essential in order to minimize resistance development, which is a prerequisite for the potential exploitation of bacteriocins for various purposes. In this work we have shown examples of both acquired and inherent resistance to GarML. In paper III we showed that L. lactis can develop resistance to GarML through loss of the maltose ABC transporter complex, which occurs at a relatively low frequency (from 10^-7 to 10^-8) compared with acquired resistance to class Ia lantibiotics and class IIa pediocin-like bacteriocins. In addition, no resistance development was observed at high bacteriocin concentration (>250 BU mL^-1), which indicates that above this level killing is not receptor-mediated. In paper IV, by contrast, we demonstrated an inherent resistance mechanism against GarML that is conserved in a lineage of L. lactis ssp. cremoris. This mechanism appears to be specific for GarML, as it does not affect sensitivity to other bacteriocins that act on lactococci, including another circular bacteriocin. The results thus suggest that we have a new, specific and inherent resistance mechanism against GarML in this lineage of L. lactis ssp. cremoris strains, which contributes to the understanding of how the spread of resistance factors leads to variation in sensitivity to bacteriocins within species
Functional genomic analysis of Haemophilus influenzae and application to the study of competence and transformation.
During the progression of this study, hundreds of additional bacterial genomes were sequenced, techniques evolved, and novel approaches were developed to aid the field of functional and comparative genomics. Some of these techniques were used in this study to identify three novel competence-regulated operons in H. influenzae. The techniques included the use of advanced computer programs and algorithms to assist in predicting protein functions and to facilitate a comparative genomic analysis of H. influenzae with other species of the Pasteurellaceae family. Quantitative PCR was employed to examine the expression of putative transformation-related genes. Finally, PCR-mediated mutagenesis was used in a directed approach to generate mutations in the newly discovered competence-regulated operons to assess their involvement in uptake and transformation of exogenous DNA. The publication of the complete genomic sequence of Haemophilus influenzae Rd KW20 in 1995 was a truly monumental event in molecular biology. For the first time, all of the potential genes of an independent-living organism were known and awaiting functional characterization. This event required the development of fundamentally different methodologies to elucidate gene functions, with systematic global approaches becoming much more feasible. This study describes the development of a transposon-based mutagenesis strategy to facilitate a high-throughput functional analysis of the H. influenzae genome. Mutants created using this strategy were screened in a highly parallel assay to identify genes mediating transformation in this organism. Additionally, analysis of the transposon insertion sites generated during this study identified a previously unrecognized Tn5 insertion bias
Multicellular Phenotypic Studies of Single Gene Variants in Myxococcus xanthus
There are several systematic methods designed to link genes to cellular processes. These methods are derived from different hypotheses and are largely complementary to each other. This dissertation presents a systematic study of functional genetics and related phenotypes using quantitative methods. The first part of this dissertation reports the successful identification and characterization of 28 genes in the multicellular bacterium Myxococcus xanthus using three different methods: sequence homology, transcription activation and proteomics. The results from this research extended the list of M. xanthus genes involved in multicellularity, and expanded our knowledge regarding the possible molecular pathways underlying physiological and morphological changes.
Although the cellular function of some of the genes in the genome of an organism can be deduced from effects of mutation on phenotype, the disruption or deletion of most genes produces little or no discernible phenotypic impact. The reason for this may be redundancy or complementation, or it may be due to the limitations inherent in available assays. The second part of this dissertation will focus on a population genetics approach to the characterization of phenotype for a collection of mutant strains containing insertion mutations in each of the ~200 ABC transporter component genes in M. xanthus. More than 50% of those mutant strains exhibit at least one phenotypic characteristic that is different from the wild type, and an average of 6% of mutant strains have a gain-of-function phenotype. We also demonstrated that the morphological features used to measure phenotype are not entirely independent variables. These results indicate that a rigorous and quantitative phenotypic characterization will provide significantly more data to understand the phenotypic space of M. xanthus, and that a more rigorous definition of phenotype may help us establish a more accurate connection between genotype and phenotype
Analysis of the Arabidopsis NAC gene superfamily in plant development
There are a vast number of transcription factors that regulate plant growth and development. The NAC gene superfamily is one of the largest families of transcription factors in the plant kingdom. NAC gene expression profiles using Affymetrix ATH1 gene chips were obtained for different plant organs: heart embryo, mature embryo, leaf, root and flower. NAC gene expression profiles proved to be very complex, except for one NAC gene detected only in floral tissue, At1g61110. At1g61110 was shown to be specifically expressed in the anther tapetum of Arabidopsis; therefore, its name was changed to TAPNAC. TAPNAC became the focus of our studies. We identified a tapnac T-DNA knockout (KO) line, SALK_069450. A molecular phenotype was observed: several oligopeptide, sugar and metal transporters were differentially expressed. Coincidentally, a wheat NAC gene, named TaNAM-B1 for its high sequence similarity to ATNAM, TAPNAC and At3g15510, was found to be involved in nutrient remobilization. PHOSPHOLIPASE Dα1 (PLDα1) was also found to be down-regulated in the tapnac KO. PLDα1 is an enzyme that hydrolyzes phospholipids that are part of tapetal cell membranes and tapetal lipid bodies. Once these tapetal cell structures are disrupted, the secretion of the compounds that form part of the pollen coat (i.e. proteins, flavonoids and lipids) into the anther locule is facilitated. Promoter deletion analysis using a GUS reporter and later GUS immuno-localization confirmed the findings of Wellmer and others: TAPNAC is a tapetal-specific gene. The cis-regulatory sequence that enhances tapetal expression in the TAPNAC promoter was identified. The consensus motif TCGTGT increased tapetal expression of a GUS reporter gene only when flanked by the TAPNAC minimal promoter region (-217 bp to +51 bp). 
In summary, the TAPNAC transcription factor has been characterized, and the data indicate that it could play a role in nutrient remobilization from the tapetum to the pollen grains, particularly during late floral stages. Also, important information on tapetal specification cis-regulatory sequences was discovered. The consensus motif TCGTGT, present in the TAPNAC promoter, was shown to enhance tapetal expression of a GUS reporter gene
Quantitative and functional post-translational modification proteomics reveals that TREPH1 plays a role in plant thigmomorphogenesis
Plants can sense both intracellular and extracellular mechanical forces and
can respond through morphological changes. The signaling components responsible
for mechanotransduction of the touch response are largely unknown. Here, we
performed a high-throughput SILIA (stable isotope labeling in
Arabidopsis)-based quantitative phosphoproteomics analysis to profile changes
in protein phosphorylation resulting from 40 seconds of force stimulation in
Arabidopsis thaliana. Of the 24 touch-responsive phosphopeptides identified,
many were derived from kinases, phosphatases, cytoskeleton proteins, membrane
proteins and ion transporters. TOUCH-REGULATED PHOSPHOPROTEIN1 (TREPH1) and MAP
KINASE KINASE 2 (MKK2) and/or MKK1 became rapidly phosphorylated in
touch-stimulated plants. Both TREPH1 and MKK2 are required for touch-induced
delayed flowering, a major component of thigmomorphogenesis. The treph1-1 and
mkk2 mutants also exhibited defects in touch-inducible gene expression. A
non-phosphorylatable site-specific isoform of TREPH1 (S625A) failed to restore
touch-induced flowering delay of treph1-1, indicating the necessity of S625 for
TREPH1 function and providing evidence consistent with the possible functional
relevance of the touch-regulated TREPH1 phosphorylation. Bioinformatic analysis
and biochemical subcellular fractionation of TREPH1 protein indicate that it is
a soluble protein. Altogether, these findings identify new protein players in
Arabidopsis thigmomorphogenesis regulation, suggesting that protein
phosphorylation may play a critical role in plant force responses
N-gram analysis of 970 microbial organisms reveals presence of biological language models
Background: It has been suggested previously that genome and proteome sequences show characteristics typical of natural-language texts such as "signature-style" word usage indicative of authors or topics, and that the algorithms originally developed for natural language processing may therefore be applied to genome sequences to draw biologically relevant conclusions. Following this approach of 'biological language modeling', statistical n-gram analysis has been applied for comparative analysis of whole proteome sequences of 44 organisms. It has been shown that a few particular amino acid n-grams are found in abundance in one organism but occur very rarely in other organisms, thereby serving as genome signatures. At that time proteomes of only 44 organisms were available, thereby limiting the generalization of this hypothesis. Today nearly 1,000 genome sequences and corresponding translated sequences are available, making it feasible to test the existence of biological language models over the evolutionary tree.
Results: We studied whole proteome sequences of 970 microbial organisms using n-gram frequencies and cross-perplexity, employing the Biological Language Modeling Toolkit and the Patternix Revelio toolkit. Genus-specific signatures were observed even in a simple unigram distribution. By taking the statistical n-gram model of one organism as reference and computing the cross-perplexity of all other microbial proteomes with it, cross-perplexity was found to be predictive of branch distance in the phylogenetic tree. For example, a 4-gram model from the proteome of Shigella flexneri 2a, which belongs to the Gammaproteobacteria class, showed a self-perplexity of 15.34, while the cross-perplexity of other organisms was in the range of 15.59 to 29.5 and was proportional to their branching distance in the evolutionary tree from S. flexneri. The organisms of this genus, which happen to be pathotypes of E. coli, also have the closest perplexity values with E. coli.
Conclusion: Whole proteome sequences of microbial organisms have been shown to contain particular n-gram sequences in abundance in one organism but occurring very rarely in other organisms, thereby serving as proteome signatures. Further, it has also been shown that perplexity, a statistical measure of similarity of n-gram composition, can be used to predict evolutionary distance within a genus in the phylogenetic tree.
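The self- and cross-perplexity comparison described in the abstract above can be sketched as follows. This is a minimal illustration, not the Biological Language Modeling Toolkit: the function names, the add-one smoothing and the 20-letter vocabulary size are assumptions chosen to keep the example short. An n-gram model is trained on one proteome, then scored on the same proteome (self-perplexity) or on another one (cross-perplexity); a lower value means the scored sequences look more like the reference.

```python
import math
from collections import Counter


def train_ngram(sequences, n):
    """Count n-grams and their (n-1)-residue contexts over protein sequences."""
    ngrams, contexts = Counter(), Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            ngrams[gram] += 1
            contexts[gram[:-1]] += 1
    return ngrams, contexts


def perplexity(model, sequences, n, vocab_size=20):
    """Perplexity of `sequences` under `model`, with add-one smoothing (assumed)."""
    ngrams, contexts = model
    log_sum, count = 0.0, 0
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            gram = seq[i:i + n]
            # Smoothed conditional probability of the last residue given its context.
            prob = (ngrams[gram] + 1) / (contexts[gram[:-1]] + vocab_size)
            log_sum += math.log2(prob)
            count += 1
    if count == 0:
        raise ValueError("sequences are shorter than n")
    return 2 ** (-log_sum / count)
```

Scoring the training sequences themselves gives the self-perplexity; scoring another organism's proteome with the same model gives the cross-perplexity that the abstract relates to branch distance.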
Genomic and transcriptomic analyses of Microbotryum lychnidis-dioicae provide insights into the biology of a fascinating fungal phytopathogen.
This study made use of the Silene latifolia/Microbotryum lychnidis-dioicae phytopathogen system as the focal system to establish the first reference genome for Microbotryum violaceum sensu lato. In silico analysis was performed on the genome assembly to identify various characteristics of the genome. Using RNA-Sequencing technologies on the Illumina platform, we collected transcriptomic data for both in vitro and in planta life stages of the fungus, providing the most comprehensive look at the gene expression and regulation of this fungus. Due to a lack of identifiable domains on the predicted genes, gene set enrichment analysis was done in context, by including gene sets like “secreted proteins”, “small secreted proteins” and “unique proteins”, to aid discovery of the features in the different datasets. To further research into Microbotryum species in general, we developed, for the first time, a robust and repeatable Agrobacterium-mediated transformation system. Using genomic and transcriptomic data, we were able to select native promoters that drive transcription in specific conditions, making it a highly versatile and controllable system