    Domainoid: domain-oriented orthology inference

    BACKGROUND: Orthology inference is normally based on full-length protein sequences. However, most proteins contain independently folding and recurring regions, domains. The domain architecture of a protein is vital for its function, and recombination events mean individual domains can have different evolutionary histories. It has previously been shown that orthologous proteins may differ in domain architecture, creating challenges for orthology inference methods operating on full-length sequences. We have developed Domainoid, a new tool aiming to overcome these challenges faced by full-length orthology methods by inferring orthology on the domain level. It employs the InParanoid algorithm on single domains separately, to infer groups of orthologous domains. RESULTS: This domain-oriented approach allows detection of discordant domain orthologs, cases where different domains on the same protein have different evolutionary histories. In addition to domain level analysis, protein level orthology based on the fraction of domains that are orthologous can be inferred. Domainoid orthology assignments were compared to those yielded by the conventional full-length approach InParanoid, and were validated in a standard benchmark. CONCLUSIONS: Our results show that domain-based orthology inference can reveal many orthologous relationships that are not found by full-length sequence approaches. AVAILABILITY: https://bitbucket.org/sonnhammergroup/domainoid/
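The protein-level step mentioned in the abstract above (inferring protein orthology from the fraction of domains that are orthologous) can be illustrated with a minimal sketch. The function name, the input layout and the exact fraction definition below are assumptions made for illustration; they are not taken from the Domainoid code base, which may define the fraction differently.

```python
from typing import Iterable, Set, Tuple


def orthologous_domain_fraction(
    domains_a: Iterable[str],
    domains_b: Iterable[str],
    ortholog_pairs: Set[Tuple[str, str]],
) -> float:
    """Fraction of the domains of two proteins that are in a domain-level
    ortholog pair linking protein A to protein B.

    Assumption: `ortholog_pairs` holds (domain_of_A, domain_of_B) tuples that
    a domain-level orthology step has already produced.
    """
    set_a, set_b = set(domains_a), set(domains_b)
    # Domains on each protein that have an orthologous partner on the other.
    matched_a = {a for (a, b) in ortholog_pairs if a in set_a and b in set_b}
    matched_b = {b for (a, b) in ortholog_pairs if a in set_a and b in set_b}
    total = len(set_a) + len(set_b)
    if total == 0:
        return 0.0
    return (len(matched_a) + len(matched_b)) / total


# Hypothetical two-domain proteins: only the first domains are orthologous,
# so the second domains are candidates for a discordant evolutionary history.
pairs = {("A_d1", "B_d1")}
fraction = orthologous_domain_fraction(["A_d1", "A_d2"], ["B_d1", "B_d2"], pairs)
print(fraction)
```

A cutoff on this fraction would then decide whether the two proteins are called orthologs; the cutoff value is a free parameter in this sketch.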

    Bioinformatics characterization of BcsA-like orphan proteins suggest they form a novel family of pseudomonad cyclic-β-glucan synthases

    Bacteria produce a variety of polysaccharides with functional roles in cell surface coating, surface and host interactions, and biofilms. We have identified an ‘Orphan’ bacterial cellulose synthase catalytic subunit (BcsA)-like protein found in four model pseudomonads, P. aeruginosa PA01, P. fluorescens SBW25, P. putida KT2440 and P. syringae pv. tomato DC3000. Pairwise alignments indicated that the Orphan and BcsA proteins shared less than 41% sequence identity, suggesting they may not have the same structural folds or function. We identified 112 Orphans among soil and plant-associated pseudomonads as well as in phytopathogenic and human opportunistic pathogenic strains. The wide distribution of these highly conserved proteins suggests they form a novel family of synthases producing a different polysaccharide. In silico analysis, including sequence comparisons, secondary structure and topology predictions, and protein structural modelling, revealed a two-domain transmembrane ovoid-like structure for the Orphan protein with a periplasmic glycosyl hydrolase family GH17 domain linked via a transmembrane region to a cytoplasmic glycosyltransferase family GT2 domain. We suggest the GT2 domain synthesises β-(1,3)-glucan that is transferred to the GH17 domain where it is cleaved and cyclised to produce cyclic-β-(1,3)-glucan (CβG). Our structural models are consistent with enzymatic characterisation and recent molecular simulations of the PaPA01 and PpKT2440 GH17 domains. They also provide a functional explanation linking PaPAK and PaPA14 Orphan (also known as NdvB) transposon mutants with CβG production and biofilm-associated antibiotic resistance. Importantly, cyclic glucans are also involved in osmoregulation, plant infection and induced systemic suppression, and our findings suggest this novel family of CβG synthases may provide a similar range of adaptive responses for pseudomonads.
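For readers unfamiliar with the sequence-identity figure quoted above, the following minimal sketch shows one common way to compute percent identity from an existing pairwise alignment. It assumes identity is counted over aligned columns where neither sequence has a gap; other definitions exist, and the abstract does not state which one the authors used. The example sequences are made up.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment.

    Assumption: both strings come from the same alignment, so they have equal
    length, and '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment (not real BcsA or Orphan sequences).
print(percent_identity("MKT-LV", "MRTALV"))
```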

    Decoding sequence-level information to predict membrane protein expression

    The expression and purification of integral membrane proteins remains a major bottleneck in the characterization of these important proteins. Expression levels are currently unpredictable, which renders the pursuit of these targets challenging and highly inefficient. Evidence demonstrates that small changes in the nucleotide or amino-acid sequence can dramatically affect membrane protein biogenesis; yet these observations have not resulted in generalizable approaches to improve expression. In this study, we develop a data-driven statistical model that predicts membrane protein expression in E. coli directly from sequence. The model, trained on experimental data, combines a set of sequence-derived variables into a score that predicts the likelihood of expression. We test the model against various independent datasets from the literature that contain a variety of scales and experimental outcomes, demonstrating that the model significantly enriches expressed proteins. The model is then used to score expression for membrane proteomes and protein families, highlighting areas where the model excels. Surprisingly, analysis of the underlying features reveals the importance of nucleotide sequence-derived parameters for expression. This computational model, as illustrated here, can immediately be used to identify favorable targets for characterization.

    Development of a read mapping analysis software and computational pan genome analysis of 20 Pseudomonas aeruginosa strains

    Hilker R. Development of a read mapping analysis software and computational pan genome analysis of 20 Pseudomonas aeruginosa strains. Bielefeld: Bielefeld University; 2015. In times of multi-resistant pathogenic bacteria, their detailed study is of utmost importance. Their comparative analysis can even aid the emerging field of personalized medicine by enabling optimized treatment depending on the presence of virulence factors and antibiotic resistances in the infection concerned. The weaknesses and functionality of these pathogenic bacteria can be investigated using modern computer science and novel sequencing technologies. One of these methods is the bioinformatics evaluation of high-throughput sequencing data. A pathogenic bacterium posing severe health care issues is the ubiquitous Pseudomonas aeruginosa. It is involved in a wide range of infections mainly affecting the pulmonary or urinary tract, open wounds and burns. The prevalence of chronic obstructive pulmonary disease cases with P. aeruginosa in Germany alone is ~600,000 per year. Within the framework of this dissertation, computational comparative genomics experiments were conducted with a panel of 20 of the most abundant Pseudomonas aeruginosa strains. Fifteen of these strains were isolated from clinical cases, while the remaining five were environmental isolates without a known infection history. This division was chosen to enable direct comparison of the pathogenic potential of clinical and environmental strains and identification of their possible characteristic differences. When designing the bioinformatics experiments and searching for an efficient visualization and automatic analysis platform for read alignment (mapping) data, it became evident that no adequate solution was available that included all required functionalities. On these grounds, the decision was made to define two main subjects for this dissertation. Besides the P. aeruginosa pan genome analysis, a novel read mapping visualization and analysis software was developed and published in the journal Bioinformatics. This software, ReadXplorer, is partly based upon a prototype that was developed during a preceding master's thesis at the Center for Biotechnology of Bielefeld University under the name VAMP. The software was developed into a comprehensive user-friendly platform augmented with several newly developed and implemented automatic bioinformatics read mapping analyses. Two examples of these are transcription start site detection and single nucleotide polymorphism detection. Moreover, new intuitive visualizations were added and the existing ones were greatly enhanced. ReadXplorer is designed to support not only DNA-seq data as accrued in the P. aeruginosa experiments, but also any kind of standard read mapping data as obtained from RNA-seq or ChIP-seq experiments. The data management was designed to meet the performance and efficiency needs arising from large next generation sequencing data sets. Finally, ReadXplorer was enabled to handle eukaryotic read mapping data as well. Amongst other software, ReadXplorer was then used to analyze different comparative genomics aspects of P. aeruginosa and to draw conclusions regarding the development of its pathogenicity. The list of conducted experiments includes phylogeny and gene set determination, analysis of regions of genomic plasticity and identification of single nucleotide polymorphisms. The achieved results were published in the journal Environmental Biology.

    Chlorophyllide a Oxidoreductase from Roseobacter denitrificans

    During bacteriochlorophyll a biosynthesis, the stereo- and regiospecific reduction of the C-7/C-8 double bond of chlorophyllide a is catalyzed by chlorophyllide a oxidoreductase (COR). The three COR subunits BchX, BchY and BchZ share a high degree of amino acid sequence homology with the corresponding subunits of nitrogenase and of the light-independent protochlorophyllide a oxidoreductase (DPOR). DPOR catalyzes the reduction of the C-17/C-18 double bond of protochlorophyllide a. Chlorophyllide a is reduced by the transfer of two electrons via the two redox-active [Fe-S] clusters of COR. Electron paramagnetic resonance (EPR) spectroscopy revealed a [4Fe-4S] cluster for the (BchX)2 homodimer. Gel permeation chromatography indicated the heterotetrameric structure of the (BchY/BchZ)2 subcomplex. EPR spectroscopy revealed two [4Fe-4S] clusters on (BchY/BchZ)2. Site-directed mutagenesis in combination with a coupled in vitro activity assay and EPR spectroscopy indicated that the [4Fe-4S] cluster of (BchY/BchZ)2 is coordinated by four cysteine ligands. COR therefore differs from the homologous DPOR, in which the cluster is ligated by three cysteines and one aspartate. In further mutagenesis experiments, the DPOR ligation pattern was transferred to (BchY/BchZ)2. The corresponding COR double mutant showed no enzymatic activity, but EPR revealed the formation of an artificial [4Fe-4S] cluster in the presence of an aspartate ligand. The different ligation patterns of DPOR and COR might be responsible for the specific redox potentials of the [4Fe-4S] clusters in the respective enzyme complexes. In this study, the ternary COR complex, (BchX)2(BchY/BchZ)2(BchX)2, was detected in the presence of the ATP analog MgADP·AlF4¯, which mimics the transition state of ATP hydrolysis. COR therefore presumably has a catalytic mechanism similar to that of DPOR: with consumption of ATP, electrons are transferred via the [4Fe-4S] cluster of (BchX)2 and the [4Fe-4S] cluster of (BchY/BchZ)2 onto the substrate. (BchX)2 may use a nucleotide-dependent switch mechanism for the transient formation of the ternary protein complex.