IdentiCS – Identification of coding sequence and in silico reconstruction of the metabolic network directly from unannotated low-coverage bacterial genome sequence
BACKGROUND: A necessary step for a genome-level analysis of cellular metabolism is the in silico reconstruction of the metabolic network from genome sequences. The available methods are mainly based on the annotation of genome sequences, which involves two successive steps: the prediction of coding sequences (CDSs) and the assignment of their functions. The annotation process takes time. The available methods often encounter difficulties when dealing with unfinished, error-containing genomic sequences. RESULTS: In this work a fast method is proposed to use unannotated genome sequence for predicting CDSs and for an in silico reconstruction of metabolic networks. Instead of using predicted genes or CDSs to query public databases, entries from public DNA or protein databases are used as queries to search a local database of the unannotated genome sequence to predict CDSs. Functions are assigned to the predicted CDSs simultaneously. The well-annotated genome of Salmonella typhimurium LT2 is used as an example to demonstrate the applicability of the method. 97.7% of the CDSs in the original annotation are correctly identified. The use of SWISS-PROT-TrEMBL databases resulted in an identification of 98.9% of CDSs that have EC-numbers in the published annotation. Furthermore, two versions of sequences of the bacterium Klebsiella pneumoniae with different genome coverage (3.9 and 7.9 fold, respectively) are examined. The results suggest that a 3.9-fold coverage of the bacterial genome could be sufficient for the in silico reconstruction of the metabolic network. Compared with other gene-finding methods such as CRITICA, our method is more suitable for exploiting sequences of low genome coverage. Based on the new method, a program called IdentiCS (Identification of Coding Sequences from Unfinished Genome Sequences) is provided that combines the identification of CDSs with the reconstruction, comparison and visualization of metabolic networks (free to download at ). 
CONCLUSIONS: The reversed querying process and the program IdentiCS allow a fast and adequate prediction of protein-coding sequences and reconstruction of the potential metabolic network from low-coverage genome sequences of bacteria. The new method can accelerate the use of genomic data for studying cellular metabolism.
Design of bioswitches for synthetic biology
Novel bioswitches are of great interest for synthetic biology, especially when dynamic control of metabolic fluxes is demanded. Among the natural bioswitches, riboswitches and allosteric proteins are of particular importance because of their wide distribution in nature. However, the application of natural bioswitches is often limited by their narrow response range, and the engineering of allosteric proteins is challenging due to their dynamic nature and complex mechanisms, especially for non-natural ligands. In this presentation, an efficient approach that extends the ligand response range of riboswitches by means of computer-aided rational design will be presented first. A lysine riboswitch from E. coli has been employed as a model system to demonstrate the procedure. Then, a novel strategy to explore and engineer ligand-induced allosteric regulation, based on a new thermodynamic model of protein conformational dynamics, will be presented with the aim of creating allosteric regulation for non-natural ligands. The key feature of the thermodynamic model is that the allosteric process upon ligand binding is divided into two sub-processes: conformational change and molecular binding. As a consequence, the ligand-induced allosteric regulation can be explored for each sub-process from both energetic and structural perspectives. To prove the concept, aspartokinase III from E. coli was used as a model system. Guided by the thermodynamic model, the natural ligand has been successfully altered from an inhibitor to an activator. Moreover, both inhibition and activation effects have been established for a non-natural ligand.
In search of functional association from time-series microarray data based on the change trend and level of gene expression
BACKGROUND: The increasing availability of time-series expression data opens up new possibilities to study functional linkages of genes. Present methods used to infer functional linkages between genes from expression data are mainly based on a point-to-point comparison. Change trends between consecutive time points in time-series data have so far not been well explored. RESULTS: In this work we present a new method based on extracting main features of the change trend and level of gene expression between consecutive time points. The method, termed trend correlation (TC), includes two major steps: (1) calculating a maximal local alignment of change trend score by dynamic programming and a change trend correlation coefficient between the maximal matched change levels of each gene pair; (2) inferring relationships of gene pairs based on two statistical extraction procedures. The new method considers time shifts and inverted relationships in a similar way to the local clustering (LC) method, but the latter is merely based on a point-to-point comparison. The TC method is demonstrated with data from the yeast cell cycle and compared with the LC method and the widely used Pearson correlation coefficient (PCC) based clustering method. The biological significance of the gene pairs is examined with several large-scale yeast databases. Although the TC method predicts an overall lower number of gene pairs than the other two methods at the same p-value threshold, the additional number of gene pairs inferred by the TC method is considerable: e.g. 20.5% compared with the LC method and 49.6% with the PCC method for a p-value threshold of 2.7E-3. Moreover, the percentage of the inferred gene pairs consistent with databases is generally higher for our method than for the LC method and similar to that of the PCC method. 
A significant number of the gene pairs inferred only by the TC method are process-identity or function-similarity pairs or have well-documented biological interactions, including 443 known protein interactions and some known cell cycle related regulatory interactions. It should be emphasized that the overlap of gene pairs detected by the three methods is normally not very high, indicating a necessity of combining the different methods in the search for functional association of genes from time-series data. For a p-value threshold of 1E-5 the percentage of process-identity and function-similarity gene pairs among the shared part of the three methods reaches 60.2% and 55.6% respectively, building a good basis for further experimental and functional study. Furthermore, the combined use of methods is important to infer more complete regulatory circuits and networks, as exemplified in this study. CONCLUSION: The TC method can significantly augment the current major methods to infer functional linkages and biological networks and is well suited for exploring temporal relationships of gene expression in time-series data.
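The idea behind the first TC step described above (aligning change trends rather than raw expression values, while allowing time shifts) can be illustrated with a minimal sketch. This is not the authors' implementation: the +1/-1 scoring, the gapless alignment, the `eps` tolerance and the function names are all assumptions made for illustration, and the change trend correlation coefficient and the statistical extraction procedures are not covered.

```python
def change_trend(series, eps=0.0):
    """Map consecutive differences to +1 (up), -1 (down) or 0 (flat).

    eps is a hypothetical tolerance below which a change counts as flat.
    """
    trend = []
    for prev, curr in zip(series, series[1:]):
        diff = curr - prev
        trend.append(0 if abs(diff) <= eps else (1 if diff > 0 else -1))
    return trend


def best_local_trend_match(a, b, eps=0.0):
    """Return (score, shift) of the best gapless local alignment of trends.

    Scoring is an assumption: +1 for each agreeing trend step, -1 for each
    disagreeing step, with the running score reset to 0 when it turns
    negative. shift = j - i is the delay of series b relative to series a.
    """
    ta, tb = change_trend(a, eps), change_trend(b, eps)
    best_score, best_shift = 0, 0
    # dp[i][j]: best score of a local alignment ending at ta[i-1] and tb[j-1]
    dp = [[0] * (len(tb) + 1) for _ in range(len(ta) + 1)]
    for i in range(1, len(ta) + 1):
        for j in range(1, len(tb) + 1):
            step = 1 if ta[i - 1] == tb[j - 1] else -1
            dp[i][j] = max(0, dp[i - 1][j - 1] + step)
            if dp[i][j] > best_score:
                best_score, best_shift = dp[i][j], j - i
    return best_score, best_shift
```

In this sketch, an inverted relationship could be probed by negating one of the two series before calling `best_local_trend_match`, since negation flips every trend sign.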
Hierarchical structure and modules in the Escherichia coli transcriptional regulatory network revealed by a new top-down approach
BACKGROUND: Cellular functions are coordinately carried out by groups of genes forming functional modules. Identifying such modules in the transcriptional regulatory network (TRN) of organisms is important for understanding the structure and function of these fundamental cellular networks and essential for the emerging modular biology. So far, the global connectivity structure of the TRN has not been well studied and consequently has not been applied to the identification of functional modules. Moreover, network motifs such as the feed-forward loop were recently proposed to be basic building blocks of the TRN. However, their relationship to functional modules is not clear. RESULTS: In this work we proposed a top-down approach to identify modules in the TRN of E. coli. By studying the global connectivity structure of the regulatory network, we first revealed a five-layer hierarchical structure in which all the regulatory relationships are downward. Based on this regulatory hierarchy, we developed a new method to decompose the regulatory network into functional modules and to identify global regulators governing multiple modules. As a result, 10 global regulators and 39 modules were identified and shown to have well-defined functions. We then investigated the distribution and composition of the two basic network motifs (feed-forward loop and bi-fan motif) in the hierarchical structure of the TRN. We found that most of these network motifs include global regulators, indicating that these motifs are not basic building blocks of modules since modules should not contain global regulators. CONCLUSION: The transcriptional regulatory network of E. coli possesses a multi-layer hierarchical modular structure without feedback regulation at the transcription level. This hierarchical structure builds the basis for a new and simple decomposition method which is suitable for the identification of functional modules and global regulators in the transcriptional regulatory network of E. coli. 
Analysis of the distribution of feed-forward loops and bi-fan motifs in the hierarchical structure suggests that these network motifs are not elementary building blocks of functional modules in the transcriptional regulatory network of E. coli.
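A hierarchy in which all regulatory relationships point downward, as described above, can be sketched for a toy network as follows. The procedure shown (repeatedly peeling off nodes that regulate nothing still remaining) is one simple way to layer an acyclic directed network and is an assumption here, not necessarily the authors' exact method; the function name, the treatment of self-regulation and the node labels in the example are hypothetical.

```python
def assign_layers(edges):
    """Assign each node a layer so that every edge points to a lower layer.

    edges: iterable of (regulator, target) pairs.
    Self-regulation is ignored (an assumption); any other feedback loop
    makes layering impossible and raises ValueError.
    """
    targets = {}
    nodes = set()
    for reg, tgt in edges:
        nodes.update((reg, tgt))
        if reg != tgt:
            targets.setdefault(reg, set()).add(tgt)

    layers = {}
    remaining = set(nodes)
    layer = 1
    while remaining:
        # Bottom of what is left: nodes with no target still unassigned.
        bottom = {n for n in remaining
                  if not (targets.get(n, set()) & remaining)}
        if not bottom:
            raise ValueError("network contains a feedback loop")
        for n in bottom:
            layers[n] = layer
        remaining -= bottom
        layer += 1
    return layers
```

For the toy edge list `[("A", "B"), ("B", "C"), ("A", "C")]`, node `C` lands in layer 1, `B` in layer 2 and `A` in layer 3, so every regulator sits strictly above each of its targets.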
Metabolic peculiarities of Aspergillus niger disclosed by comparative metabolic genomics
A genome-scale metabolic network and an in-depth genomic comparison of Aspergillus niger with seven other fungi are presented, revealing more than 1,100 enzyme-coding genes that are unique to A. niger.
Dynamic cumulative activity of transcription factors as a mechanism of quantitative gene regulation
By combining information on the yeast transcription network and high-resolution time-series data with a series of factors, support is provided for the concept that dynamic cumulative regulation is a major principle of quantitative transcriptional control.
Is autoinducer-2 a universal signal for interspecies communication: a comparative genomic and phylogenetic analysis of the synthesis and signal transduction pathways
BACKGROUND: Quorum sensing is a process of bacterial cell-to-cell communication involving the production and detection of extracellular signaling molecules called autoinducers. Recently, it has been proposed that autoinducer-2 (AI-2), a furanosyl borate diester derived from the recycling of S-adenosyl-homocysteine (SAH) to homocysteine, serves as a universal signal for interspecies communication. RESULTS: In this study, 138 completed genomes were examined for the genes involved in the synthesis and detection of AI-2. Except for some symbionts and parasites, all organisms have a pathway to recycle SAH, either using a two-step enzymatic conversion by the Pfs and LuxS enzymes or a one-step conversion using SAH-hydrolase (SahH). Fifty-one organisms, including most Gamma-, Beta-, and Epsilonproteobacteria and Firmicutes, possess the Pfs-LuxS pathway, while Archaea, Eukarya, Alphaproteobacteria, Actinobacteria and Cyanobacteria prefer the SahH pathway. Among all 138 organisms, only the three Vibrio strains had strong, bidirectional matches to the periplasmic AI-2 binding protein LuxP and the central signal relay protein LuxU. The initial two-component sensor kinase protein LuxQ and the terminal response regulator LuxO are found in most Proteobacteria, as well as in some Firmicutes, often in several copies. CONCLUSIONS: The genomic analysis indicates that the LuxS enzyme required for AI-2 synthesis is widespread in bacteria, while the periplasmic binding protein LuxP is only present in Vibrio strains. Thus, other organisms may either use components different from the AI-2 signal transduction system of Vibrio strains to sense the signal of AI-2, or they do not have such a quorum sensing system at all.
Lactate based caproate production with Clostridium drakei and process control of Acetobacterium woodii via lactate dependent in situ electrolysis
Syngas fermentation processes with acetogens represent a promising approach for reducing CO2 emissions alongside bulk chemical production. However, to fully realize this potential the thermodynamic limits of acetogens need to be considered when designing a fermentation process. An adjustable supply of H2 as electron donor plays a key role in autotrophic product formation. In this study an anaerobic laboratory-scale continuously stirred tank reactor was equipped with an All-in-One electrode allowing for in-situ H2 generation via electrolysis. Furthermore, this system was coupled to online lactate measurements to control the co-culture of a recombinant lactate-producing Acetobacterium woodii strain and a lactate-consuming Clostridium drakei strain to produce caproate. When C. drakei was grown in batch cultivations with lactate as substrate, 1.6 g·L−1 caproate was produced. Furthermore, lactate production of the A. woodii mutant strain could manually be stopped and reinitiated by controlling the electrolysis. Applying this automated process control, lactate production of the A. woodii mutant strain could be halted to achieve a steady lactate concentration. In a co-culture experiment with the A. woodii mutant strain and the C. drakei strain, the automated process control was able to dynamically react to changing lactate concentrations and adjust H2 formation accordingly. This study confirms the potential of C. drakei as a medium-chain fatty acid producer in a lactate-mediated, autotrophic co-cultivation with an engineered A. woodii strain. Moreover, the monitoring and control strategy presented in this study reinforces the case for autotrophically produced lactate as a transfer metabolite in defined co-cultivations for value-added chemical production.
A New Concept to Reveal Protein Dynamics Based on Energy Dissipation
Protein dynamics is essential for protein function, especially for intramolecular signal transduction. In this work we propose a new concept, the energy dissipation model, to systematically reveal protein dynamics upon effector binding and energy perturbation. The concept is applied to better understand the intramolecular signal transduction during allostery of enzymes. The E. coli allosteric enzyme aspartokinase III is used as a model system, and special molecular dynamics simulations are designed and carried out. Computational results indicate that the number of residues affected by an external energy perturbation (e.g. caused by ligand binding) during the energy dissipation process shows a sigmoid pattern. Using the two-state Boltzmann equation, we define two parameters, the half response time and the dissipation rate constant, which characterize the energy dissipation process well. For the allostery of aspartokinase III, the residue response time indicates that besides the ACT2 signal transduction pathway, there is another pathway between the regulatory site and the catalytic site, which is suggested to be the β15-αK loop of ACT1. We further introduce the term “protein dynamical modules” based on the residue response time. Unlike protein structural modules, which merely provide information about the structural stability of proteins, protein dynamical modules could reveal protein characteristics from the perspective of dynamics. Finally, the energy dissipation model is applied to investigate E. coli aspartokinase III mutations to better understand the desensitization of product feedback inhibition via allostery. In conclusion, the new concept proposed in this paper gives a novel holistic view of protein dynamics, a key question in biology with high impact for both biotechnology and biomedicine.
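The sigmoid pattern and the two parameters mentioned above can be pictured with a generic two-state Boltzmann-type curve. The abstract does not give the exact equation, so the parametrization below (a plateau `n_max`, a half response time `t_half` and a rate constant `k`) is an assumption for illustration only, as is the function name.

```python
import math


def affected_residues(t, n_max, t_half, k):
    """Generic two-state Boltzmann-type sigmoid (assumed form).

    t      : time since the energy perturbation
    n_max  : plateau number of affected residues (hypothetical parameter)
    t_half : time at which half of n_max is reached
    k      : rate constant controlling the steepness of the rise
    """
    return n_max / (1.0 + math.exp(-k * (t - t_half)))
```

Under this assumed form, the curve passes through `n_max / 2` at `t = t_half`, and a larger `k` gives a sharper transition between the unaffected and affected states.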