The look-ahead effect of phenotypic mutations
The evolution of complex molecular traits such as disulphide bridges often
requires multiple mutations. The intermediate steps in such evolutionary
trajectories are likely to be selectively neutral or deleterious. Therefore,
large populations and long times may be required to evolve such traits. We
propose that errors in transcription and translation may allow selection for
the intermediate mutations if the final trait provides a large enough selective
advantage. We test this hypothesis using a population-based model of protein
evolution. If an individual acquires one of two mutations needed for a novel
trait, the second mutation can be introduced into the phenotype due to
transcription and translation errors. If the novel trait is advantageous
enough, the allele with only one mutation will spread through the population,
even though the gene sequence does not yet code for the complete trait. The
first mutation then has a higher frequency than expected without phenotypic
mutations, giving the second mutation a higher probability of fixation. Thus,
errors allow protein sequences to "look-ahead" for a more direct path to a
complex trait.
Comment: Submitted to "Genetics"
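The look-ahead argument above can be illustrated with a minimal sketch. It is not the population-based model used in the paper: it assumes a haploid population, ignores drift, and assumes that the fitness gain of the one-mutation allele is the phenotypic error probability times the advantage of the complete trait. The names `p_err`, `s_trait` and `s_intermediate` are hypothetical parameters chosen for illustration.

```python
def effective_advantage(p_err: float, s_trait: float, s_intermediate: float = 0.0) -> float:
    """Selection coefficient of the allele carrying only the first mutation.

    Assumption: a fraction p_err of expressed protein copies carries the second
    mutation through transcription/translation errors, and fitness scales
    linearly with that fraction. s_intermediate is the (neutral or negative)
    effect of the first mutation on its own.
    """
    return s_intermediate + p_err * s_trait


def next_frequency(x: float, s: float) -> float:
    """One generation of deterministic haploid selection (no drift)."""
    return x * (1.0 + s) / (1.0 + s * x)


def frequency_after(x0: float, s: float, generations: int) -> float:
    """Iterate the selection recursion for a number of generations."""
    x = x0
    for _ in range(generations):
        x = next_frequency(x, s)
    return x
```

Under these assumptions, a strictly neutral intermediate (`p_err = 0`) stays at its starting frequency, whereas any positive error rate combined with a large enough `s_trait` lets it rise, which in turn gives the second genetic mutation more copies in which to arise.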
Resilient Learning-Based Control for Synchronization of Passive Multi-Agent Systems under Attack
In this paper, we show synchronization for a group of output passive agents
that communicate with each other according to an underlying communication graph
to achieve a common goal. We propose a distributed event-triggered control
framework that will guarantee synchronization and considerably decrease the
required communication load on the band-limited network. We define a general
Byzantine attack on the event-triggered multi-agent network system and
characterize its negative effects on synchronization. The Byzantine agents are
capable of intelligently falsifying their data and manipulating the underlying
communication graph by altering their respective control feedback weights. We
introduce a decentralized detection framework and analyze its steady-state and
transient performances. We propose a way of identifying individual Byzantine
neighbors and a learning-based method of estimating the attack parameters.
Lastly, we propose learning-based control approaches to mitigate the negative
effects of the adversarial attack.
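The communication-saving idea behind event-triggered control can be sketched as follows. This is an illustrative toy, not the framework of the paper: it assumes single-integrator agents, a fixed undirected graph, a static triggering threshold, and no Byzantine agents. The function name `simulate` and all parameter values are hypothetical.

```python
def simulate(adjacency, x0, threshold=0.05, step=0.05, n_steps=400):
    """Event-triggered consensus for single-integrator agents.

    Each agent steers towards its neighbours using only the values they last
    broadcast, and broadcasts its own state again only when that state has
    drifted more than `threshold` from its previous broadcast.
    Returns the final states and the total number of broadcasts.
    """
    n = len(x0)
    x = list(x0)
    broadcast = list(x0)  # last value each agent sent to its neighbours
    n_events = 0
    for _ in range(n_steps):
        # Control input computed from broadcast values only.
        u = [
            sum(adjacency[i][j] * (broadcast[j] - broadcast[i]) for j in range(n))
            for i in range(n)
        ]
        x = [x[i] + step * u[i] for i in range(n)]
        # Triggering rule: re-broadcast only after a large enough deviation.
        for i in range(n):
            if abs(x[i] - broadcast[i]) > threshold:
                broadcast[i] = x[i]
                n_events += 1
    return x, n_events
```

With a static threshold the agents only reach a neighbourhood of agreement rather than exact synchronization; the point of the sketch is that the number of broadcasts stays below one per agent per time step.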
Finding common protein interaction patterns across organisms
Protein interactions are an important resource for understanding cell function. Recently, researchers have compared networks of interactions in order to understand network evolution. While current methods first infer homologs and then compare topologies, we here present a method which first searches for interesting topologies and then looks for homologs. PINA (protein interaction network analysis) takes the protein interaction networks of two organisms, scans both networks for subnetworks deemed interesting, and then tries to find orthologs among the interesting subnetworks. The application is very fast because orthology investigations are restricted to subnetworks like hubs and clusters that fulfill certain criteria regarding neighborhood and connectivity. Finally, the hubs or clusters found to be related can be visualized and analyzed according to protein annotation.
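The topology-first strategy described above can be sketched in a few lines. This is not PINA's actual code: the degree cut-off used to define a hub and the precomputed `orthologs` set of protein pairs are assumptions standing in for the criteria and the orthology search that the abstract does not specify.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected interaction network as a dict: protein -> set of partners."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def find_hubs(graph, min_degree):
    """Proteins with at least `min_degree` interaction partners."""
    return {protein for protein, partners in graph.items() if len(partners) >= min_degree}


def related_hubs(edges_a, edges_b, orthologs, min_degree=3):
    """Find hubs in each network first, then test orthology among hubs only.

    `orthologs` is a hypothetical set of (protein_a, protein_b) pairs; in
    practice this lookup would be replaced by a real orthology investigation.
    """
    hubs_a = find_hubs(build_graph(edges_a), min_degree)
    hubs_b = find_hubs(build_graph(edges_b), min_degree)
    return sorted((a, b) for a in hubs_a for b in hubs_b if (a, b) in orthologs)
```

Because the orthology test runs only on pairs of hubs, the number of comparisons depends on how many hubs pass the cut-off, not on the full size of the two networks.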
The Evolution of Protein Interaction Networks in Regulatory Proteins
Interactions between proteins are essential for intracellular communication. They
form complex networks which have become an important source for functional
analysis of proteins. Combining phylogenies with network analysis, we investigate
the evolutionary history of interaction networks from the bHLH, NR and bZIP
transcription-factor families. The bHLH and NR networks show a hub-like structure
with varying γ values. Mutation and gene duplication play an important role
in adding and removing interactions. We conclude that in several of the protein
families that we have studied, networks have primarily arisen by the development of
heterodimerizing transcription factors, from an ancestral gene which interacts with
any of the newly emerging proteins but also homodimerizes.
Evolutionary divergence and limits of conserved non-coding sequence detection in plant genomes
The discovery of regulatory motifs embedded in upstream regions of plants is a particularly challenging bioinformatics task. Previous studies have shown that motifs in plants are short compared with those found in vertebrates. Furthermore, plant genomes have undergone several diversification mechanisms such as genome duplication events which impact the evolution of regulatory motifs. In this article, a systematic phylogenomic comparison of upstream regions is conducted to further identify features of the plant regulatory genomes, the component of genomes regulating gene expression, to enable future de novo discoveries. The findings highlight differences in upstream region properties between major plant groups and the effects of divergence times and duplication events. First, clear differences in upstream region evolution can be detected between monocots and dicots, thus suggesting that a separation of these groups should be made when searching for novel regulatory motifs, particularly since universal motifs such as the TATA box are rare. Second, investigating the decay rate of significantly aligned regions suggests that a divergence time of 100 mya sets a limit for reliable conserved non-coding sequence (CNS) detection. Insights presented here will set a framework to help identify embedded motifs of functional relevance by understanding the limits of bioinformatics detection for CNSs.
Evolutionary Dynamics on Protein Bi-stability Landscapes Can Potentially Resolve Adaptive Conflicts
Experimental studies have shown that some proteins exist in two alternative native-state conformations. It has been proposed that such bi-stable proteins can potentially function as evolutionary bridges at the interface between two neutral networks of protein sequences that fold uniquely into the two different native conformations. Under adaptive conflict scenarios, bi-stable proteins may be of particular advantage if they simultaneously provide two beneficial biological functions. However, computational models that simulate protein structure evolution do not yet recognize the importance of bi-stability. Here we use a biophysical model to analyze sequence space to identify bi-stable or multi-stable proteins with two or more equally stable native-state structures. The inclusion of such proteins enhances phenotype connectivity between neutral networks in sequence space. Consideration of the sequence space neighborhood of bridge proteins revealed that bi-stability decreases gradually with each mutation that takes the sequence further away from an exactly bi-stable protein. With relaxed selection pressures, we found that bi-stable proteins in our model are highly successful under simulated adaptive conflict. Inspired by these model predictions, we developed a method to identify real proteins in the PDB with bridge-like properties, and have verified a clear bi-stability gradient for a series of mutants studied by Alexander et al. (Proc Nat Acad Sci USA 2009, 106:21149–21154) that connect two sequences that fold uniquely into two different native structures via a bridge-like intermediate mutant sequence. Based on these findings, new testable predictions for future studies on protein bi-stability and evolution are discussed.
MDAT - Aligning multiple domain arrangements
Background: Proteins are composed of domains, protein segments that fold independently from the rest of the protein and have a specific function. During evolution the arrangement of domains can change: domains are gained, lost or their order is rearranged. To facilitate the analysis of these changes we propose the use of multiple domain alignments. Results: We developed an alignment program, called MDAT, which aligns multiple domain arrangements. MDAT extends earlier programs which perform pairwise alignments of domain arrangements. MDAT uses a domain similarity matrix to score domain pairs and aligns the domain arrangements using a consistency supported progressive alignment method. Conclusion: MDAT will be useful for analysing changes in domain arrangements within and between protein families and will thus provide valuable insights into the evolution of proteins and their domains. MDAT is coded in C++, and the source code is freely available for download at http://www.bornberglab.org/pages/mda
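To make the notion of aligning domain arrangements concrete, the sketch below aligns two arrangements with a standard global (Needleman-Wunsch) alignment over domain identifiers. It covers only the pairwise building block, not MDAT's consistency-supported progressive multiple alignment, and the match/mismatch scores in `simple_score` are placeholder values standing in for a real domain similarity matrix.

```python
def simple_score(x, y):
    """Placeholder for a domain similarity matrix: identical domains score 2."""
    return 2 if x == y else -1


def align_arrangements(a, b, score=simple_score, gap=-1):
    """Globally align two domain arrangements given as lists of domain IDs.

    Returns (alignment score, aligned a, aligned b), with "-" marking gaps.
    """
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # pair two domains
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            aligned_a.append(a[i - 1])
            aligned_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + gap:
            aligned_a.append(a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(b[j - 1])
            j -= 1
    return dp[n][m], aligned_a[::-1], aligned_b[::-1]
```

For example, aligning `["SH3", "SH2", "Kinase"]` with `["SH2", "Kinase"]` places a gap opposite `SH3`, which is how a domain gain or loss would appear in such an alignment.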
Evaluating Characteristics of De Novo Assembly Software on 454 Transcriptome Data: A Simulation Approach
Background: The quantity of transcriptome data is rapidly increasing for non-model organisms. As sequencing technology advances, focus shifts towards solving bioinformatic challenges, of which sequence read assembly is the first task. Recent studies have compared the performance of different software to establish a best practice for transcriptome assembly. Here, we adapted a simulation approach to evaluate specific features of assembly programs on 454 data. The novelty of our study is that the simulation allows us to calculate a model assembly as reference point for comparison. Findings: The simulation approach allows us to compare basic metrics of assemblies computed by different software applications (CAP3, MIRA, Newbler, and Oases) to a known optimal solution. We found MIRA and CAP3 are conservative in merging reads. This resulted in a comparably high number of short contigs. In contrast, Newbler more readily merged reads into longer contigs, while Oases produced the overall shortest assembly. Due to the simulation approach, reads could be traced back to their correct placement within the transcriptome. Together with mapping reads onto the assembled contigs, we were able to evaluate ambiguity in the assemblies. This analysis further supported the conservative nature of MIRA and CAP3, which resulted in low proportions of chimeric contigs, but high redundancy. Newbler produced less redundancy, but the proportion of chimeric contigs was higher. Conclusion: Our evaluation of four assemblers suggested that MIRA and Newbler slightly outperformed the othe
Reduction/oxidation-phosphorylation control of DNA binding in the bZIP dimerization network
BACKGROUND: bZIPs are transcription factors that are found throughout the eukarya from fungi to flowering plants and mammals. They contain highly conserved basic region (BR) and leucine zipper (LZ) domains and often function as environmental sensors. Specifically, bZIPs frequently have a role in mediating the response to oxidative stress, a crucial environmental signal that needs to be transduced to the gene regulatory network. RESULTS: Based on sequence comparisons and experimental data on a number of important bZIP transcription factors, we predict which bZIPs are under redox control and which are regulated via protein phosphorylation. By integrating genomic, phylogenetic and functional data from the literature, we then propose a link between oxidative stress and the choice of interaction partners for the bZIP proteins. CONCLUSION: This integration permits the bZIP dimerization network to be interpreted in functional terms, especially in the context of the role of bZIP proteins in the response to environmental stress. This analysis demonstrates the importance of abiotic factors in shaping regulatory networks.
Domain similarity based orthology detection
Background: Orthologous protein detection software mostly uses pairwise comparisons of amino-acid sequences to assert whether two proteins are orthologous or not. Accordingly, when the number of sequences for comparison increases, the number of comparisons to compute grows quadratically. A current challenge of bioinformatic research, especially when taking into account the increasing number of sequenced organisms available, is to make this ever-growing number of comparisons computationally feasible in a reasonable amount of time. We propose to speed up the detection of orthologous proteins by using strings of domains to characterize the proteins. Results: We present two new protein similarity measures, a cosine and a maximal weight matching score based on domain content similarity, and new software, named porthoDom. The qualities of the cosine and the maximal weight matching similarity measures are compared against curated datasets. The measures show that domain content similarities are able to correctly group proteins into their families. Accordingly, the cosine similarity measure is used inside porthoDom, the wrapper developed for proteinortho. porthoDom makes use of domain content similarity measures to group proteins together before searching for orthologs. By using domains instead of amino acid sequences, the reduction of the search space decreases the computational complexity of an all-against-all sequence comparison. Conclusion: We demonstrate that representing and comparing proteins as strings of discrete domains, i.e. as a concatenation of their unique identifiers, allows a drastic simplification of search space. porthoDom has the advantage of speeding up orthology detection while maintaining a degree of accuracy similar to proteinortho. porthoDom is implemented in Python and C++ and is available under the GNU GPL licence 3 at http://www.bornberglab.org/pages/porthoda.
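A cosine score on domain content can be sketched as below. The abstract does not state how porthoDom weights domains, so representing each protein as a vector of raw domain counts is an assumption, and `domain_cosine` is a hypothetical name rather than part of the released software.

```python
from collections import Counter
from math import sqrt


def domain_cosine(domains_a, domains_b):
    """Cosine similarity between two proteins given as lists of domain IDs.

    Each protein is treated as a vector of domain counts (an assumed
    weighting); the order of domains along the sequence is ignored.
    """
    counts_a, counts_b = Counter(domains_a), Counter(domains_b)
    dot = sum(counts_a[d] * counts_b[d] for d in counts_a.keys() & counts_b.keys())
    norm_a = sqrt(sum(v * v for v in counts_a.values()))
    norm_b = sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a protein without annotated domains matches nothing
    return dot / (norm_a * norm_b)
```

Proteins sharing no domains score 0 and can be separated before any amino-acid comparison is run, which is the source of the search-space reduction described above.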