48 research outputs found

    ‘Genome design’ model and multicellular complexity: golden middle

    Human tissue-specific genes were reported to be longer than housekeeping genes (both in coding and intronic parts). Competing neutralist and adaptationist models were proposed to explain this observation. Here I show that in the human genome the longest genes are those with an intermediate expression pattern. From the standpoint of information theory, the regulation of such genes should be the most complex. In the genome-wide context, they are found here to have a higher informational load on all available levels: from participation in protein interaction networks, pathways and modules reflected in Gene Ontology categories, through transcription factor regulatory sets and protein functional domains, to amino acid tuples (words) in encoded proteins and nucleotide tuples in introns and promoter regions. Thus, the intermediately expressed genes have a higher functional and regulatory complexity, which is reflected in their greater length (consistent with the ‘genome design’ model). The dichotomy of housekeeping versus tissue-specific entities is more pronounced on the modular level than on the molecular level. There are far fewer intermediate-specific modules (modules overrepresented in the intermediately expressed genes) than housekeeping or tissue-specific modules (normalized to gene number). The dichotomy of housekeeping versus tissue-specific genes and modules in multicellular organisms is probably caused by the burden of regulatory complexity acting on the intermediately expressed genes.

    An Analytically Solvable Model for Rapid Evolution of Modular Structure

    Biological systems often display modularity, in the sense that they can be decomposed into nearly independent subsystems. Recent studies have suggested that modular structure can spontaneously emerge if goals (environments) change over time, such that each new goal shares the same set of sub-problems with previous goals. Such modularly varying goals can also dramatically speed up evolution, relative to evolution under a constant goal. These studies were based on simulations of model systems, such as logic circuits and RNA structure, which are generally not easy to treat analytically. We present, here, a simple model for evolution under modularly varying goals that can be solved analytically. This model helps to understand some of the fundamental mechanisms that lead to rapid emergence of modular structure under modularly varying goals. In particular, the model suggests a mechanism for the dramatic speedup in evolution observed under such temporally varying goals

    Enhanced production yields of rVSV-SARS-CoV-2 vaccine using Fibra-Cel® macrocarriers

    The COVID-19 pandemic has led to high global demand for vaccines to safeguard public health. To that end, our institute has developed a recombinant viral vector vaccine utilizing a modified vesicular stomatitis virus (VSV) construct, wherein the G protein of VSV is replaced with the spike protein of SARS-CoV-2 (rVSV-ΔG-spike). Previous studies have demonstrated the production of a VSV-based vaccine in Vero cells adsorbed on Cytodex 1 microcarriers or in suspension. However, the titers were limited by both the carrier surface area and shear forces. Here, we describe the development of a bioprocess for rVSV-ΔG-spike production in serum-free Vero cells using porous Fibra-Cel® macrocarriers in fixed-bed BioBLU®320 5p bioreactors, leading to high-end titers. We identified core factors that significantly improved virus production, such as the kinetics of virus production, the use of macrospargers for oxygen supply, and medium replenishment. Implementing these parameters, among others, in a series of GMP production processes improved the titer yields by at least two orders of magnitude (2e9 PFU/mL) over previously reported values. The developed process was highly effective, repeatable, and robust, creating potent and genetically stable vaccine viruses and introducing new opportunities for application in other viral vaccine platforms

    A Measure of the Promiscuity of Proteins and Characteristics of Residues in the Vicinity of the Catalytic Site That Regulate Promiscuity

    Promiscuity, the basis for the evolution of new functions through ‘tinkering’ of residues in the vicinity of the catalytic site, is yet to be quantitatively defined. We present a computational method, Promiscuity Indices Estimator (PROMISE), based on signatures derived from the spatial and electrostatic properties of the catalytic residues, to estimate the promiscuity (PromIndex) of proteins with known active site residues and 3D structure. PromIndex reflects the number of different active site signatures that have congruent matches in close proximity of the protein's native catalytic site, the quality of the matches, and the difference in enzymatic activity. Promiscuity in proteins is observed to follow a lognormal distribution (μ = 0.28, σ = 1.1, reduced chi-square = 3.0E-5). The promiscuous functions predicted by PROMISE for any protein can serve as the starting point for directed evolution experiments. PROMISE ranks carboxypeptidase A and ribonuclease A amongst the more promiscuous proteins. We have also investigated the properties of the residues in the vicinity of the catalytic site that regulate its promiscuity. Linear regression establishes a weak correlation (R² ∼ 0.1) between certain properties of the residues (charge, polarity, etc.) in the neighborhood of the catalytic residues and PromIndex. A stronger relationship is that most proteins with high promiscuity have high percentages of charged and polar residues within a radius of 3 Å of the catalytic site, which is validated using one-tailed hypothesis tests (P-values ∼ 0.05). Since these characteristics are known to be key factors in catalysis, their relationship with the promiscuity index cross-validates the methodology of PROMISE.
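To make the quoted lognormal fit concrete, the following is a minimal sketch that evaluates a lognormal density and two summary statistics. It assumes that μ and σ in the abstract are the mean and standard deviation of the natural logarithm of PromIndex, which the abstract does not state explicitly; the function names are illustrative and are not part of PROMISE.

```python
import math


def lognormal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a lognormal distribution at x (zero for x <= 0)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_median(mu: float) -> float:
    """Median of a lognormal distribution: exp(mu)."""
    return math.exp(mu)


def lognormal_mean(mu: float, sigma: float) -> float:
    """Mean of a lognormal distribution: exp(mu + sigma^2 / 2)."""
    return math.exp(mu + 0.5 * sigma * sigma)


# Parameters quoted in the abstract; their interpretation as log-scale
# mean and standard deviation is an assumption of this sketch.
MU, SIGMA = 0.28, 1.1

if __name__ == "__main__":
    print("median:", lognormal_median(MU))
    print("mean:  ", lognormal_mean(MU, SIGMA))
    print("pdf(1):", lognormal_pdf(1.0, MU, SIGMA))
```

Under this reading the mean exceeds the median, as for any lognormal distribution with σ > 0, which corresponds to a long right tail of highly promiscuous proteins.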

    Heat shock partially dissociates the overlapping modules of the yeast protein-protein interaction network: a systems level model of adaptation

    Network analysis has become a powerful tool in recent years. Heat shock is a well-characterized model of cellular dynamics. S. cerevisiae is an appropriate model organism, since both its protein-protein interaction network (interactome) and its stress response at the gene expression level have been well characterized. However, the reorganization of the yeast interactome during stress has not yet been investigated. We calculated the changes of the interaction weights of the yeast interactome from the changes of mRNA expression levels upon heat shock. The major finding of our study is that heat shock induced a significant decrease in both the overlaps and the connections of yeast interactome modules. In agreement with this, the weighted diameter of the yeast interactome increased 4.9-fold in heat shock. Several key proteins of the heat shock response became centers of heat shock-induced local communities, as well as bridges providing a residual connection of modules after heat shock. The observed changes resemble a "stratus-cumulus" type transition of the interactome structure, since the unstressed yeast interactome had a globally connected organization, similar to that of stratus clouds, whereas the heat-shocked interactome had a multifocal organization, similar to that of cumulus clouds. Our results showed that heat shock induces a partial disintegration of the global organization of the yeast interactome. This change may be rather general, occurring in many types of stress. Moreover, other complex systems, such as single proteins, social networks and ecosystems, may also decrease their inter-modular links, thus developing more compact modules, and display a partial disintegration of their global structure in the initial phase of crisis.
Thus, our work may provide a model of a general, system-level adaptation mechanism to environmental changes.
Comment: 24 pages, 6 figures, 2 tables, 70 references + 22 pages 8 figures, 4 tables and 8 references in the enclosed Supplemen
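The abstract does not give the formula used to derive interaction weights from expression changes. As an illustration only, the sketch below assumes a simple rule: each baseline weight is scaled by the mean of the two partners' expression ratios (stressed over unstressed). The function name, the averaging rule, and the protein labels are all hypothetical and should not be read as the authors' method.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]


def reweight_interactions(
    baseline: Dict[Edge, float],
    expression_ratio: Dict[str, float],
    default_ratio: float = 1.0,
) -> Dict[Edge, float]:
    """Scale each interaction weight by the mean expression ratio of its
    two partners. ASSUMPTION: this averaging rule is illustrative; the
    source abstract does not specify the weighting scheme."""
    reweighted = {}
    for (a, b), weight in baseline.items():
        # Proteins without expression data keep their baseline contribution.
        ratio_a = expression_ratio.get(a, default_ratio)
        ratio_b = expression_ratio.get(b, default_ratio)
        reweighted[(a, b)] = weight * (ratio_a + ratio_b) / 2.0
    return reweighted


if __name__ == "__main__":
    # Hypothetical toy network: P1 is induced 4-fold, P2 is halved.
    baseline = {("P1", "P2"): 1.0, ("P2", "P3"): 2.0}
    ratios = {"P1": 4.0, "P2": 0.5}
    print(reweight_interactions(baseline, ratios))
```

Any weighted network measure, such as the weighted diameter mentioned in the abstract, could then be compared between the baseline and reweighted edge sets.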

    Evolution of protein domain architectures

    This chapter reviews current research on how protein domain architectures evolve. We begin by summarizing work on the phylogenetic distribution of proteins, as this directly impacts which domain architectures can be formed in different species. Studies relating domain family size to occurrence have shown that they generally follow power law distributions, both within genomes and within larger evolutionary groups. These findings were subsequently extended to multi-domain architectures. Genome evolution models that have been suggested to explain the shape of these distributions are reviewed, as well as evidence for selective pressure to expand certain domain families more than others. Each domain has an intrinsic combinatorial propensity, and the effects of this have been studied using measures of domain versatility or promiscuity. Next, we study the principles of protein domain architecture evolution and how these have been inferred from distributions of extant domain arrangements. Following this, we review inferences of ancestral domain architecture and the conclusions concerning domain architecture evolution mechanisms that can be drawn from these. Finally, we examine whether all known cases of a given domain architecture can be assumed to have a single common origin (monophyly) or have evolved convergently (polyphyly). We end with a discussion of some available tools for computational analysis or exploitation of protein domain architectures and their evolution.

    Dynamics and Adaptive Benefits of Protein Domain Emergence and Arrangements during Plant Genome Evolution

    Plant genomes are generally very large, mostly paleopolyploid, and have numerous gene duplicates and complex genomic features such as repeats and transposable elements. Many of these features have been hypothesized to enable plants, which cannot easily escape environmental challenges, to rapidly adapt. Another mechanism, which has recently been well described as a major facilitator of rapid adaptation in bacteria, animals, and fungi but not yet for plants, is modular rearrangement of protein-coding genes. Due to the high precision of profile-based methods, rearrangements can be well captured at the protein level by characterizing the emergence, loss, and rearrangements of protein domains, their structural, functional, and evolutionary building blocks. Here, we study the dynamics of domain rearrangements and explore their adaptive benefit in 27 plant and 3 algal genomes. We use a phylogenomic approach by which we can explain the formation of 88% of all arrangements by single-step events, such as fusion, fission, and terminal loss of domains. We find many domains are lost along every lineage, but at least 500 domains are novel, that is, they are unique to green plants and emerged more or less recently. These novel domains duplicate and rearrange more readily within their genomes than ancient domains and are overproportionally involved in stress response and developmental innovations. Novel domains more often affect regulatory proteins and show a higher degree of structural disorder than ancient domains. Whereas a relatively large and well-conserved core set of single-domain proteins exists, long multi-domain arrangements tend to be species-specific. We find that duplicated genes are more often involved in rearrangements. Although fission events typically impact metabolic proteins, fusion events often create new signaling proteins essential for environmental sensing. 
Taken together, the high volatility of single domains and complex arrangements in plant genomes demonstrates the importance of modularity for the environmental adaptability of plants.

    The Diversification of the LIM Superclass at the Base of the Metazoa Increased Subcellular Complexity and Promoted Multicellular Specialization

    Background: Throughout evolution, the LIM domain has been deployed in many different domain configurations, which has led to the formation of a large and distinct group of proteins. LIM proteins are involved in relaying stimuli received at the cell surface to the nucleus in order to regulate cell structure, motility, and division. Despite their fundamental roles in cellular processes and human disease, little is known about the evolution of the LIM superclass. Results: We have identified and characterized all known LIM domain-containing proteins in six metazoans and three nonmetazoans. In addition, we performed a phylogenetic analysis on all LIM domains and, in the process, have identified a number of novel non-LIM domains and motifs in each of these proteins. Based on these results, we have formalized a classification system for LIM proteins, provided reasonable timing for class and family origin events, and identified lineage-specific loss events. Our analysis is the first detailed description of the full set of LIM proteins from the non-bilaterian species examined in this study. Conclusion: Six of the 14 LIM classes originated in the stem lineage of the Metazoa. The expansion of the LIM superclass at the base of the Metazoa undoubtedly contributed to the increase in subcellular complexity required for the transition from a unicellular to multicellular lifestyle and, as such, was a critically important event in the history of animal multicellularity.

    Triangle network motifs predict complexes by complementing high-error interactomes with structural information

    Background: Many high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second-level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles. Results: We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from the Munich Information Center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that an SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes. Conclusion: Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs.
We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN.
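The triangle construction described in this abstract can be sketched as follows. This is a minimal illustration, assuming that SDDIs have already been mapped to pairs of proteins and that both edge lists are undirected; the function names and the toy protein labels are hypothetical. The sketch does not exclude pairs that also interact directly, since the abstract does not say whether such pairs count as second-level neighbors.

```python
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple

Pair = Tuple[str, str]


def build_adjacency(edges: Iterable[Pair]) -> Dict[str, Set[str]]:
    """Undirected adjacency sets from an edge list (self-loops ignored)."""
    adjacency: Dict[str, Set[str]] = {}
    for a, b in edges:
        if a == b:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def ppi_sddi_ppi_triangles(
    ppi_edges: Iterable[Pair], sddi_edges: Iterable[Pair]
) -> Set[Tuple[str, str, str]]:
    """Return (a, hub, c) triples where a-hub and hub-c are PPIs and the
    second-level neighbors a and c are linked by an SDDI."""
    ppi = build_adjacency(ppi_edges)
    sddi = {frozenset(edge) for edge in sddi_edges if edge[0] != edge[1]}
    triangles = set()
    for hub, partners in ppi.items():
        # Every pair of partners of the same hub is two PPIs apart.
        for a, c in combinations(sorted(partners), 2):
            if frozenset((a, c)) in sddi:
                triangles.add((a, hub, c))
    return triangles


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    ppis = [("A", "B"), ("B", "C"), ("C", "D")]
    sddis = [("A", "C"), ("B", "D")]
    print(sorted(ppi_sddi_ppi_triangles(ppis, sddis)))
```

Replacing `sddi_edges` with another pairwise evidence source, such as co-occurrence pairs, gives the mixed-datatype variant mentioned in the abstract.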

    Exportation of Monkeypox Virus From the African Continent.

    BACKGROUND: The largest West African monkeypox outbreak began in September 2017, in Nigeria. Four individuals traveling from Nigeria to the United Kingdom (n = 2), Israel (n = 1), and Singapore (n = 1) became the first human monkeypox cases exported from Africa, and a related nosocomial transmission event in the United Kingdom became the first confirmed human-to-human monkeypox transmission event outside of Africa. METHODS: Epidemiological and molecular data for exported and Nigerian cases were analyzed jointly to better understand the exportations in the temporal and geographic context of the outbreak. RESULTS: Isolates from all travelers and a Bayelsa case shared a most recent common ancestor, and the travelers had traveled to Bayelsa, Delta, or Rivers states. Genetic variation for this cluster was lower than would be expected from a random sampling of genomes from this outbreak, but data did not support direct links between travelers. CONCLUSIONS: Monophyly of exportation cases and the Bayelsa sample, along with the intermediate levels of genetic variation, suggests that a small pool of related isolates is the likely source for the exported infections. This may be the result of the level of genetic variation present in monkeypox isolates circulating within the contiguous region of Bayelsa, Delta, and Rivers states, or another more restricted, yet unidentified source pool.