Buffered Qualitative Stability explains the robustness and evolvability of transcriptional networks
The gene regulatory network (GRN) is the central decision-making module of the cell. We have developed a theory called Buffered Qualitative Stability (BQS) based on the hypothesis that GRNs are organised so that they remain robust in the face of unpredictable environmental and evolutionary changes. BQS makes strong and diverse predictions about the network features that allow stable responses under arbitrary perturbations, including the random addition of new connections. We show that the GRNs of E. coli, M. tuberculosis, P. aeruginosa, yeast, mouse, and human all verify the predictions of BQS. BQS explains many of the small- and large-scale properties of GRNs, provides conditions for evolvable robustness, and highlights general features of transcriptional response. BQS is severely compromised in a human cancer cell line, suggesting that loss of BQS might underlie the phenotypic plasticity of cancer cells, and highlighting a possible sequence of GRN alterations concomitant with cancer initiation. DOI: http://dx.doi.org/10.7554/eLife.02863.00
Universal attenuators and their interactions with feedback loops in gene regulatory networks
Using a combination of mathematical modelling, statistical simulation and large-scale data analysis we study the properties of linear regulatory chains (LRCs) within gene regulatory networks (GRNs). Our modelling indicates that downstream genes embedded within LRCs are highly insulated from the variation in expression of upstream genes, and thus LRCs act as attenuators. This observation implies a progressively weaker functionality of LRCs as their length increases. When analyzing the preponderance of LRCs in the GRNs of Escherichia coli K12 and several other organisms, we find that very long LRCs are essentially absent. In both E. coli and M. tuberculosis we find that four-gene LRCs are intimately linked to identical feedback loops that are involved in potentially chaotic stress response, indicating that the dynamics of these potentially destabilising motifs are strongly restrained under homeostatic conditions. The same relationship is observed in a human cancer cell line (K562), and we postulate that four-gene LRCs act as 'universal attenuators'. These findings suggest a role for long LRCs in dampening variation in gene expression, thereby protecting cell identity, and in controlling dramatic shifts in cell-wide gene expression through inhibiting chaos-generating motifs.
Identification of 2R-ohnologue gene families displaying the same mutation-load skew in multiple cancers
The complexity of signalling pathways was boosted at the origin of the vertebrates, when two rounds of whole genome duplication (2R-WGD) occurred. Those genes and proteins that have survived from the 2R-WGD, termed 2R-ohnologues, belong to families of two to four members, and are enriched in signalling components relevant to cancer. Here, we find that while only approximately 30% of human transcript-coding genes are 2R-ohnologues, they carry 42–60% of the gene mutations in 30 different cancer types. Across a subset of cancer datasets, including melanoma, breast, lung adenocarcinoma, liver and medulloblastoma, we identified 673 2R-ohnologue families in which one gene carries mutations at multiple positions, while sister genes in the same family are relatively mutation free. Strikingly, in 315 of the 322 2R-ohnologue families displaying such a skew in multiple cancers, the same gene carries the heaviest mutation load in each cancer, and usually the second-ranked gene is also the same in each cancer. Our findings inspire the hypothesis that in certain cancers, heterogeneous combinations of genetic changes impair parts of the 2R-WGD signalling networks and force information flow through a limited set of oncogenic pathways in which specific non-mutated 2R-ohnologues serve as effectors. The non-mutated 2R-ohnologues are therefore potential therapeutic targets. These include proteins linked to growth factor signalling, neurotransmission and ion channels.
Robust And Scalable Learning Of Complex Dataset Topologies Via ElPiGraph
Large datasets represented by multidimensional data point clouds often
possess non-trivial distributions with branching trajectories and excluded
regions, with the recent single-cell transcriptomic studies of developing
embryo being notable examples. Reducing the complexity and producing compact
and interpretable representations of such data remains a challenging task. Most
of the existing computational methods are based on exploring the local data
point neighbourhood relations, a step that can perform poorly in the case of
multidimensional and noisy data. Here we present ElPiGraph, a scalable and
robust method for approximation of datasets with complex structures which does
not require computing the complete data distance matrix or the data point
neighbourhood graph. This method is able to withstand high levels of noise and
is capable of approximating complex topologies via principal graph ensembles
that can be combined into a consensus principal graph. ElPiGraph deals
efficiently with large and complex datasets in various fields from biology,
where it can be used to infer gene dynamics from single-cell RNA-Seq, to
astronomy, where it can be used to explore complex structures in the
distribution of galaxies.
Comment: 32 pages, 14 figures
Inevitability and containment of replication errors for eukaryotic genome lengths spanning Megabase to Gigabase
The replication of DNA is initiated at particular sites on the genome called replication origins (ROs). Understanding the constraints that regulate the distribution of ROs across different organisms is fundamental for quantifying the degree of replication errors and their downstream consequences. Using a simple probabilistic model, we generate a set of predictions on the extreme sensitivity of error rates to the distribution of ROs, and how this distribution must therefore be tuned for genomes of vastly different sizes. As genome size changes from megabases to gigabases, we predict that regularity of RO spacing is lost, that large gaps between ROs dominate error rates but are heavily constrained by the mean stalling distance of replication forks, and that, for genomes spanning ~100 megabases to ~10 gigabases, errors become increasingly inevitable but their number remains very small (three or fewer). Our theory predicts that the number of errors becomes significantly higher for genome sizes greater than ~10 gigabases. We test these predictions against datasets in yeast, Arabidopsis, Drosophila, and human, and also through direct experimentation on two different human cell lines. Agreement of theoretical predictions with experiment and datasets is found in all cases, resulting in a picture of great simplicity, whereby the density and positioning of ROs explain the replication error rates for the entire range of eukaryotes for which data are available. The theory highlights three domains of error rates: negligible (yeast), tolerable (metazoan), and high (some plants), with the human genome at the extreme end of the middle domain.
Thymic involution and rising disease incidence with age
For many cancer types, incidence rises rapidly with age as an apparent power law, supporting the idea that cancer is caused by a gradual accumulation of genetic mutations. Similarly, the incidence of many infectious diseases strongly increases with age. Here, combining data from immunology and epidemiology, we show that many of these dramatic age-related increases in incidence can be modeled based on immune system decline, rather than mutation accumulation. In humans, the thymus atrophies from infancy, resulting in an exponential decline in T cell production with a half-life of ~16 years, which we use as the basis for a minimal mathematical model of disease incidence. Our model outperforms the power law model with the same number of fitting parameters in describing cancer incidence data across a wide spectrum of different cancers, and provides excellent fits to infectious disease data. This framework provides mechanistic insight into cancer emergence, suggesting that age-related decline in T cell output is a major risk factor.
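As a rough illustration of the exponential decline mentioned in the abstract above, the sketch below computes the fraction of T cell production remaining at a given age, assuming a pure exponential decay with a 16-year half-life. The function name and the choice to normalise production to 1 at age 0 are assumptions made here for illustration; the abstract does not give the incidence model itself, so none is reproduced.

```python
import math

# Half-life of T cell production, taken from the abstract (~16 years).
HALF_LIFE_YEARS = 16.0


def relative_t_cell_production(age_years: float,
                               half_life: float = HALF_LIFE_YEARS) -> float:
    """Fraction of age-0 T cell production remaining at a given age.

    Hypothetical helper: assumes pure exponential decay,
    production(t) = exp(-ln(2) * t / half_life), normalised to 1 at t = 0.
    """
    decay_rate = math.log(2) / half_life  # per-year decay constant
    return math.exp(-decay_rate * age_years)


if __name__ == "__main__":
    for age in (0, 16, 32, 64):
        print(age, relative_t_cell_production(age))
```

Under this assumption, production halves every 16 years, so it falls to one quarter by age 32 and one sixteenth by age 64.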
Unreplicated DNA remaining from unperturbed S phases passes through mitosis for resolution in daughter cells
To prevent rereplication of genomic segments, the eukaryotic cell cycle is divided into two nonoverlapping phases. During late mitosis and G1, replication origins are "licensed" by loading MCM2-7 double hexamers, and during S phase licensed replication origins activate to initiate bidirectional replication forks. Replication forks can stall irreversibly, and if two converging forks stall with no intervening licensed origin (a "double fork stall", DFS), replication cannot be completed by conventional means. We previously showed how the distribution of replication origins in yeasts promotes complete genome replication even in the presence of irreversible fork stalling. This analysis predicts that DFSs are rare in yeasts but highly likely in large mammalian genomes. Here we show that complementary strand synthesis in early mitosis, ultrafine anaphase bridges, and G1-specific p53-binding protein 1 (53BP1) nuclear bodies provide a mechanism for resolving unreplicated DNA at DFSs in human cells. When origin number was experimentally altered, the number of these structures closely agreed with theoretical predictions of DFSs. 53BP1 is preferentially bound to larger replicons, where the probability of DFSs is higher. Loss of 53BP1 caused hypersensitivity to licensing inhibition when replication origins were removed. These results provide a striking convergence of experimental and theoretical evidence that unreplicated DNA can pass through mitosis for resolution in the following cell cycle.