    Biophysical Fitness Landscapes for Transcription Factor Binding Sites

    Evolutionary trajectories and phenotypic states available to cell populations are ultimately dictated by intermolecular interactions between DNA, RNA, proteins, and other molecular species. Here we study how evolution of gene regulation in the single-celled eukaryote S. cerevisiae is affected by the interactions between transcription factors (TFs) and their cognate genomic sites. Our study is informed by high-throughput in vitro measurements of TF-DNA binding interactions and by a comprehensive collection of genomic binding sites. Using an evolutionary model for monomorphic populations evolving on a fitness landscape, we infer fitness as a function of TF-DNA binding energy for a collection of 12 yeast TFs, and show that the shape of the predicted fitness functions is in broad agreement with a simple thermodynamic model of two-state TF-DNA binding. However, the effective temperature of the model is not always equal to the physical temperature, indicating selection pressures in addition to biophysical constraints caused by TF-DNA interactions. We find little statistical support for the fitness landscape in which each position in the binding site evolves independently, showing that epistasis is common in the evolution of gene regulation. Finally, by correlating TF-DNA binding energies with biological properties of the sites or the genes they regulate, we are able to rule out several scenarios of site-specific selection, under which binding sites of the same TF would experience a spectrum of selection pressures depending on their position in the genome. These findings argue for the existence of universal fitness landscapes which shape evolution of all sites for a given TF, and whose properties are determined in part by the physics of protein-DNA interactions.
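
    As an illustration of the two-state binding picture mentioned above, the following minimal sketch computes site occupancy as a Fermi-Dirac function of binding energy. The function name `occupancy`, the chemical potential `mu`, and the inverse effective temperature `beta` are illustrative assumptions, as are the energy values (taken to be in units of kT); none of them come from the abstract.

```python
import math


def occupancy(energy, mu, beta=1.0):
    """Probability that a site is bound in a two-state (bound/unbound) model.

    energy: binding energy of the site (assumed units of kT; lower = stronger)
    mu:     chemical potential set by TF concentration (hypothetical parameter)
    beta:   inverse effective temperature (beta = 1 is the physical temperature
            in these assumed units)
    """
    return 1.0 / (1.0 + math.exp(beta * (energy - mu)))


# Made-up energies spanning strong to weak binding.
energies = [-4.0, -2.0, 0.0, 2.0, 4.0]

# Occupancy at the physical temperature and at a higher effective temperature.
physical = [occupancy(e, mu=0.0, beta=1.0) for e in energies]
softened = [occupancy(e, mu=0.0, beta=0.5) for e in energies]

for e, p, s in zip(energies, physical, softened):
    print(f"E = {e:+.1f}  beta=1.0: {p:.3f}  beta=0.5: {s:.3f}")
```

    Lowering `beta` flattens the sigmoid, which is one way to picture an effective temperature that differs from the physical one.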

    Photonic Analogue of Two-dimensional Topological Insulators and Helical One-Way Edge Transport in Bi-Anisotropic Metamaterials

    Recent progress in understanding the topological properties of condensed matter has led to the discovery of time-reversal invariant topological insulators. Because of limitations imposed by nature, topologically non-trivial electronic order seems to be uncommon except in small-band-gap semiconductors with strong spin-orbit interactions. In this Article we show that artificial electromagnetic structures, known as metamaterials, provide an attractive platform for designing photonic analogues of topological insulators. We demonstrate that a judicious choice of the metamaterial parameters can create photonic phases that support a pair of helical edge states, and that these edge states enable one-way photonic transport that is robust against disorder. Comment: 13 pages, 3 figures

    Array programming with NumPy.

    Array programming provides a powerful, compact and expressive syntax for accessing, manipulating and operating on data in vectors, matrices and higher-dimensional arrays. NumPy is the primary array programming library for the Python language. It has an essential role in research analysis pipelines in fields as diverse as physics, chemistry, astronomy, geoscience, biology, psychology, materials science, engineering, finance and economics. For example, in astronomy, NumPy was an important part of the software stack used in the discovery of gravitational waves [1] and in the first imaging of a black hole [2]. Here we review how a few fundamental array concepts lead to a simple and powerful programming paradigm for organizing, exploring and analysing scientific data. NumPy is the foundation upon which the scientific Python ecosystem is constructed. It is so pervasive that several projects, targeting audiences with specialized needs, have developed their own NumPy-like interfaces and array objects. Owing to its central position in the ecosystem, NumPy increasingly acts as an interoperability layer between such array computation libraries and, together with its application programming interface (API), provides a flexible framework to support the next decade of scientific and industrial analysis.
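
    A few of the array concepts this abstract refers to (vectorized arithmetic, broadcasting, boolean indexing, and views) can be sketched as follows. The array contents and variable names are made up for illustration and are not taken from the paper.

```python
import numpy as np

# A small 3x4 array of floats: rows are samples, columns are measurements.
data = np.arange(12, dtype=float).reshape(3, 4)

# Reduction along an axis: one mean per column, shape (4,).
col_means = data.mean(axis=0)

# Broadcasting: the (4,) vector is stretched across all 3 rows.
centered = data - col_means

# Boolean-mask indexing: select every element greater than 6.
large = data[data > 6.0]

# Basic slicing returns a view that shares memory with the original array.
first_col_view = data[:, 0]

print(centered)
print(large)
print(np.shares_memory(first_col_view, data))
```

    No explicit Python loop is needed in any of these steps; the operations apply to whole arrays at once.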

    Literature and Education in the Long 1930s

    Biophysical models of evolution

    The recent emergence of quantitative high-throughput experimental technology and new biophysical knowledge may finally enable significant empirical and quantitative understanding of adaptive evolution, which has been elusive for almost a century. The modern aim is to unite classical population genetics with biophysical molecular models, and to connect physical properties of biological molecules such as DNA, RNA and proteins with evolutionary parameters. In this vein, I have studied such population models theoretically, and applied one such model to yeast evolution. In Chapters 2 and 3, I will discuss “universality” in population genetics, in particular the universal applicability of a formula for the steady state distribution of phenotypes in a population evolving in the “monomorphic regime”, which describes most organisms. I show that this formula applies far outside the “weak selection” context it was originally developed in, and that it is a universal feature of evolution in this regime. Such universal features will be important components of any grand theory of adaptive evolution, and are essential for studies of real populations where the microscopic population dynamics are generally unknown. I then apply this model to a particular molecular system in yeast, transcription factor binding sites, which are short DNA sequences that play an important role in gene regulation. Using the functional relationship between evolutionary fitness and the phenotypic steady state distribution, I infer the form of the selective pressure the sites experience, and find it is consistent with a simple thermodynamic model of two-state TF-DNA binding. I also show that the selection pressure a site experiences is decoupled from the selection pressure on the gene it regulates. This suggests that binding sites for a given TF evolve over a universal fitness landscape derived from simple physical interactions. Ph.D. Includes bibliographical references. By Allan M. Haldan

    Limits to detecting epistasis in the fitness landscape of HIV.

    The rapid evolution of HIV is constrained by interactions between mutations which affect viral fitness. In this work, we explore the role of epistasis in determining the mutational fitness landscape of HIV for multiple drug target proteins, including Protease, Reverse Transcriptase, and Integrase. Epistatic interactions between residues modulate the mutation patterns involved in drug resistance, with unambiguous signatures of epistasis best seen in the comparison of the Potts model predicted and experimental HIV sequence "prevalences" expressed as higher-order marginals (beyond triplets) of the sequence probability distribution. In contrast, experimental measures of fitness such as viral replicative capacities generally probe fitness effects of point mutations in a single background, providing weak evidence for epistasis in viral systems. The detectable effects of epistasis are obscured by higher evolutionary conservation at sites. While double mutant cycles, in principle, provide one of the best ways to probe epistatic interactions experimentally without reference to a particular background, we show that the analysis is complicated by the small dynamic range of measurements. Overall, we show that global pairwise interaction Potts models are necessary for predicting the mutational landscape of viral proteins.
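
    To make the double mutant cycle mentioned above concrete, here is a minimal sketch that scores epistasis as the deviation of the double mutant's log-fitness from the sum of the single-mutant effects. The function name and the fitness values are hypothetical, and the log-additive (multiplicative) null model is a common convention assumed here rather than the paper's stated definition.

```python
import math


def double_mutant_epistasis(f_wt, f_a, f_b, f_ab):
    """Epistasis from a double mutant cycle on a log-fitness scale.

    Returns log(f_ab * f_wt / (f_a * f_b)): zero when the two mutations
    combine multiplicatively, positive or negative otherwise.
    """
    return math.log(f_ab) + math.log(f_wt) - math.log(f_a) - math.log(f_b)


# Made-up relative fitness values (e.g. replicative capacities).
no_epistasis = double_mutant_epistasis(1.0, 0.8, 0.5, 0.4)
positive = double_mutant_epistasis(1.0, 0.8, 0.5, 0.6)

print(f"multiplicative case: {no_epistasis:.3f}")
print(f"compensating case:   {positive:.3f}")
```

    When all four fitness values lie in a narrow range, the score is a small difference of similar logarithms, so measurement noise of comparable size can mask it. This is one way to read the abstract's remark about dynamic range.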

    A universal scaling law determines time reversibility and steady state of substitutions under selection

    Monomorphic loci evolve through a series of substitutions on a fitness landscape. Understanding how mutation, selection, and genetic drift drive this process, and uncovering the structure of the fitness landscape from genomic data are two major goals of evolutionary theory. Population genetics models of the substitution process have traditionally focused on the weak-selection regime, which is accurately described by diffusion theory. Predictions in this regime can be considered universal in the sense that many population models exhibit equivalent behavior in the diffusion limit. However, a growing number of experimental studies suggest that strong selection plays a key role in some systems, and thus there is a need to understand universal properties of models without a priori assumptions about selection strength. Here we study time reversibility in a general substitution model of a monomorphic haploid population. We show that for any time-reversible population model, such as the Moran process, substitution rates obey an exact scaling law. For several other irreversible models, such as the simple Wright-Fisher process and its extensions, the scaling law is accurate up to selection strengths that are well outside the diffusion regime. Time reversibility gives rise to a power-law expression for the steady-state distribution of populations on an arbitrary fitness landscape. The steady-state behavior is dominated by weak selection and is thus adequately described by the diffusion approximation, which guarantees universality of the steady-state formula and its applicability to the problem of reconstructing fitness landscapes from DNA or protein sequence data.
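
    The scaling law and power-law steady state described above can be checked numerically for the Moran process. The sketch below uses the standard closed-form Moran fixation probability and assumes symmetric mutation. The exponent `nu = n - 1`, the function names, and the fitness values are assumptions made for illustration; the abstract does not state them.

```python
import math


def moran_fixation(f_old, f_new, n):
    """Fixation probability of a single mutant in a Moran population of size n."""
    r = f_new / f_old
    if math.isclose(r, 1.0):
        return 1.0 / n  # neutral limit
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))


def steady_state(fitness, nu, neutral=None):
    """Power-law steady state: pi_i proportional to neutral_i * fitness_i ** nu."""
    if neutral is None:
        neutral = [1.0] * len(fitness)  # assumed uniform neutral distribution
    weights = [p0 * f ** nu for p0, f in zip(neutral, fitness)]
    total = sum(weights)
    return [w / total for w in weights]


n = 10                    # hypothetical population size
f_a, f_b = 1.0, 1.05      # made-up fitness values of two alleles

# Scaling law: forward/backward fixation ratio equals (f_b / f_a) ** (n - 1).
ratio = moran_fixation(f_a, f_b, n) / moran_fixation(f_b, f_a, n)
pi = steady_state([f_a, f_b], nu=n - 1)

print(ratio, (f_b / f_a) ** (n - 1))
print(pi)
```

    With symmetric mutation, detailed balance requires the ratio of steady-state probabilities to equal the ratio of forward and backward fixation probabilities. In this sketch both are `(f_b / f_a) ** (n - 1)`.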
