1,511 research outputs found

    Engineering biological networks using cooperative transcriptional assembly

    Eukaryotic genes are often regulated by multivalent transcription factor (TF) complexes. Through the process of cooperative self-assembly, these complexes carry out non-linear regulatory operations involved in cellular decision-making and signal processing. In this thesis, we apply this natural design principle to artificial networks, testing whether engineered cooperative TF assemblies can be used to program non-linear synthetic circuit behavior in yeast. Using a model-guided approach, we show that specifying the strength and number of interactions in an assembly enables predictive tuning between regimes of linear and non-linear regulatory response for single- and multi-input circuits. We demonstrate that synthetic assemblies can be adjusted to control circuit dynamics, shaping the timing of activation. We harness this capability to engineer circuits that perform dynamic filtering, enabling frequency-dependent decoding in cell populations. Through this work, we find that cooperative assembly provides a versatile way to tune the nonlinearity of network connections, dramatically expanding the range of engineerable behaviors available to synthetic circuits. We then extend our modeling framework to predict genome-wide binding of our TF assemblies and find that cooperative complexes made of weakly interacting proteins can reduce unintended activation of endogenous genes. Thus, we are able to introduce synthetic regulatory components with low fitness costs on the cell, ensuring long-term stability of our integrated circuits. Taken together, this dissertation outlines a synthetic framework for building cooperative transcriptional complexes in vivo in order to engineer complex regulatory behaviors that are functionally orthogonal to the host cell.
    2019-10-22T00:00:00
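
The tuning between linear and non-linear response described in this abstract can be illustrated with a Hill function, a common phenomenological stand-in for cooperative binding. This is a minimal sketch and not the model used in the thesis: the function names, the half-maximal constant `k_half`, and the use of the Hill coefficient `n` as a proxy for the number and strength of interactions are all assumptions made for illustration.

```python
def hill_response(tf, k_half, n):
    """Fractional activation at TF concentration `tf` (Hill function).

    k_half: concentration giving half-maximal activation (assumed parameter).
    n:      Hill coefficient; n = 1 is non-cooperative, larger n is steeper.
    """
    return tf**n / (k_half**n + tf**n)


def input_span_10_to_90(n):
    """Fold change in input needed to move from 10% to 90% activation.

    Solving r = x^n / (K^n + x^n) for x gives x = K * (r / (1 - r))**(1/n),
    so the ratio x(0.9) / x(0.1) equals 81**(1/n), independent of K.
    """
    return 81.0 ** (1.0 / n)


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, hill_response(0.5, 1.0, n), input_span_10_to_90(n))
```

A smaller input span means a more switch-like response, which is one simple way to quantify the shift from a graded regime to a non-linear one.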

    Gene synthesis, cloning, expression, purification and biophysical characterization of the C2 domain of human tensin

    Tensin is a large docking protein found in the adhesive junctions of animal cells and recruited early in the development of cell-substrate contacts. There it binds to the cytoplasmic domain of integrin β1 and caps the barbed ends of filamentous actin. This forms a rational basis for its implication in a direct role in the mechanics of membrane-cytoskeleton interactions. Tensin provides a physical link between the actin cytoskeleton, integrins, and other proteins at the cell-substrate contacts. Its overall biochemical properties are a function of its domain composition and architecture, i.e., the domains that are present and their relative positions in the molecule, and specific details of amino acid sequence and post-translational modifications. Tensin can be used as an investigative tool to help explain the physiology of cell-substrate contacts at the molecular scale. The C2 domain indicated to be present at the N-terminus of tensin, along with the phosphatase domain, bears a close homology to PTEN (Phosphatase and Tensin Homolog), a well-known tumor suppressor. The main research objective was to study the structural features of the C2 domain of human tensin as well as to investigate its thermal stability. For this purpose, the C2 domain gene, after being synthesized by Polymerase Chain Reaction, was cloned into the engineered pET-14b vector that would also encode a hexa-Histidine tag. The C2 domain gene was then overexpressed in E. coli and the inclusion bodies thus formed were solubilized and lysed. Ni-NTA metal affinity chromatography was performed to obtain the purified C2 domain. Circular dichroism was used for the initial study of the structural features of the purified C2 domain. After the purified C2 domain was determined to be unstable, refolding was attempted. Later studies of the domain by circular dichroism in the far-UV and near-UV wavelengths indicated that the domain unfolded gradually with increasing amounts of the denaturant GuHCl and also that it retained some amount of tertiary structure prior to denaturation. The absence of an endothermic peak in the Differential Scanning Calorimetry experiments only suggests the need for further extensive refolding methods not attempted in this work and might also indicate the need for the presence of a phosphatase domain at its N-terminus for thermodynamic stability. The results of this work suggest the presence of a C2 domain in human tensin, which has not been documented previously. Determination of all the structural and functional characteristics of the C2 domain in the long run will contribute to the understanding of the role of tensin in tumor suppression and cell signaling.

    Big data analytics in computational biology and bioinformatics

    Big data analytics in computational biology and bioinformatics refers to an array of operations including biological pattern discovery, classification, prediction, inference, clustering as well as data mining in the cloud, among others. This dissertation addresses big data analytics by investigating two important operations, namely pattern discovery and network inference. The dissertation starts by focusing on biological pattern discovery at a genomic scale. Research reveals that the secondary structure in non-coding RNA (ncRNA) is more conserved during evolution than its primary nucleotide sequence. Using a covariance model approach, the stems and loops of an ncRNA secondary structure are represented as a statistical image against which an entire genome can be efficiently scanned for matching patterns. The covariance model approach is then further extended, in combination with a structural clustering algorithm and a random forests classifier, to perform genome-wide search for similarities in ncRNA tertiary structures. The dissertation then presents methods for gene network inference. Vast bodies of genomic data containing gene and protein expression patterns are now available for analysis. One challenge is to apply efficient methodologies to uncover more knowledge about the cellular functions. Very little is known concerning how genes regulate cellular activities. A gene regulatory network (GRN) can be represented by a directed graph in which each node is a gene and each edge or link is a regulatory effect that one gene has on another gene. By evaluating gene expression patterns, researchers perform in silico data analyses in systems biology, in particular GRN inference, where “reverse engineering” involves predicting how a system works by looking at the system output alone. Many algorithmic and statistical approaches have been developed to computationally reverse engineer biological systems. However, there are no known bioinformatics tools capable of performing perfect GRN inference. Here, extensive experiments are conducted to evaluate and compare recent bioinformatics tools for inferring GRNs from time-series gene expression data. Standard performance metrics for these tools based on both simulated and real data sets are generally low, suggesting that further efforts are needed to develop more reliable GRN inference tools. It is also observed that using multiple tools together can help identify true regulatory interactions between genes, a finding consistent with those reported in the literature. Finally, the dissertation discusses and presents a framework for parallelizing GRN inference methods using Apache Hadoop in a cloud environment.
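
The directed-graph representation of a GRN and the idea of using multiple inference tools together can be sketched as below. This is only an illustration under stated assumptions: the tool names (`tool_a`, `tool_b`, `tool_c`), the gene names, the toy edge sets, and the simple majority-vote rule are all hypothetical and are not taken from the dissertation.

```python
from collections import Counter


def consensus_edges(predictions, min_votes=2):
    """Keep directed edges (regulator, target) predicted by >= min_votes tools."""
    votes = Counter(
        edge for edges in predictions.values() for edge in set(edges)
    )
    return {edge for edge, count in votes.items() if count >= min_votes}


def precision_recall(predicted, truth):
    """Precision and recall of a predicted edge set against a reference set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Made-up predictions from three hypothetical tools; each edge is directed.
predictions = {
    "tool_a": {("g1", "g2"), ("g2", "g3"), ("g3", "g1")},
    "tool_b": {("g1", "g2"), ("g2", "g3"), ("g1", "g3")},
    "tool_c": {("g1", "g2"), ("g3", "g2")},
}
# Made-up reference network used only to score the toy predictions.
truth = {("g1", "g2"), ("g2", "g3")}

consensus = consensus_edges(predictions)

if __name__ == "__main__":
    for name, edges in predictions.items():
        print(name, precision_recall(edges, truth))
    print("consensus", precision_recall(consensus, truth))
```

In this toy case each tool reports at least one spurious edge, while the edges shared by two or more tools coincide with the reference network. Real data would rarely separate this cleanly.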

    Functional Identification and Characterization of cis-Regulatory Elements

    Transcription is regulated through interactions between regulatory proteins, such as transcription factors (TFs), and DNA sequence. It is known that TFs act combinatorially in some cases to regulate transcription, but in which situations and to what degree is unclear. I first studied the contribution of TF binding sites to expression in mouse embryonic stem (ES) cells by using synthetic cis-regulatory elements (CREs). The synthetic CREs were composed of combinations of binding sites for the pluripotency TFs Oct4, Sox2, Klf4, and Esrrb. A statistical thermodynamic model explained 72% of the variation in expression driven by these CREs. The high predictive power of this model depended on five TF interaction parameters, including favorable heterotypic interactions between Oct4 and Sox2, Klf4 and Sox2, and Klf4 and Esrrb. The model also included two unfavorable homotypic interaction parameters. These homotypic parameters help to explain the fact that synthetic CREs with mixtures of binding sites for various TFs drive much higher expression than multiple binding sites for the same TF. I then found that the expression of these synthetic CREs changes substantially as ES cells differentiate down the neural lineage. However, CREs with no repeat binding sites drove similar levels of expression, suggesting that heterotypic interactions may be similar in the two conditions. In a separate set of experiments I interrogated the determinants of expression driven by genomic sequences previously segmented into classes based on chromatin features. A set of these sequences was assayed in K562 cells. As expected, we found that Enhancers and Weak Enhancers drove expression over background, while Repressed elements and Enhancers from another cell type did not. Unexpectedly, we found that Weak Enhancers drove higher expression than Enhancers, possibly based on their lower H3K36me3 and H3K27ac, which we found to be weakly associated with lower expression. Using a logistic regression model, we showed that matches to TF binding motifs were best able to predict active sequences, but chromatin features contributed significantly as well. These results demonstrate that interactions between certain combinations of pluripotency TFs, but not all combinations, are important to transcriptional regulation. Furthermore, chromatin modifications can still contribute to predictions of expression even after accounting for binding site motifs. Better understanding of the process of cis-regulation will allow us to predict which sequences can drive expression and how perturbations affect this expression.
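
How an interaction parameter enters a statistical thermodynamic model can be shown with a textbook two-site case. This is a minimal sketch and not the model fitted in the work above: the function name, the dimensionless binding weights `q_a` and `q_b` (concentration divided by dissociation constant), and the interaction factor `omega` are assumptions for illustration.

```python
def both_sites_bound(q_a, q_b, omega):
    """Probability that both binding sites are occupied in a two-site model.

    The four states and their statistical weights are:
      empty          -> 1
      only A bound   -> q_a
      only B bound   -> q_b
      A and B bound  -> omega * q_a * q_b
    omega > 1 models a favorable interaction, omega < 1 an unfavorable one,
    and omega = 1 means the two factors bind independently.
    """
    partition = 1.0 + q_a + q_b + omega * q_a * q_b
    return omega * q_a * q_b / partition


if __name__ == "__main__":
    for omega in (0.1, 1.0, 10.0):
        print(omega, both_sites_bound(1.0, 1.0, omega))
```

With identical binding weights, changing only `omega` raises or lowers joint occupancy, which is the sense in which favorable heterotypic and unfavorable homotypic terms can shift predicted expression.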

    Mouse cytoplasmic dynein intermediate chains: identification of new isoforms, alternative splicing and tissue distribution of transcripts

    BACKGROUND: Intracellular transport of cargoes including organelles, vesicles, signalling molecules, protein complexes, and RNAs, is essential for normal function of eukaryotic cells. The cytoplasmic dynein complex is an important motor that moves cargoes along microtubule tracks within the cell. In mammals this multiprotein complex includes dynein intermediate chains 1 and 2 which are encoded by two genes, Dync1i1 and Dync1i2. These proteins are involved in dynein cargo binding, and dynein complexes with different intermediate chains bind to specific cargoes, although the mechanisms to achieve this are not known. The DYNC1I1 and DYNC1I2 proteins are translated from different splice isoforms, and specific forms of each protein are essential for the function of different dynein complexes in neurons. METHODOLOGY/PRINCIPAL FINDINGS: Here we have undertaken a systematic survey of the dynein intermediate chain splice isoforms in mouse, basing our study on mRNA expression patterns in a range of tissues, and on bioinformatics analysis of mouse, rat and human genomic and cDNA sequences. We found a complex pattern of alternative splicing of both dynein intermediate chain genes, with maximum complexity in the embryonic and adult nervous system. We have found novel transcripts, including some with orthologues in human and rat, and a new promoter and alternative non-coding exon 1 for Dync1i2. CONCLUSIONS/SIGNIFICANCE: These data, including the cloned isoforms, will be essential for understanding the role of intermediate chains in the cytoplasmic dynein complex, particularly their role in cargo binding within individual tissues including different brain regions.

    The role of rare codons in protein expression

    That the flow of information from gene sequence to protein sequence depends on the translation of a code that could literally be described as digital is a truly incredible feat of nature. However, the process of translation is a noisy, stochastic, kinetic process that depends on many factors. The redundancy in the genetic code allows the transmission of additional, analogue information by varying some of these factors. How organisms use the redundancy is termed codon usage, and rare codons are those that are typically shunned in favour of other synonymous options. Synonymous variations to the codon usage pattern of a gene have been linked to disease, and can have huge effects on the functionality and quantity of protein produced from a gene, but the nature of these variations is complex and poorly understood. In some cases, rare codons appear to have a beneficial influence on expression. This thesis investigates the phenomenon of rare codons and attempts to elucidate their evolutionary role in optimal gene expression. It begins with the design of a novel statistical algorithm, which is used to generate a dataset of interesting genetic locations. The dataset is the subject of a hypothesis-driven investigation to discover meaningful biological correlates, and this is complemented by experimental work, to attempt to provide conclusive validation of the approach.
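
The notion of codon usage and of a rare codon can be made concrete by counting synonymous codons in a coding sequence. This sketch is not the statistical algorithm developed in the thesis: it covers only the six leucine codons of the standard genetic code, and the function names, the toy sequence, and the rarity threshold are assumptions chosen for illustration.

```python
from collections import Counter

# The six synonymous codons for leucine in the standard genetic code.
LEUCINE_CODONS = ("TTA", "TTG", "CTT", "CTC", "CTA", "CTG")


def codon_counts(cds):
    """Count in-frame codons in a coding sequence, ignoring a trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def relative_usage(counts, synonymous):
    """Fraction of each codon among all occurrences of its synonymous set."""
    total = sum(counts[codon] for codon in synonymous)
    if total == 0:
        return {codon: 0.0 for codon in synonymous}
    return {codon: counts[codon] / total for codon in synonymous}


def rare_codons(usage, threshold=0.2):
    """Codons whose relative usage falls below an arbitrary, assumed threshold."""
    return sorted(codon for codon, freq in usage.items() if freq < threshold)


# Made-up sequence: start codon, nine CTG, one TTA, stop codon.
toy_cds = "ATG" + "CTG" * 9 + "TTA" + "TAA"
usage = relative_usage(codon_counts(toy_cds), LEUCINE_CODONS)

if __name__ == "__main__":
    print(usage)
    print(rare_codons(usage))
```

A genome-scale analysis would compute such frequencies over many genes and all amino acids, but the per-residue normalisation shown here is the basic quantity involved.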

    Women in Science 2012

    The summer of 2012 saw the number of students seeking summer research experiences with a faculty mentor reaching record levels. In total, 179 students participated in the Summer Undergraduate Research Fellows (SURF) program, involving 59 faculty mentor-advisors, representing all of the Clark Science Center’s fourteen departments and programs.