2,831 research outputs found

    Real-Time Dynamics of RNA Polymerase II Clustering in Live Human Cells

    Transcription is reported to be spatially compartmentalized in nuclear transcription factories with clusters of RNA polymerase II (Pol II). However, little is known about when these foci assemble or their relative stability. We developed a quantitative single-cell approach to characterize protein spatiotemporal organization, with single-molecule sensitivity in live eukaryotic cells. We observed that Pol II clusters form transiently, with an average lifetime of 5.1 (± 0.4) seconds, which refutes the notion that they are statically assembled substructures. Stimuli affecting transcription yielded orders-of-magnitude changes in the dynamics of Pol II clusters, which implies that clustering is regulated and plays a role in the cell’s ability to effect a rapid response to external signals. Our results suggest that transient crowding of enzymes may aid in rate-limiting steps of gene regulation.

    A Common Class of Transcripts with 5'-Intron Depletion, Distinct Early Coding Sequence Features, and N1-Methyladenosine Modification [preprint]

    Introns are found in 5' untranslated regions (5'UTRs) for 35% of all human transcripts. These 5'UTR introns are not randomly distributed: genes that encode secreted, membrane-bound and mitochondrial proteins are less likely to have them. Curiously, transcripts lacking 5'UTR introns tend to harbor specific RNA sequence elements in their early coding regions. To model and understand the connection between coding-region sequence and 5'UTR intron status, we developed a classifier that can predict 5'UTR intron status with >80% accuracy using only sequence features in the early coding region. Thus, the classifier identifies transcripts with 5' proximal-intron-minus-like-coding regions (5IM transcripts). Unexpectedly, we found that the early coding sequence features defining 5IM transcripts are widespread, appearing in 21% of all human RefSeq transcripts. The 5IM class of transcripts is enriched for non-AUG start codons, more extensive secondary structure both preceding the start codon and near the 5' cap, greater dependence on eIF4E for translation, and association with ER-proximal ribosomes. 5IM transcripts are bound by the Exon Junction Complex (EJC) at non-canonical 5' proximal positions. Finally, N1-methyladenosines are specifically enriched in the early coding regions of 5IM transcripts. Taken together, our analyses point to the existence of a distinct 5IM class comprising ~20% of human transcripts. This class is defined by depletion of 5' proximal introns, presence of specific RNA sequence features associated with low translation efficiency, N1-methyladenosines in the early coding region, and enrichment for non-canonical binding by the Exon Junction Complex.

    Novel computational methods for studying the role and interactions of transcription factors in gene regulation

    Regulation of which genes are expressed and when enables the existence of different cell types sharing the same genetic code in their DNA. Erroneously functioning gene regulation can lead to diseases such as cancer. Gene regulatory programs can malfunction in several ways. Often, if a disease is caused by a defective protein, the cause is a mutation in the gene coding for the protein that renders the protein unable to perform its functions properly. However, protein-coding genes make up only about 1.5% of the human genome, and the majority of all disease-associated mutations discovered reside outside protein-coding genes. The mechanisms of action of these non-coding disease-associated mutations are far less well understood. Binding of transcription factors (TFs) to DNA controls the rate of transcribing genetic information from the coding DNA sequence to RNA. Binding affinities of TFs to DNA have been extensively measured in vitro, for example by SELEX (systematic evolution of ligands by exponential enrichment) and Protein Binding Microarrays (PBMs), and the genome-wide binding locations and patterns of TFs have been mapped in dozens of cell types. Despite this, our understanding of how TF binding to regulatory regions of the genome, promoters and enhancers, leads to gene expression is not at the level where gene expression could be reliably predicted based on DNA sequence only. In this work, we develop and apply computational tools to analyze and model the effects of TF-DNA binding. We also develop new methods for interpreting and understanding deep learning-based models trained on biological sequence data. In biological applications, the ability to understand how machine learning models make predictions is as important as, or even more important than, raw predictive performance. This has created a demand for approaches helping researchers extract biologically meaningful information from deep learning model predictions. 
We develop a novel computational method for determining TF binding sites genome-wide from recently developed high-resolution ChIP-exo and ChIP-nexus experiments. We demonstrate that our method performs similarly to or better than previously published methods while making fewer assumptions about the data. We also describe an improved algorithm for calling allele-specific TF-DNA binding. We utilize deep learning methods to learn features predicting transcriptional activity of human promoters and enhancers. The deep learning models are trained on massively parallel reporter gene assay (MPRA) data from human genomic regulatory elements, designed regulatory elements, and promoters and enhancers selected from a totally random pool of synthetic input DNA. This unprecedentedly large set of measurements of human gene regulatory element activities, in total more than 100 times the size of the human genome, allowed us to train models that were able to predict genomic transcription start site positions more accurately than models trained on genomic promoters, and to correctly predict effects of disease-associated promoter variants. We also found that interactions between promoters and local classical enhancers are non-specific in nature. The MPRA data integrated with extensive epigenetic measurements supports the existence of three different classes of enhancers: classical enhancers, closed chromatin enhancers and chromatin-dependent enhancers. We also show that TFs can be divided into four different, non-exclusive classes based on their activities: chromatin opening, enhancing, promoting and TSS determining TFs. Interpreting the deep learning models of human gene regulatory elements required application of several existing model interpretation tools as well as developing new approaches. Here, we describe two new methods for visualizing features and interactions learned by deep learning models. 
Firstly, we describe an algorithm for testing if a deep learning model has learned an existing binding motif of a TF. Secondly, we visualize mutual information between pairwise k-mer distributions in sample inputs selected according to predictions by a machine learning model. This method highlights pairwise and positional dependencies learned by a machine learning model. We demonstrate the use of this model-agnostic approach with classification and regression models trained on DNA, RNA and amino acid sequences.
Many organisms consist of several different cell types, even though all cells of these organisms carry the same DNA code. Regulation of gene expression makes the different cell types possible. Malfunctioning regulation can lead to diseases, for example the onset of cancer. If a disease is caused by a defective protein, the cause is often a mutation in the gene encoding that protein, which alters the protein so that it can no longer perform its function well enough. However, only 1.5% of the human genome consists of protein-coding genes. Most of the disease-associated mutations discovered lie outside these so-called coding regions. The mechanisms of action of non-coding disease-associated mutations are generally much less well understood than those of mutations in coding regions. Binding of transcription factors to DNA regulates transcription, that is, the reading of the genetic information in genes and its conversion into RNA. Binding of transcription factors to DNA has been measured extensively under in vitro conditions, and the binding sites of many transcription factors have been mapped genome-wide in several cell types. Despite this, our understanding of how transcription factor binding to the regulatory elements of the genome, that is, promoters and enhancers, leads to gene expression is not at a level where we could reliably predict gene expression from DNA sequence alone. 
In this work, we develop and apply computational tools for analyzing and modeling gene expression resulting from transcription factor binding. We also develop new methods for interpreting deep learning models trained on biological sequence data. In biological applications, the interpretability of the predictions made by a machine learning model is usually as important as, if not more important than, raw predictive accuracy alone. This has created a need for new methods that help researchers mine biologically meaningful information from the predictions of deep learning models. In this work we developed a new computational tool for determining transcription factor binding sites genome-wide using measurement data from the recently developed high-resolution ChIP-exo and ChIP-nexus experiments. We show that our method performs better than, or at least as well as, previously published methods while making fewer assumptions about the shape of the signal. We also present an improved algorithm for determining allele-specific binding of transcription factors. We use deep learning methods to learn which features predict the activity of human promoter and enhancer elements. These deep learning models were trained on data from massively parallel reporter gene assays of human genomic regulatory elements, as well as of active promoters and enhancers selected from a random pool of synthetic DNA sequences. This unprecedentedly large set of measurements of human regulatory element activity, covering more than a hundred times as much DNA sequence as the human genome, made it possible to predict the positions of transcription start sites in the human genome more accurately than models trained on the human genome. These models also correctly predicted the effects of disease-associated mutations in human promoters. 
Our results showed that interactions between human promoters and classical local enhancers are non-specific. MPRA data, integrated with extensive epigenetic measurements, made it possible to divide enhancer elements into three classes: classical, closed-chromatin, and chromatin-dependent enhancers. Our study showed that transcription factors can be divided into four partially overlapping classes based on their activities: chromatin-opening, enhancing, promoting, and transcription-start-site-determining transcription factors. Interpreting the deep learning models describing human genomic regulatory elements required both applying existing methods and developing new ones. In this work we developed two new methods for visualizing the features learned by deep learning models and the interactions between them. First, we present an algorithm for testing whether a deep learning model has learned the binding motif of a known transcription factor. Second, we visualize the mutual information of position-specific k-mer distributions in sequences selected on the basis of the deep learning model's predictions. This method reveals the pairwise interactions and position-specific dependencies learned by the deep learning model. We show that our method is independent of model architecture by applying it to both classifiers and regression models trained on DNA, RNA, or amino acid sequence data.
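
The pairwise mutual-information idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes equal-length sequences that have already been selected by some model's predictions; the function name `pairwise_mutual_information` and the toy sequences are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from math import log2


def pairwise_mutual_information(seqs, i, j, k=1):
    """Mutual information (in bits) between the k-mers starting at
    positions i and j, estimated across equal-length sequences."""
    n = len(seqs)
    # Marginal k-mer counts at each position, and joint counts for the pair.
    counts_i = Counter(s[i:i + k] for s in seqs)
    counts_j = Counter(s[j:j + k] for s in seqs)
    counts_ij = Counter((s[i:i + k], s[j:j + k]) for s in seqs)

    mi = 0.0
    for (a, b), c in counts_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) simplifies to c * n / (count_a * count_b).
        mi += p_ab * log2(c * n / (counts_i[a] * counts_j[b]))
    return mi


# Toy inputs (illustrative only): in the first set the two positions always
# co-vary, in the second set they are independent.
dependent = ["AT", "AT", "CG", "CG"]
independent = ["AA", "AC", "CA", "CC"]
mi_dependent = pairwise_mutual_information(dependent, 0, 1)
mi_independent = pairwise_mutual_information(independent, 0, 1)
```

Evaluating the function for every position pair (i, j) would give a matrix that can be drawn as a heat map, where high values mark position pairs whose k-mers co-vary in the selected inputs.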

    Exocyst mutants suppress pollen tube growth and cell wall structural defects of hydroxyproline O‐arabinosyltransferase mutants

    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/1/tpj14808-sup-0003-FigS3.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/9/tpj14808.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/8/tpj14808-sup-0001-FigS1.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/7/tpj14808-sup-0004-FigS4.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/6/tpj14808-sup-0005-FigS5.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/5/tpj14808-sup-0007-FigS7.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/4/tpj14808-sup-0006-FigS6.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/3/tpj14808-sup-0002-FigS2.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156472/2/tpj14808_am.pd

    Development of a CD22-specific chimeric antigen receptor (CAR) for the adoptive T cell therapy of leukemia and lymphoma

    Ex vivo engineering of patient T cells for the specific redirection toward cancer cells is a promising immunotherapeutic strategy to treat hematological malignancies. In this doctoral thesis, a novel CD22 specific chimeric antigen receptor (CAR) was generated for the adoptive T cell therapy of CD22 positive leukemia and lymphoma. The humanized anti-CD22 (hCD22) single chain variable fragment (scFv) was used as antigen binding domain of a third generation CAR, comprising the signal transduction domains CD3ζ, CD28 and 4-1BB. Due to its high affinity and biophysical stability, the hCD22 scFv was compared to the murine anti-CD22 antibody fragment (mCD22) in terms of scFv and CAR stability. Furthermore, to enhance clinical CAR efficacy and CAR safety, the hCD22 CAR was optimized by mutagenesis. Stability experiments revealed that the mCD22 scFv has a high stability in human serum, comparable to its derived hCD22 scFv. CD22 specific activation of T cells expressing the corresponding hCD22 or mCD22 CAR proved biophysical stability of both scFv derived CARs. By mutating the hCD22 CAR human Fc spacer domain (ΔFc), binding to human Fc receptor expressing cells was blocked, thus reducing on-target, off-tumor CAR related toxicity. The blocking of interleukin-2 (IL-2) secretion caused by the LCK mutation introduced in the hCD22 CAR CD28 signaling domain (ΔCD28) needs to be further investigated as absence of IL-2 release was observed for both the parental and the mutated hCD22 CAR variants. Specific CAR T cell activation was observed for the parental, the ΔFc, the ΔCD28 and the double mutated ΔFc-ΔCD28 hCD22 CAR confirming that both introduced mutations did not affect CAR efficacy in vitro. However, the ΔFc-ΔCD28 hCD22 CAR exhibited a slightly lower anti-tumor efficacy in comparison to the ΔFc, the ΔCD28 and the parental hCD22 CAR. By additionally engineering the hCD22 scFv to further improve the stability of the derived ΔFc-ΔCD28 CAR, CAR T cell activation was not enhanced. 
This doctoral thesis provides the basis for the clinical development of a novel CD22 CAR T cell therapy for the treatment of CD22-positive leukemia and lymphoma.

    Cell-specific Gene Expression: Pylorus Morphogenesis and Hedgehog-regulated Enhancers.

    The precise spatiotemporal control of gene expression is integral to the survival of all organisms. Inappropriate gene expression can lead to developmental defects in newborns, such as Infantile Hypertrophic Pyloric Stenosis, in which hypertrophy of pyloric sphincter smooth muscle leads to gastric outlet obstruction. This thesis work analyzes the mechanisms and consequences of cell-specific gene expression in three systems: establishment of the epithelial gastro-duodenal (pyloric) border, development of smooth muscle structures at the pylorus, and transcriptional response to Hedgehog (Hh) signaling in Drosophila. Microarray is used to characterize the antral, pyloric, and duodenal transcriptomes at embryonic days (E) 14.5 and 16.5. At E16.5, hundreds of genes are upregulated specifically in duodenal epithelium. This event is termed intestinalization because the activated genes are associated with intestinal function. Several transcription factors (i.e., Tcfec, Creb3l3, and Hnf4gamma) are upregulated in duodenal epithelium and levels of Hh signaling are downregulated in duodenal mesenchyme. In addition, novel pyloric genes are identified, including Gata3, which encodes a zinc finger transcription factor. A role for Gata3 during pylorus development is elucidated using a genetic model of Gata3 insufficiency. Gata3 and the homeodomain transcription factor Nkx2-5 co-localize with molecular markers of pyloric smooth muscle and are expressed in novel bilateral smooth muscle structures at the pylorus (i.e., the ventral pyloric cords). Loss of Gata3 alters the shape of the pylorus and attenuates the pyloric constriction. The ventral pyloric cords and outer longitudinal smooth muscle at the pylorus are absent in Gata3 null embryos. Gata3 does not control Nkx2-5 expression at the pylorus. An in silico approach identifies Hh-regulated enhancers in Drosophila. 
Binding sites for the Hh transcriptional effector cubitus interruptus (Ci) are significantly clustered in the genomes of two divergent Drosophila species, but mutant Ci sites are not. Putative Hh-regulated enhancers are identified by the comparison of orthologous regions of significant Ci clustering. Two of these enhancers (inv and rdx) are active in Hh-responsive cells of the Drosophila larval imaginal wing disc. These studies reveal novel gene expression patterns during pylorus morphogenesis and suggest an approach to identifying direct transcriptional targets of signaling pathways.
    Ph.D.
    Cell and Developmental Biology
    University of Michigan, Horace H. Rackham School of Graduate Studies
    http://deepblue.lib.umich.edu/bitstream/2027.42/91524/1/udager_1.pd

    Functional Similarity of PRD-Containing Virulence Regulators in Bacillus anthracis

    Bacillus anthracis produces three regulators, AtxA, AcpA, and AcpB, that control virulence gene expression and are members of an emerging class of regulators termed “PCVRs” (Phosphoenolpyruvate-dependent phosphotransferase regulation Domain-Containing Virulence Regulators). AtxA controls expression of the toxin genes lef, cya, and pag, and is the master virulence regulator and archetype PCVR. AcpA and AcpB are less well studied. AcpA and AcpB independently positively control transcription of the capsule biosynthetic operon capBCADE, and culture conditions that enhance AtxA activity result in capBCADE transcription in strains lacking acpA and acpB. RNA-Seq was used to assess the regulons of the paralogs in strains producing individual PCVRs at native levels. Plasmid- and chromosome-borne genes were PCVR-controlled, with AtxA, AcpA, and AcpB having a ≥4-fold effect on transcript levels of 145, 130, and 49 genes, respectively. Several genes were co-regulated by two or three PCVRs. Results from transcriptional reporters of PCVR-regulated promoters fused to promoterless lacZ genes largely mirrored the RNA-Seq data, showing that AtxA alone had activity on Plef-lacZ, and that AcpA and AcpB had more activity than AtxA on PcapB-lacZ. Studies to test the effect of AtxA levels on virulence and sporulation used atxA mutants. A mutant that overexpressed atxA and exhibited elevated AtxA and toxin levels in vitro did not show increased virulence in a murine anthrax infection model. AtxA levels also affected sporulation efficiency. Culture of B. anthracis in medium containing bicarbonate and elevated carbon dioxide increased PCVR activity compared to culture in ambient air in medium lacking bicarbonate. However, neither the solubility nor the stability of the regulators was affected by carbon dioxide concentration. AcpA and AcpB form homomultimers, and multimerization was dependent on the EIIB-like domains, as shown previously for AtxA. 
Heteromultimers of AtxA-AcpA were detected, and in co-expression experiments AcpA activity was reduced by increased levels of AtxA. An AtxA orthologue in Bacillus cereus, AtxA2, had less activity than AtxA from B. anthracis, potentially due to reduced dimer formation. The results provided in this dissertation increase our knowledge of virulence gene expression in B. anthracis, while advancing our understanding of this newly discovered class of transcriptional regulators.

    ER-targeted Intrabodies Mediating Specific In Vivo Knockdown of Transitory Proteins in Comparison to RNAi

    In animals and mammalian cells, protein function can be analyzed by nucleotide sequence-based methods such as gene knockout, targeted gene disruption, CRISPR/Cas, TALEN, zinc finger nucleases, or the RNAi technique. Alternatively, protein knockdown approaches are available based on direct interference of the target protein with the inhibitor

    Fast protein superfamily classification using principal component null space analysis.

    The protein family classification problem, which consists of determining the family memberships of given unknown protein sequences, is important to biologists for many practical reasons, such as drug discovery, prediction of molecular functions and medical diagnosis. Neural networks and Bayesian methods have performed well on the protein classification problem, achieving accuracy ranging from 90% to 98%, but they run relatively slowly in the learning stage. In this thesis, we apply a principal component null space analysis (PCNSA) linear classifier to the problem and report excellent results compared to those of neural networks and support vector machines. The two main parameters of PCNSA are linked to the high dimensionality of the dataset used, and were optimized in an exhaustive manner to maximize accuracy. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .F74. Source: Masters Abstracts International, Volume: 44-03, page: 1400. Thesis (M.Sc.)--University of Windsor (Canada), 2005
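
For orientation, the following is a rough sketch of a PCNSA-style classifier as the method is usually described: a PCA projection, then a per-class approximate null space spanned by the lowest-variance directions, with a query assigned to the class whose mean it is closest to inside that null space. The names `fit_pcnsa` and `predict_pcnsa`, the parameters `n_pca` and `n_null` (standing in for the two main parameters mentioned in the abstract), and the toy data are assumptions for illustration. The thesis's feature encoding and parameter values are not given here.

```python
import numpy as np


def fit_pcnsa(X, y, n_pca, n_null):
    """Fit a PCNSA-style model: PCA, then per-class approximate null spaces."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mean = X.mean(axis=0)
    # PCA via SVD of the centered data; keep the leading n_pca directions.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    W = vt[:n_pca].T
    Z = (X - mean) @ W

    classes = {}
    for label in np.unique(y):
        Zc = Z[y == label]
        mu = Zc.mean(axis=0)
        cov = np.atleast_2d(np.cov(Zc, rowvar=False))
        # eigh returns eigenvalues in ascending order, so the first n_null
        # eigenvectors span the directions of smallest within-class variance.
        _, evecs = np.linalg.eigh(cov)
        classes[label.item()] = (mu, evecs[:, :n_null])
    return mean, W, classes


def predict_pcnsa(model, X):
    """Assign each row to the class with the smallest null-space distance."""
    mean, W, classes = model
    Z = (np.asarray(X, dtype=float) - mean) @ W
    labels = list(classes)
    dists = np.stack(
        [np.linalg.norm((Z - classes[l][0]) @ classes[l][1], axis=1) for l in labels],
        axis=1,
    )
    return [labels[i] for i in dists.argmin(axis=1)]


# Toy data (illustrative only): both classes vary widely along the first
# feature and are separated along the second.
X_toy = [[0, 0], [1, 0.1], [2, -0.1], [3, 0], [0, 5], [1, 5.1], [2, 4.9], [3, 5]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_pcnsa(X_toy, y_toy, n_pca=2, n_null=1)
train_pred = predict_pcnsa(model, X_toy)
far_pred = predict_pcnsa(model, [[10, 0.2]])
```

Because distances are measured only along each class's low-variance directions, a point that lies far out along a high-variance direction (such as `[10, 0.2]` in the toy data) is still assigned to the class it matches in the remaining direction.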