453 research outputs found

    Using signal processing, evolutionary computation, and machine learning to identify transposable elements in genomes

    About half of the human genome consists of transposable elements (TE's), sequences that have many copies of themselves distributed throughout the genome. All genomes, from bacterial to human, contain TE's. TE's affect genome function by either creating proteins directly or affecting genome regulation. They serve as molecular fossils, giving clues to the evolutionary history of the organism. TE's are often challenging to identify because they are fragmentary or heavily mutated. In this thesis, novel features for the detection and study of TE's are developed. These features are of two types. The first type are statistical features based on the Fourier transform used to assess reading frame use. These features measure how different the reading frame use is from that of a random sequence, which reading frames the sequence is using, and the proportion of use of the active reading frames. The second type of feature, called side effect machine (SEM) features, are generated by finite state machines augmented with counters that track the number of times the state is visited. These counters then become features of the sequence. The number of possible SEM features is super-exponential in the number of states. New methods for selecting useful feature subsets that incorporate a genetic algorithm and a novel clustering method are introduced. The features produced reveal structural characteristics of the sequences of potential interest to biologists. A detailed analysis of the genetic algorithm, its fitness functions, and its fitness landscapes is performed. The features are used, together with features used in existing exon finding algorithms, to build classifiers that distinguish TE's from other genomic sequences in humans, fruit flies, and ciliates. The classifiers achieve high accuracy (> 85%) on a variety of TE classification problems. The classifiers are used to scan large genomes for TE's. 
In addition, the features are used to describe the TE's in the newly sequenced ciliate Tetrahymena thermophila, giving biologists information useful for forming experimentally testable hypotheses about the role of these TE's and the mechanisms that govern them.
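The side effect machine (SEM) features described in this abstract can be illustrated with a minimal sketch. It assumes a deterministic finite state machine over the DNA alphabet whose per-state visit counters, normalized by sequence length, form the feature vector. The function name `sem_features`, the normalization choice, and the two-state example machine `GC_MACHINE` are hypothetical and are not taken from the thesis.

```python
from typing import Dict, List, Tuple

# (current state, base) -> next state
Transitions = Dict[Tuple[int, str], int]


def sem_features(sequence: str, transitions: Transitions, n_states: int,
                 start: int = 0, normalize: bool = True) -> List[float]:
    """Run a finite state machine over a sequence and return per-state visit counts.

    A state's counter is incremented each time a transition enters it.
    Symbols with no transition (for example the ambiguous base N) are skipped.
    """
    counts = [0] * n_states
    state = start
    for base in sequence.upper():
        key = (state, base)
        if key not in transitions:
            continue
        state = transitions[key]
        counts[state] += 1
    total = sum(counts)
    if normalize and total > 0:
        return [c / total for c in counts]
    return [float(c) for c in counts]


# Hypothetical 2-state machine: state 0 = last base was A/T, state 1 = last base was G/C.
GC_MACHINE: Transitions = {
    (s, b): (1 if b in "GC" else 0) for s in (0, 1) for b in "ACGT"
}

raw = sem_features("GGCCAT", GC_MACHINE, n_states=2, normalize=False)
frac = sem_features("GGCCAT", GC_MACHINE, n_states=2)
```

With this toy machine the two counters reduce to A/T and G/C content. Machines with more states and less trivial transition tables would yield features sensitive to longer-range sequence structure, which is presumably why the abstract pairs them with a feature-selection step.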

    KRAB zinc finger protein ZNF676 controls the transcriptional influence of LTR12-related endogenous retrovirus sequences.

    BACKGROUND: Transposable element-embedded regulatory sequences (TEeRS) and their KRAB-containing zinc finger protein (KZFP) controllers are increasingly recognized as modulators of gene expression. We aim to characterize the contribution of this system to gene regulation in early human development and germ cells. RESULTS: Here, after studying genes driven by the long terminal repeat (LTR) of endogenous retroviruses, we identify the ape-restricted ZNF676 as the sequence-specific repressor of a subset of contemporary LTR12 integrants responsible for a large fraction of transpochimeric gene transcripts (TcGTs) generated during human early embryogenesis. We go on to reveal that the binding of this KZFP correlates with the epigenetic marking of these TEeRS in the germline, and is crucial to the control of genes involved in ciliogenesis/flagellogenesis, a biological process that dates back to the last common ancestor of eukaryotes. CONCLUSION: These results illustrate how KZFPs and their TE targets contribute to the evolutionary turnover of transcription networks and participate in the transgenerational inheritance of epigenetic traits

    HIV-1 RNA Dimerization At Single Molecule Level

    The Dimerization Initiation Sequence (DIS) is a conserved hairpin-loop motif on the 5' UTR of the HIV-1 genome. It plays an important role in genome dimerization through formation of a kissing complex intermediate between two homologous DIS sequences. This bimolecular kissing complex ultimately leads to the formation of an extended RNA duplex. Understanding the kinetics of this interaction is key to exploiting DIS as a possible drug target against HIV. We report a novel study that contributes to understanding the dimerization mechanism of HIV-1 RNA in vitro. Our work employed single-molecule fluorescence resonance energy transfer to monitor the dimerization of a minimal HIV-1 RNA sequence containing DIS. Most significantly, we observed a previously uncharacterized folding intermediate that plays a critical role in the dimerization mechanism. Our data clearly show that dimerization involves three distinct steps that are in dynamic equilibrium and are regulated by Mg2+ ions. Two of the steps correspond to previously proposed structures: the kissing complex and the extended duplex. Surprisingly, our data reveal a previously unobserved obligatory folding intermediate, consistent with a bent kissing complex conformation, similar to the TAR complex. Mutations of the highly conserved purines flanking the DIS loop destabilize this intermediate, indicating that these purines may play an important role in HIV-1 RNA dimerization in vivo

    Revisiting Plus-Strand DNA Synthesis in Retroviruses and Long Terminal Repeat Retrotransposons: Dynamics of Enzyme: Substrate Interactions

    Although polypurine tract (PPT)-primed initiation of plus-strand DNA synthesis in retroviruses and LTR-containing retrotransposons can be accurately duplicated, the molecular details underlying this concerted series of events remain largely unknown. Importantly, the PPT 3′ terminus must be accommodated by ribonuclease H (RNase H) and DNA polymerase catalytic centers situated at either terminus of the cognate reverse transcriptase (RT), and in the case of the HIV-1 enzyme, ∼70Å apart. Communication between RT and the RNA/DNA hybrid therefore appears necessary to promote these events. The crystal structure of the HIV-1 RT/PPT complex, while informative, positions the RNase H active site several base pairs from the PPT/U3 junction, and thus provides limited information on cleavage specificity. To fill the gap between biochemical and crystallographic approaches, we review a multidisciplinary approach combining chemical probing, mass spectrometry, NMR spectroscopy and single molecule spectroscopy. Our studies also indicate that nonnucleoside RT inhibitors affect enzyme orientation, suggesting initiation of plus-strand DNA synthesis as a potential therapeutic target

    Inhibitory effects of archetypical nucleic acid ligands on the interactions of HIV-1 nucleocapsid protein with elements of Ψ-RNA

    Disrupting the interactions between human immunodeficiency virus type 1 (HIV-1) nucleocapsid (NC) protein and structural elements of the packaging signal (Ψ-RNA) could constitute an ideal strategy to inhibit the functions of this region of the genome leader in the virus life cycle. We have employed electrospray ionization (ESI) Fourier transform mass spectrometry (FTMS) to assess the ability of a series of nucleic acid ligands to bind selected structures of Ψ-RNA and inhibit their specific interactions with NC in vitro. We found that the majority of the ligands included in the study were able to form stable non-covalent complexes with stem–loop 2, 3 and 4 (SL2–4), consistent with their characteristic nucleic acid binding modes. However, only aminoglycosidic antibiotics were capable of dissociating preformed NC•SL3 and NC•SL4 complexes, but not NC•SL2. The apparent specificity of these inhibitory effects is closely dependent on distinctive structural features of the different NC•RNA complexes. The trends observed for the IC50 values correlate very well with those provided by the ligand binding affinities and the dissociation constants of target NC•RNA complexes. This systematic investigation of archetypical nucleic acid ligands provides a valid framework to support the design of novel ligand inhibitors for HIV-1 treatment

    Proteomic Analysis of HIV-Infected Macrophages

    Mononuclear phagocytes (monocytes, macrophages, and microglia) play an important role in innate immunity against pathogens including HIV. These cells are also important viral reservoirs in the central nervous system and secrete inflammatory mediators and toxins that affect the tissue environment and function of surrounding cells. In the era of antiretroviral therapy, there are fewer of these inflammatory mediators. Proteomic approaches including surface enhancement laser desorption ionization, one- and two-dimensional difference in gel electrophoresis, and liquid chromatography tandem mass spectrometry have been used to uncover the proteins produced by in vitro HIV-infected monocytes, macrophages, and microglia. These approaches have advanced the understanding of novel mechanisms for HIV replication and neuronal damage. They have also been used in tissue macrophages that restrict HIV replication to understand the mechanisms of restriction for future therapies. In this review, we summarize the proteomic studies on HIV-infected mononuclear phagocytes and discuss other recent proteomic approaches that are starting to be applied to this field. As proteomic instruments and methods evolve to become more sensitive and quantitative, future studies are likely to identify more proteins that can be targeted for diagnosis or therapy and to uncover novel disease mechanisms

    Investigating Kissing to Duplex Dimer Transition Mechanism of HIV-1 SL1 by NMR.

    As a common feature to retroviruses, human immunodeficiency virus (HIV) packages two identical copies of its single stranded RNA genome (gRNA) that hold together at its 5’-end. gRNA dimerization is initiated by stem loop 1 (SL1), which consists of a highly conserved asymmetric internal loop and a GC-rich self-interacting palindromic apical loop that can drive dimerization by forming a meta-stable kissing dimer. During maturation, the kissing dimer undergoes a transition catalyzed by the viral nucleocapsid protein (NCp) into a thermodynamically more stable duplex. Both dimerization and structural isomerization between the kissing and duplex dimer are critical for viral replication and packaging. While SL1 and NCp have been the focus of many studies, the mechanism of the NCp dependent SL1 dimerization and isomerization remains poorly understood. This dissertation describes the characterization of SL1 structural dynamics, its Mg2+ and NCp binding properties, and the time-course of the kissing to duplex transition using high resolution nuclear magnetic resonance (NMR) spectroscopy. Initial studies were conducted on the conformational properties of the internal loop of SL1. Subsequently, we characterized the corresponding properties in kissing and duplex SL1 dimers along with their interaction with NCp and followed site-specifically the timecourse of the kissing to duplex transition using time-resolved NMR. We observe three types of motions that may promote conversion of the kissing dimer into its duplex form: (i) diffusion-limited nanosecond collective helix motions about the G-AGG internal loop that may help bring strands from distinct monomeric units into a proper register; (ii) a secondary structural switch occurring at μs-ms timescales which leads to partial melting of the upper-helix; and (iii) looping-in-and-out hinge motions of adenine residues in the apical loop that may help bring strands from different monomers into close spatial proximity. 
All three classes of motions are significantly dampened by Mg2+, which likely serves to make the transition strongly dependent on NCp. The NCp interacts with the internal loop and the apical loop of the kissing dimer. Our results suggest that NCp stabilizes an alternative SL1 conformation, likely involving a quadruplex geometry, prior to transitioning in a single step into the duplex dimer. Ph.D., Chemistry, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/61728/1/yanxsun_1.pd

    Analysis of retroviral assembly and maturation using cryo-electron tomography

    Retroviruses are a family of membrane-enveloped RNA viruses that can retrotranscribe and integrate their genome in the host cell chromatin. During the active production of virions the structural polyprotein Gag assembles together with the genomic RNA to form immature particles, which bud out from the host cell. After budding the Gag protein undergoes proteolytic maturation and is cleaved into MA, CA, NC, p6 and two spacer peptides, SP1 and SP2. This leads to dramatic changes in the core morphology and the gain of infectivity. The immature retro-virions are known to have Gag organized into a round, incomplete hexameric lattice with a spacing of ~7.5nm. The core of mature virions is organized into a mixture of hexamers and pentamers which are organized along a lattice with a spacing of ~9.6nm. The shape of the core in the mature virions is genus-dependent, but can be cylindrical, conical or round. During my PhD I have studied the immature Gag assembly across four retroviral genera in order to understand the structural requirements for the assembly of the immature retroviral lattice, and to shed more light on the principles of HIV-1 maturation. The major conclusions of my studies are the following: The CA region of Gag is the most structurally conserved across genera. The presence of a domain upstream of CA is not critical for the assembly although it stabilizes the lattice. In order to maintain an immature lattice it is important to have a Gag multimerization domain downstream of CA. The region between CA and NC, which is highly variable, is not critical for the assembly but it can stabilise the lattice and therefore affect the structural changes that occur during the maturation. The maturation in retroviruses is an extremely fast process. In order to investigate the structural changes occurring during the maturation in HIV-1 I analysed the products of partial Gag maturation, which were obtained through selective mutations of the cleavage sites in Gag. 
This confirmed that the order in which Gag cleavages occur is important for correct processing. The immature Gag lattice is destabilized only if both sides of the CA-SP1 region are cleaved. Furthermore, it showed that the condensation of the RNP has an effect on the core morphology in the mature virion

    Modeling Distributions of Chromosomal Modifications Using Chromosomal Features

    University of Minnesota Ph.D. dissertation. February 2012. Major: Biomedical Informatics and Computational Biology. Advisors: Chad Myers, Daniel Voytas. 1 computer file (PDF); viii, 136 pages. Chromatin plays a major role in the regulation and evolution of genomic DNA. The advent of high-throughput sequencing, and the subsequently increasing availability of sequencing data from chromatin immunoprecipitation experiments, is leading to a comprehensive view of the chromatin landscape in key model organisms such as S. cerevisiae. To date, little has been done to exploit the availability of such data. My work develops a logistic regression based framework capable of dissecting the observed distribution of a particular chromosomal modification. This framework models the observed distribution in terms of other known chromosomal features in the organism. I have applied this approach to the distributions of Ty5 and Ty1 retrotransposons, identifying previously unknown integration patterns. For Ty5, I identified integration, independent of the canonical mechanism, at sites of open DNA. For Ty1, I identified precise integration events on a single surface of nucleosomes found near Polymerase III transcribed genes. Additionally, a similar logistic regression approach was developed to predict origins of replication in terms of nucleosome patterning. This resulted in a 200-fold enrichment for origin sites and over 7000-fold enrichment when ORC occupancy data was considered. Together these studies present a general model capable of utilizing the available chromosomal data to provide either mechanistic models or site predictions in a variety of organisms
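As a rough illustration of the kind of logistic regression framework this abstract describes, the sketch below fits a model that predicts whether a genomic site carries a modification from binary chromosomal features. Everything specific here is an assumption made for illustration: the feature names (open chromatin, proximity to a Pol III gene), the toy data, the plain gradient-descent fit, and the function names `fit_logistic` and `predict_proba`. None of it is the dissertation's actual model or data.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w: Sequence[float], x: Sequence[float]) -> float:
    """Probability of the positive class; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X: List[Sequence[float]], y: Sequence[int],
                 lr: float = 0.5, epochs: int = 2000) -> List[float]:
    """Fit weights by batch gradient descent on the mean log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


# Hypothetical toy sites: [open_chromatin, near_pol3_gene] -> integration observed (1) or not (0).
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 0]]
y = [1, 1, 0, 0, 1, 0]

weights = fit_logistic(X, y)
p_open = predict_proba(weights, [1, 0])
p_closed = predict_proba(weights, [0, 0])
```

In a model of this form, the sign and size of each fitted weight indicate how strongly a chromosomal feature is associated with the modification, which is what makes the approach usable for mechanistic interpretation as well as site prediction.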