112 research outputs found

    Telomere disruption results in non-random formation of de novo dicentric chromosomes involving acrocentric human chromosomes

    Copyright: © 2010 Stimpson et al. Genome rearrangement often produces chromosomes with two centromeres (dicentrics) that are inherently unstable because of bridge formation and breakage during cell division. However, mammalian dicentrics, and particularly those in humans, can be quite stable, usually because one centromere is functionally silenced. Molecular mechanisms of centromere inactivation are poorly understood since there are few systems to experimentally create dicentric human chromosomes. Here, we describe a human cell culture model that enriches for de novo dicentrics. We demonstrate that transient disruption of human telomere structure non-randomly produces dicentric fusions involving acrocentric chromosomes. The induced dicentrics vary in structure near fusion breakpoints and, like naturally occurring dicentrics, exhibit various inter-centromeric distances. Many functional dicentrics persist for months after formation. Even those with distantly spaced centromeres remain functionally dicentric for 20 cell generations. Other dicentrics within the population reflect centromere inactivation. In some cases, centromere inactivation occurs by an apparently epigenetic mechanism. In other dicentrics, the size of the alpha-satellite DNA array associated with CENP-A is reduced compared to the same array before dicentric formation. Extrachromosomal fragments that contained CENP-A often appear in the same cells as dicentrics. Some of these fragments are derived from the same alpha-satellite DNA array as inactivated centromeres. Our results indicate that dicentric human chromosomes undergo alternative fates after formation. Many retain two active centromeres and are stable through multiple cell divisions. Others undergo centromere inactivation. This event occurs within a broad temporal window and can involve deletion of chromatin that marks the locus as a site for CENP-A maintenance/replenishment. This work was supported by the Tumorzentrum Heidelberg/Mannheim grant (D.10026941) and by March of Dimes Research Foundation grant #1-FY06-377 and NIH R01 GM069514.

    A mathematical and computational review of Hartree-Fock SCF methods in Quantum Chemistry

    We present here a review of the fundamental topics of Hartree-Fock theory in Quantum Chemistry. From the molecular Hamiltonian, using and discussing the Born-Oppenheimer approximation, we arrive at the Hartree and Hartree-Fock equations for the electronic problem. Special emphasis is placed on the most relevant mathematical aspects of the theoretical derivation of the final equations, as well as on the results regarding the existence and uniqueness of their solutions. All Hartree-Fock versions with different spin restrictions are systematically extracted from the general case, thus providing a unifying framework. Then, the discretization of the one-electron orbital space is reviewed and the Roothaan-Hall formalism introduced. This leads to an exposition of the basic underlying concepts related to the construction and selection of Gaussian basis sets, focusing on algorithmic efficiency issues. Finally, we close the review with a section in which the most relevant modern developments (especially those related to the design of linear-scaling methods) are discussed and linked to the issues covered earlier. The whole work is intentionally introductory and rather self-contained, so that it may be useful for non-experts who aim to use quantum chemical methods in interdisciplinary applications. Moreover, much material that is found scattered in the literature has been put together here to facilitate comprehension and to serve as a handy reference. Comment: 64 pages, 3 figures, tMPH2e.cls style file, doublesp, mathbbol and subeqn package

    Protecting a transgene expression from the HAC-based vector by different chromatin insulators

    Human artificial chromosomes (HACs) are vectors that offer advantages of capacity and stability for gene delivery and expression. Several studies have even demonstrated their use for gene complementation in gene-deficient recipient cell lines and animal transgenesis. Recently, we constructed an advanced HAC-based vector, alphoid(tetO)-HAC, with a conditional centromere. In this HAC, a gene-loading site was inserted into a centrochromatin domain critical for kinetochore assembly and maintenance. While by definition this domain is permissive for transcription, there have been no long-term studies on transgene expression within centrochromatin. In this study, we compared the effects of three chromatin insulators, cHS4, gamma-satellite DNA, and tDNA, on the expression of an EGFP transgene inserted into the alphoid(tetO)-HAC vector. Insulator function was essential for stable expression of the transgene in centrochromatin. In two analyzed host cell lines, a tDNA insulator composed of two functional copies of tRNA genes showed the highest barrier activity. We infer that proximity to centrochromatin does not protect genes lacking chromatin insulators from epigenetic silencing. Barrier elements that prevent gene silencing in centrochromatin would thus help to optimize transgenesis using HAC vectors. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00018-013-1362-9) contains supplementary material, which is available to authorized users.

    HemeBIND: a novel method for heme binding residue prediction by combining structural and sequence information

    Background: Accurate prediction of binding residues involved in the interactions between proteins and small ligands is one of the major challenges in structural bioinformatics. Heme is an essential and commonly used ligand that plays critical roles in electron transfer, catalysis, signal transduction and gene expression. Although much effort has been devoted to the development of various generic algorithms for ligand binding site prediction over the last decade, no algorithm has been specifically designed to complement experimental techniques for identification of heme binding residues. Consequently, an urgent need is to develop a computational method for recognizing these important residues.
    Results: Here we introduced an efficient algorithm HemeBIND for predicting heme binding residues by integrating structural and sequence information. We systematically investigated the characteristics of binding interfaces based on a non-redundant dataset of heme-protein complexes. It was found that several sequence and structural attributes such as evolutionary conservation, solvent accessibility, depth and protrusion clearly illustrate the differences between heme binding and non-binding residues. These features can then be separately used or combined to build the structure-based classifiers using support vector machine (SVM). The results showed that the information contained in these features is largely complementary and their combination achieved the best performance. To further improve the performance, an attempt has been made to develop a post-processing procedure to reduce the number of false positives. In addition, we built a sequence-based classifier based on SVM and sequence profile as an alternative when only sequence information can be used. Finally, we employed a voting method to combine the outputs of structure-based and sequence-based classifiers, which demonstrated remarkably better performance than the individual classifier alone.
    Conclusions: HemeBIND is the first specialized algorithm used to predict binding residues in protein structures for heme ligands. Extensive experiments indicated that both the structure-based and sequence-based methods have effectively identified heme binding residues while the complementary relationship between them can result in a significant improvement in prediction performance. The value of our method is highlighted through the development of HemeBIND web server that is freely accessible at http://mleg.cse.sc.edu/hemeBIND/.

    Volume-based solvation models out-perform area-based models in combined studies of wild-type and mutated protein-protein interfaces

    Background: Empirical binding models have previously been investigated for the energetics of protein complexation (ΔG models) and for the influence of mutations on complexation (i.e. differences between wild-type and mutant complexes, ΔΔG models). We construct binding models to directly compare these processes, which have generally been studied separately.
    Results: Although reasonable fit models were found for both ΔG and ΔΔG cases, they differ substantially. In a dataset curated for the absence of mainchain rearrangement upon binding, non-polar area burial is a major determinant of ΔG models. However, this ΔG model does not fit well to the data for binding differences upon mutation. Burial of non-polar area is weighted down in fitting of ΔΔG models. These calculations were made with no repacking of sidechains upon complexation, and only minimal packing upon mutation. We investigated the consequences of more extensive packing changes with a modified mean-field packing scheme. Rather than emphasising solvent exposure with relatively extended sidechains, rotamers are selected that exhibit maximal packing with protein. This provides solvent accessible areas for proteins that are much closer to those of experimental structures than the more extended sidechain regime. The new packing scheme increases changes in non-polar burial for mutants compared to wild-type proteins, but does not substantially improve agreement between ΔG and ΔΔG binding models.
    Conclusion: We conclude that solvent accessible area, based on modelled mutant structures, is a poor correlate for ΔΔG upon mutation. A simple volume-based, rather than solvent accessibility-based, model is constructed for ΔG and ΔΔG systems. This shows a more consistent behaviour. We discuss the efficacy of volume, as opposed to area, approaches to describe the energetic consequences of mutations at interfaces. This knowledge can be used to develop simple computational screens for binding in comparative modelled interfaces.

    Large Tandem, Higher Order Repeats and Regularly Dispersed Repeat Units Contribute Substantially to Divergence Between Human and Chimpanzee Y Chromosomes

    Comparison of human and chimpanzee genomes has received much attention because of its paramount role in understanding the evolutionary step distinguishing us from our closest living relative. To contribute insight into Y chromosome evolutionary history, we study and compare tandem repeats, higher order repeats (HORs), and regularly dispersed repeats in human and chimpanzee Y chromosome contigs, using the robust Global Repeat Map algorithm. We find a new type of long-range acceleration, human-accelerated HOR regions. In peripheral domains of 35mer human alphoid HORs, we find riddled features with ten additional repeat monomers. In chimpanzee, we identify a 30mer alphoid HOR. We construct alphoid HOR schemes showing significant human-chimpanzee differences, revealing rapid evolution after human-chimpanzee separation. We identify and analyze over 20 large repeat units, most of them reported here for the first time: chimpanzee and human ~1.6 kb 3mer secondary repeat unit (SRU) and ~23.5 kb tertiary repeat unit (~0.55 kb primary repeat unit, PRU); human 10848, 15775, 20309, 60910, and 72140 bp PRUs; human 3mer SRU (~2.4 kb PRU); 715mer and 1123mer SRUs (5mer PRU); chimpanzee 5096, 10762, 10853, 60523 bp PRUs; and chimpanzee 64624 bp SRU (10853 bp PRU). We show that substantial human-chimpanzee differences are concentrated in large repeat structures, at the level of as much as ~70% divergence, sizably exceeding previous numerical estimates for some selected noncoding sequences. Smeared over the whole sequenced assembly (25 Mb), this gives ~14% human-chimpanzee divergence. This is a significantly higher estimate of divergence between human and chimpanzee than previous estimates. Comment: 22 pages, 7 figures, 12 tables. Published in Journal of Molecular Evolution

    Using Shifts in Amino Acid Frequency and Substitution Rate to Identify Latent Structural Characters in Base-Excision Repair Enzymes

    Protein evolution includes the birth and death of structural motifs. For example, a zinc finger or a salt bridge may be present in some, but not all, members of a protein family. We propose that such transitions are manifest in sequence phylogenies as concerted shifts in substitution rates of amino acids that are neighbors in a representative structure. First, we identified rate shifts in a quartet from the Fpg/Nei family of base excision repair enzymes using a method developed by Xun Gu and coworkers. We found the shifts to be spatially correlated, more precisely, associated with a flexible loop involved in bacterial Fpg substrate specificity. Consistent with our result, sequences and structures provide convincing evidence that this loop plays a very different role in other family members. Second, then, we developed a method for identifying latent protein structural characters (LSC) given a set of homologous sequences based on Gu's method and proximity in a high-resolution structure. Third, we identified LSC and assigned states of LSC to clades within the Fpg/Nei family of base excision repair enzymes. We describe seven LSC; an accompanying Proteopedia page (http://proteopedia.org/wiki/index.php/Fpg_Nei_Protein_Family) describes these in greater detail and facilitates 3D viewing. The LSC we found provided a surprisingly complete picture of the interaction of the protein with the DNA capturing familiar examples, such as a Zn finger, as well as more subtle interactions. Their preponderance is consistent with an important role as phylogenetic characters. Phylogenetic inference based on LSC provided convincing evidence of independent losses of Zn fingers. Structural motifs may serve as important phylogenetic characters and modeling transitions involving structural motifs may provide a much deeper understanding of protein evolution

    Tandemly repeated DNA families in the mouse genome

    Background: Functional and morphological studies of tandem DNA repeats, which make up a large portion of most genomes, are limited mainly by the incomplete characterization of these genome elements. We report here a genome-wide analysis of the large tandem repeats (TR) found in the mouse genome assemblies.
    Results: Using a bioinformatics approach, we identified large TR with array size more than 3 kb in two mouse whole genome shotgun (WGS) assemblies. Large TR were classified based on sequence similarity, chromosome position, monomer length, array variability, and GC content; we identified four superfamilies, eight families, and 62 subfamilies, including 60 not previously described. 1) The superfamily of centromeric minor satellite is only found in the unassembled part of the reference genome. 2) The pericentromeric major satellite is the most abundant superfamily and reveals high order repeat structure. 3) The transposable-element-related superfamily contains two families. 4) The superfamily of heterogeneous tandem repeats includes four families. One family is found only in the WGS, while two families represent tandem repeats with either single or multi locus location. Despite its multi locus location, TRPC-21A-MM is placed into a separate family due to its abundance, strictly pericentromeric location, and resemblance to big human satellites. To confirm our data, we next performed in situ hybridization with three repeats from distinct families. The TRPC-21A-MM probe hybridized to chromosomes 3 and 17, the multi locus TR-22A-MM probe hybridized to ten chromosomes, and the single locus TR-54B-MM probe hybridized with the long loops that emerge from chromosome ends. In addition to the chromosomes predicted in silico, several additional chromosomes were positive for TR by in situ analysis, potentially indicating inaccurate genome assembly of the heterochromatic genome regions.
    Conclusions: Chromosome-specific TR had been predicted for mouse, but no reliable cytogenetic probes were available before. We report a new analysis that identified in silico and confirmed in situ the 3/17 chromosome-specific probe TRPC-21A-MM. Thus, the new classification has proven to be a useful tool for the continuation of genome study, while annotated TR can be a valuable source of cytogenetic probes for chromosome recognition.

    Rapid Sampling of Molecular Motions with Prior Information Constraints

    Proteins are active, flexible machines that perform a range of different functions. Innovative experimental approaches may now provide limited partial information about conformational changes along motion pathways of proteins. There is therefore a need for computational approaches that can efficiently incorporate prior information into motion prediction schemes. In this paper, we present PathRover, a general setup designed for the integration of prior information into the motion planning algorithm of rapidly exploring random trees (RRT). Each suggested motion pathway comprises a sequence of low-energy clash-free conformations that satisfy an arbitrary number of prior information constraints. These constraints can be derived from experimental data or from expert intuition about the motion. The incorporation of prior information is very straightforward and significantly narrows down the vast search in the typically high-dimensional conformational space, leading to dramatic reduction in running time. To allow the use of state-of-the-art energy functions and conformational sampling, we have integrated this framework into Rosetta, an accurate protocol for diverse types of structural modeling. The suggested framework can serve as an effective complementary tool for molecular dynamics, Normal Mode Analysis, and other prevalent techniques for predicting motion in proteins. We applied our framework to three different model systems. We show that a limited set of experimentally motivated constraints may effectively bias the simulations toward diverse predicates in an outright fashion, from distance constraints to enforcement of loop closure. In particular, our analysis sheds light on mechanisms of protein domain swapping and on the role of different residues in the motion
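
The idea of steering a rapidly exploring random tree (RRT) with prior-information constraints can be illustrated with a toy sketch. This is not PathRover or Rosetta code: the 2D "conformation" space, the `in_corridor` constraint, and all parameter values are assumptions chosen only for illustration. Each candidate node is accepted only if it satisfies the constraint predicate, which prunes the search space in the way the abstract describes.

```python
import math
import random


def in_corridor(point):
    """Hypothetical prior-information constraint: stay in the unit square
    and within a band around the diagonal (a stand-in for, e.g., a distance
    restraint derived from experiment)."""
    x, y = point
    return 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0 and abs(y - x) < 0.3


def rrt(start, goal, is_allowed, step=0.05, goal_tol=0.05,
        max_iters=5000, goal_bias=0.1, seed=0):
    """Grow a tree from `start`; return (path or None, all tree nodes)."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}  # node index -> parent index
    for _ in range(max_iters):
        # Sample a random target, occasionally biased toward the goal.
        if rng.random() < goal_bias:
            target = goal
        else:
            target = (rng.random(), rng.random())
        # Find the existing node nearest to the sampled target.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], target))
        near = nodes[i_near]
        d = math.dist(near, target)
        if d == 0.0:
            continue
        # Move at most `step` from the nearest node toward the target.
        scale = min(step, d) / d
        new = (near[0] + (target[0] - near[0]) * scale,
               near[1] + (target[1] - near[1]) * scale)
        # Reject candidates that violate the prior-information constraint.
        if not is_allowed(new):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        if math.dist(new, goal) <= goal_tol:
            # Walk parent links back to the root to recover the pathway.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1], nodes
    return None, nodes
```

Calling `rrt((0.1, 0.1), (0.9, 0.9), in_corridor)` returns a pathway (or `None` if the iteration budget runs out) together with every node in the tree. In a real protein setting the nodes would be conformations, the distance would be a conformational metric, and acceptance would also involve energy and clash checks.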