Structure and mechanism of human DNA polymerase η
The variant form of the human syndrome xeroderma pigmentosum (XPV) is caused by a deficiency in DNA polymerase eta (Pol eta), a DNA polymerase that enables replication through ultraviolet-induced pyrimidine dimers. Here we report high-resolution crystal structures of human Pol eta at four consecutive steps during DNA synthesis through cis-syn cyclobutane thymine dimers. Pol eta acts like a 'molecular splint' to stabilize damaged DNA in a normal B-form conformation. An enlarged active site accommodates the thymine dimer with excellent stereochemistry for two-metal ion catalysis. Two residues conserved among Pol eta orthologues form specific hydrogen bonds with the lesion and the incoming nucleotide to assist translesion synthesis. On the basis of the structures, eight Pol eta missense mutations causing XPV can be rationalized as undermining the molecular splint or perturbing the active-site alignment. The structures also provide an insight into the role of Pol eta in replicating through D loop and DNA fragile sites.
Structure and evolutionary origin of Ca2+-dependent herring type II antifreeze protein
doi:10.1371/journal.pone.0000548, PLoS ONE 2(6)
The genome of the sea urchin Strongylocentrotus purpuratus
We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes.
Analysis of nucleoside-binding proteins by ligand-specific elution from dye resin: application to Mycobacterium tuberculosis aldehyde dehydrogenases
We show that Cibacron Blue F3GA dye resin chromatography can be used to identify ligands that specifically interact with proteins from Mycobacterium tuberculosis, and that the identification of these ligands can facilitate structure determination by enhancing the quality of crystals. Four native Mtb proteins of the aldehyde dehydrogenase (ALDH) family were previously shown to be specifically eluted from a Cibacron Blue F3GA dye resin with nucleosides. In this study we characterized the nucleoside-binding specificity of one of these ALDH isozymes (recombinant Mtb Rv0223c) and compared these biochemical results with co-crystallization experiments with different Rv0223c-nucleoside pairings. We found that the strongly interacting ligands (NAD and NADH) aided formation of high-quality crystals, permitting solution of the first Mtb ALDH (Rv0223c) structure. Other nucleoside ligands (AMP, FAD, adenosine, GTP and NADP) exhibited weaker binding to Rv0223c, and produced co-crystals diffracting to lower resolution. Difference electron density maps based on crystals of Rv0223c with various nucleoside ligands show that most share the binding site where the natural ligand NAD binds. From the high degree of similarity of sequence and structure compared to human mitochondrial ALDH-2 (BLAST Z-score = 53.5 and RMSD = 1.5 Å), Rv0223c appears to belong to the ALDH-2 class. An altered oligomerization domain in the Rv0223c structure seems to keep this protein as a monomer, whereas native human ALDH-2 is a multimer.
Towards a comprehensive structural coverage of completed genomes: a structural genomics viewpoint
BACKGROUND: Structural genomics initiatives were established with the aim of solving protein structures on a large scale. For many initiatives, such as the Protein Structure Initiative (PSI), target selection is focussed primarily on structurally characterising protein families which, so far, lack a structural representative. It is therefore of considerable interest to gain insights into the number and distribution of these families, and what efforts may be required to achieve a comprehensive structural coverage across all protein families. RESULTS: In this analysis we have derived a comprehensive domain annotation of the genomes using CATH, Pfam-A and Newfam domain families. We consider what proportions of structurally uncharacterised families are accessible to high-throughput structural genomics pipelines, specifically those targeting families containing multiple prokaryotic orthologues. In measuring the domain coverage of the genomes, we show the benefits of selecting targets from structurally uncharacterised domain families whilst also pursuing additional targets from large, structurally characterised protein superfamilies. CONCLUSION: This work suggests that such a combined approach to target selection is essential if structural genomics is to achieve a comprehensive structural coverage of the genomes, leading to greater insights into structure and the mechanisms that underlie protein evolution.
Bioinformatics and Structural Characterization of a Hypothetical Protein from Streptococcus mutans: Implication of Antibiotic Resistance
As an oral bacterial pathogen, Streptococcus mutans is known as the aetiologic agent of human dental caries. Among a total of 1960 identified proteins within the genome of this organism, there are about 500 without any known functions. One of these proteins, SMU.440, has very few homologs in the current protein databases and it does not fall into any protein functional families. Phylogenetic studies showed that SMU.440 is related to a particular ecological niche and conserved specifically in some oral pathogens, due to lateral gene transfer. The co-occurrence of a MarR protein within the same operon among these oral pathogens suggests that SMU.440 may be associated with antibiotic resistance. The structure determination of SMU.440 revealed that it shares the same fold and a similar pocket as polyketide cyclases, which indicated that it is very likely to bind some polyketide-like molecules. From the interlinking structural and bioinformatics studies, we have concluded that SMU.440 could be involved in polyketide-like antibiotic resistance, providing a better understanding of this hypothetical protein. In addition, the combination of multiple methods in this study can be used as a general approach for functional studies of a protein with unknown function.
Haplotype association analyses in resources of mixed structure using Monte Carlo testing
BACKGROUND: Genomewide association studies have resulted in a great many genomic regions that are likely to harbor disease genes. Thorough interrogation of these specific regions is the logical next step, including regional haplotype studies to identify risk haplotypes upon which the underlying critical variants lie. Pedigrees ascertained for disease can be powerful for genetic analysis due to the cases being enriched for genetic disease. Here we present a Monte Carlo based method to perform haplotype association analysis. Our method, hapMC, allows for the analysis of full-length and sub-haplotypes, including imputation of missing data, in resources of nuclear families, general pedigrees, case-control data or mixtures thereof. Both traditional association statistics and transmission/disequilibrium statistics can be performed. The method includes a phasing algorithm that can be used in large pedigrees and optional use of pseudocontrols. RESULTS: Our new phasing algorithm substantially outperformed the standard expectation-maximization algorithm that is ignorant of pedigree structure, and hence is preferable for resources that include pedigree structure. Through simulation we show that our Monte Carlo procedure maintains the correct type 1 error rates for all resource types. Power comparisons suggest that transmission-disequilibrium statistics are superior for performing association in resources of only nuclear families. For mixed structure resources, however, the newly implemented pseudocontrol approach appears to be the best choice. Results also indicated the value of large high-risk pedigrees for association analysis, which, in the simulations considered, were comparable in power to case-control resources of the same sample size. CONCLUSIONS: We propose hapMC as a valuable new tool to perform haplotype association analyses, particularly for resources of mixed structure. The availability of meta-association and haplotype-mining modules in our suite of Monte Carlo haplotype procedures adds further value to the approach.
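The general idea of Monte Carlo significance testing mentioned in this abstract can be illustrated with a much simpler stand-in. The sketch below is not hapMC and does not handle pedigrees, phasing, or imputation; it assumes unrelated case-control subjects and swaps in a plain label-permutation null. All function names (`chi_square_2x2`, `association_stat`, `monte_carlo_p_value`) are hypothetical and chosen only for illustration.

```python
import random


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence either way
    return n * (a * d - b * c) ** 2 / denom


def association_stat(carrier, is_case):
    """Chi-square for haplotype carrier status versus case status."""
    a = sum(1 for h, y in zip(carrier, is_case) if h and y)
    b = sum(1 for h, y in zip(carrier, is_case) if h and not y)
    c = sum(1 for h, y in zip(carrier, is_case) if not h and y)
    d = sum(1 for h, y in zip(carrier, is_case) if not h and not y)
    return chi_square_2x2(a, b, c, d)


def monte_carlo_p_value(carrier, is_case, n_sim=1000, seed=0):
    """Empirical p-value: share of shuffled data sets at least as extreme.

    Assumption: shuffling case labels is a valid null only for unrelated
    subjects; related individuals would need a pedigree-aware simulation.
    """
    rng = random.Random(seed)
    observed = association_stat(carrier, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_sim):
        rng.shuffle(labels)
        if association_stat(carrier, labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_sim + 1)
```

Calling `monte_carlo_p_value(carrier, is_case)` with two equal-length lists of booleans returns the empirical p-value; a larger `n_sim` gives a finer-grained estimate at higher cost.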
Oestrogen receptor α gene haplotype and postmenopausal breast cancer risk: a case control study
INTRODUCTION: Oestrogen receptor α, which mediates the effect of oestrogen in target tissues, is genetically polymorphic. Because breast cancer development is dependent on oestrogenic influence, we have investigated whether polymorphisms in the oestrogen receptor α gene (ESR1) are associated with breast cancer risk. METHODS: We genotyped breast cancer cases and age-matched population controls for one microsatellite marker and four single-nucleotide polymorphisms (SNPs) in ESR1. The numbers of genotyped cases and controls for each marker were as follows: TA(n), 1514 cases and 1514 controls; c.454-397C → T, 1557 cases and 1512 controls; c.454-351A → G, 1556 cases and 1512 controls; c.729C → T, 1562 cases and 1513 controls; c.975C → G, 1562 cases and 1513 controls. Using logistic regression models, we calculated odds ratios (ORs) and 95% confidence intervals (CIs). Haplotype effects were estimated in an exploratory analysis, using expectation-maximisation algorithms for case-control study data. RESULTS: There were no compelling associations between single polymorphic loci and breast cancer risk. In haplotype analyses, a common haplotype of the c.454-351A → G or c.454-397C → T and c.975C → G SNPs appeared to be associated with an increased risk for ductal breast cancer: compared with no copies, one copy of the c.454-351A → G and c.975C → G haplotype entailed an OR of 1.19 (95% CI 1.06–1.33) and two copies an OR of 1.42 (95% CI 1.15–1.77), under a model of multiplicative penetrance. The association with the c.454-397C → T and c.975C → G haplotypes was similar. Our data indicated that these haplotypes were more influential in women with a high body mass index. Adjustment for multiple comparisons rendered the associations statistically non-significant. CONCLUSION: We found suggestions of an association between common haplotypes in ESR1 and the risk for ductal breast cancer that is stronger in heavy women.
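The phrase "multiplicative penetrance" in this abstract can be checked with a line of arithmetic: under such a model each additional haplotype copy multiplies the odds by the same factor, so the two-copy OR is the one-copy OR squared. The sketch below only reproduces that point estimate; the function name `genotype_odds_ratio` is hypothetical, and the confidence intervals are not derived here.

```python
def genotype_odds_ratio(per_copy_or, copies):
    """OR relative to zero copies when each copy multiplies the odds equally."""
    return per_copy_or ** copies


# One-copy OR reported in the abstract; the two-copy value follows from the model.
one_copy = genotype_odds_ratio(1.19, 1)
two_copies = genotype_odds_ratio(1.19, 2)
print(round(one_copy, 2), round(two_copies, 2))
```

Rounded to two decimals, the two-copy value agrees with the 1.42 reported for carriers of two copies.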
Dimerization of FIR upon FUSE DNA binding suggests a mechanism of c-myc inhibition
c-myc is essential for cell homeostasis and growth but lethal if improperly regulated. Transcription of this oncogene is governed by the counterbalancing forces of two proteins on TFIIH: the FUSE binding protein (FBP) and the FBP-interacting repressor (FIR). FBP and FIR recognize single-stranded DNA upstream of the P1 promoter, known as FUSE, and influence transcription by oppositely regulating TFIIH at the promoter site. Size exclusion chromatography coupled with light scattering reveals that an FIR dimer binds one molecule of single-stranded DNA. The crystal structure confirms that FIR binds FUSE as a dimer, and only the N-terminal RRM domain participates in nucleic acid recognition. Site-directed mutations of conserved residues in the first RRM domain reduce FIR's affinity for FUSE, while analogous mutations in the second RRM domain either destabilize the protein or have no effect on DNA binding. Oppositely oriented DNA on parallel binding sites of the FIR dimer results in spooling of a single strand of bound DNA, and suggests a mechanism for c-myc transcriptional control.
- …
