    Phylogeny of Echinoderm Hemoglobins

    Recent genomic information has revealed that neuroglobin and cytoglobin are the two principal lineages of vertebrate hemoglobins, with the latter encompassing the familiar myoglobin and α-globin/β-globin tetramer hemoglobin, and several minor groups. In contrast, very little is known about hemoglobins in echinoderms, a phylum of exclusively marine organisms closely related to vertebrates, beyond the presence of coelomic hemoglobins in sea cucumbers and brittle stars. We identified about 50 hemoglobins in sea urchin, starfish and sea cucumber genomes and transcriptomes, and used Bayesian inference to carry out a molecular phylogenetic analysis of their relationship to vertebrate sequences, specifically, to assess the hypothesis that the neuroglobin and cytoglobin lineages are also present in echinoderms. The genome of the sea urchin Strongylocentrotus purpuratus encodes several hemoglobins, including a unique chimeric 14-domain globin, 2 androglobin isoforms and a unique single androglobin domain protein. Other strongylocentrotid genomes appear to have similar repertoires of globin genes. We carried out molecular phylogenetic analyses of 52 hemoglobins identified in sea urchin, brittle star and sea cucumber genomes and transcriptomes, using different multiple sequence alignment methods coupled with Bayesian and maximum likelihood approaches. The results demonstrate that there are two major globin lineages in echinoderms, which are related to the vertebrate neuroglobin and cytoglobin lineages. Furthermore, the brittle star and sea cucumber coelomic hemoglobins appear to have evolved independently from the cytoglobin lineage, similar to the evolution of erythroid oxygen binding globins in cyclostomes and vertebrates. The presence of echinoderm globins related to the vertebrate neuroglobin and cytoglobin lineages suggests that the split between neuroglobins and cytoglobins occurred in the deuterostome ancestor shared by echinoderms and vertebrates.

    Analysis of nucleoside-binding proteins by ligand-specific elution from dye resin: application to Mycobacterium tuberculosis aldehyde dehydrogenases

    We show that Cibacron Blue F3GA dye resin chromatography can be used to identify ligands that specifically interact with proteins from Mycobacterium tuberculosis, and that the identification of these ligands can facilitate structure determination by enhancing the quality of crystals. Four native Mtb proteins of the aldehyde dehydrogenase (ALDH) family were previously shown to be specifically eluted from a Cibacron Blue F3GA dye resin with nucleosides. In this study we characterized the nucleoside-binding specificity of one of these ALDH isozymes (recombinant Mtb Rv0223c) and compared these biochemical results with co-crystallization experiments with different Rv0223c-nucleoside pairings. We found that the strongly interacting ligands (NAD and NADH) aided formation of high-quality crystals, permitting solution of the first Mtb ALDH (Rv0223c) structure. Other nucleoside ligands (AMP, FAD, adenosine, GTP and NADP) exhibited weaker binding to Rv0223c, and produced co-crystals diffracting to lower resolution. Difference electron density maps based on crystals of Rv0223c with various nucleoside ligands show that most share the binding site where the natural ligand NAD binds. From the high degree of similarity of sequence and structure compared to human mitochondrial ALDH-2 (BLAST Z-score = 53.5 and RMSD = 1.5 Å), Rv0223c appears to belong to the ALDH-2 class. An altered oligomerization domain in the Rv0223c structure seems to keep this protein as a monomer, whereas native human ALDH-2 is a multimer.

    Structure and mechanism of human DNA polymerase η

    The variant form of the human syndrome xeroderma pigmentosum (XPV) is caused by a deficiency in DNA polymerase eta (Pol eta), a DNA polymerase that enables replication through ultraviolet-induced pyrimidine dimers. Here we report high-resolution crystal structures of human Pol eta at four consecutive steps during DNA synthesis through cis-syn cyclobutane thymine dimers. Pol eta acts like a 'molecular splint' to stabilize damaged DNA in a normal B-form conformation. An enlarged active site accommodates the thymine dimer with excellent stereochemistry for two-metal ion catalysis. Two residues conserved among Pol eta orthologues form specific hydrogen bonds with the lesion and the incoming nucleotide to assist translesion synthesis. On the basis of the structures, eight Pol eta missense mutations causing XPV can be rationalized as undermining the molecular splint or perturbing the active-site alignment. The structures also provide an insight into the role of Pol eta in replicating through D loop and DNA fragile sites.

    Extreme genetic fragility of the HIV-1 capsid

    Genetic robustness, or fragility, is defined as the ability, or lack thereof, of a biological entity to maintain function in the face of mutations. Viruses that replicate via RNA intermediates exhibit high mutation rates, and robustness should be particularly advantageous to them. The capsid (CA) domain of the HIV-1 Gag protein is under strong pressure to conserve functional roles in viral assembly, maturation, uncoating, and nuclear import. However, CA is also under strong immunological pressure to diversify. Therefore, it would be particularly advantageous for CA to evolve genetic robustness. To measure the genetic robustness of HIV-1 CA, we generated a library of single amino acid substitution mutants, encompassing almost half the residues in CA. Strikingly, we found HIV-1 CA to be the most genetically fragile protein that has been analyzed using such an approach, with 70% of mutations yielding replication-defective viruses. Although CA participates in several steps in HIV-1 replication, analysis of conditionally (temperature sensitive) and constitutively non-viable mutants revealed that the biological basis for its genetic fragility was primarily the need to coordinate the accurate and efficient assembly of mature virions. All mutations that exist in naturally occurring HIV-1 subtype B populations at a frequency >3%, and were also present in the mutant library, had fitness levels that were >40% of WT. However, a substantial fraction of mutations with high fitness did not occur in natural populations, suggesting another form of selection pressure limiting variation in vivo. Additionally, known protective CTL epitopes occurred preferentially in domains of the HIV-1 CA that were even more genetically fragile than HIV-1 CA as a whole. The extreme genetic fragility of HIV-1 CA may be one reason why cell-mediated immune responses to Gag correlate with better prognosis in HIV-1 infection, and suggests that CA is a good target for therapy and vaccination strategies.

    Genetic linkage analysis in the age of whole-genome sequencing

    For many years, linkage analysis was the primary tool used for the genetic mapping of Mendelian and complex traits with familial aggregation. Linkage analysis was largely supplanted by the wide adoption of genome-wide association studies (GWASs). However, with the recent increased use of whole-genome sequencing (WGS), linkage analysis is again emerging as an important and powerful analysis method for the identification of genes involved in disease aetiology, often in conjunction with WGS filtering approaches. Here, we review the principles of linkage analysis and provide practical guidelines for carrying out linkage studies using WGS data.

    Towards a comprehensive structural coverage of completed genomes: a structural genomics viewpoint

    BACKGROUND: Structural genomics initiatives were established with the aim of solving protein structures on a large scale. For many initiatives, such as the Protein Structure Initiative (PSI), target selection focuses primarily on structurally characterising protein families that, so far, lack a structural representative. It is therefore of considerable interest to gain insights into the number and distribution of these families, and what efforts may be required to achieve a comprehensive structural coverage across all protein families. RESULTS: In this analysis we have derived a comprehensive domain annotation of the genomes using CATH, Pfam-A and Newfam domain families. We consider what proportions of structurally uncharacterised families are accessible to high-throughput structural genomics pipelines, specifically those targeting families containing multiple prokaryotic orthologues. In measuring the domain coverage of the genomes, we show the benefits of selecting targets from structurally uncharacterised domain families whilst also pursuing additional targets from large, structurally characterised protein superfamilies. CONCLUSION: This work suggests that such a combined approach to target selection is essential if structural genomics is to achieve a comprehensive structural coverage of the genomes, leading to greater insights into structure and the mechanisms that underlie protein evolution.

    Bioinformatics and Structural Characterization of a Hypothetical Protein from Streptococcus mutans: Implication of Antibiotic Resistance

    As an oral bacterial pathogen, Streptococcus mutans is known as the aetiologic agent of human dental caries. Among a total of 1960 identified proteins within the genome of this organism, there are about 500 without any known functions. One of these proteins, SMU.440, has very few homologs in the current protein databases and does not fall into any protein functional family. Phylogenetic studies showed that SMU.440 is related to a particular ecological niche and conserved specifically in some oral pathogens, due to lateral gene transfer. The co-occurrence of a MarR protein within the same operon among these oral pathogens suggests that SMU.440 may be associated with antibiotic resistance. The structure determination of SMU.440 revealed that it shares the same fold and a similar pocket as polyketide cyclases, which indicates that it is very likely to bind some polyketide-like molecules. From the interlinking structural and bioinformatics studies, we have concluded that SMU.440 could be involved in polyketide-like antibiotic resistance, providing a better understanding of this hypothetical protein. In addition, the combination of multiple methods used in this study can serve as a general approach for functional studies of proteins with unknown function.

    Haplotype association analyses in resources of mixed structure using Monte Carlo testing

    BACKGROUND: Genomewide association studies have resulted in a great many genomic regions that are likely to harbor disease genes. Thorough interrogation of these specific regions is the logical next step, including regional haplotype studies to identify risk haplotypes upon which the underlying critical variants lie. Pedigrees ascertained for disease can be powerful for genetic analysis due to the cases being enriched for genetic disease. Here we present a Monte Carlo based method to perform haplotype association analysis. Our method, hapMC, allows for the analysis of full-length and sub-haplotypes, including imputation of missing data, in resources of nuclear families, general pedigrees, case-control data or mixtures thereof. Both traditional association statistics and transmission/disequilibrium statistics can be performed. The method includes a phasing algorithm that can be used in large pedigrees and optional use of pseudocontrols. RESULTS: Our new phasing algorithm substantially outperformed the standard expectation-maximization algorithm that is ignorant of pedigree structure, and hence is preferable for resources that include pedigree structure. Through simulation we show that our Monte Carlo procedure maintains the correct type 1 error rates for all resource types. Power comparisons suggest that transmission-disequilibrium statistics are superior for performing association in resources of only nuclear families. For mixed structure resources, however, the newly implemented pseudocontrol approach appears to be the best choice. Results also indicated the value of large high-risk pedigrees for association analysis, which, in the simulations considered, were comparable in power to case-control resources of the same sample size. CONCLUSIONS: We propose hapMC as a valuable new tool to perform haplotype association analyses, particularly for resources of mixed structure. The availability of meta-association and haplotype-mining modules in our suite of Monte Carlo haplotype procedures adds further value to the approach.