
    CMASA: an accurate algorithm for detecting local protein structural similarity and its application to enzyme catalytic site annotation

    Background: The rapid development of structural genomics has resulted in many proteins of "unknown function" being deposited in the Protein Data Bank (PDB); the functional prediction of these proteins has thus become a challenge for structural bioinformatics. Several sequence-based and structure-based methods have been developed to predict protein function, but their accuracy, sensitivity, and computational speed still need to be improved. Here, an accurate algorithm, CMASA (Contact MAtrix based local Structural Alignment algorithm), has been developed to predict the unknown functions of proteins based on local protein structural similarity. The algorithm has been evaluated on a test set of 164 enzyme families and compared with other methods. Results: The evaluation shows that CMASA is highly accurate (0.96), sensitive (0.86), and fast enough to be used in large-scale functional annotation. Compared with both sequence-based and global structure-based methods, CMASA can find not only remote homologous proteins but also active site convergence. Compared with other local structure comparison-based methods, CMASA performs better than both FFF (a method using geometry to predict protein function) and SPASM (a local structure alignment method), and it is more sensitive than PINTS and more accurate than JESS (both local structure alignment methods). CMASA was applied to annotate the enzyme catalytic sites of the non-redundant PDB, and at least 166 putative catalytic sites were suggested that cannot be found through the Catalytic Site Atlas (CSA). Conclusions: CMASA is an accurate algorithm for detecting local protein structural similarity, and it holds several advantages in predicting enzyme active sites. CMASA can be used in large-scale enzyme active site annotation. CMASA is available through a mail-based server (http://159.226.149.45/other1/CMASA/CMASA.htm).

    Evolutionary relationships among barley and Arabidopsis core circadian clock and clock-associated genes

    The circadian clock regulates a multitude of plant developmental and metabolic processes. In crop species, it contributes significantly to plant performance and productivity and to the adaptation and geographical range over which crops can be grown. To understand the clock in barley and how it relates to the components in the Arabidopsis thaliana clock, we have performed a systematic analysis of core circadian clock and clock-associated genes in barley, Arabidopsis and another eight species including tomato, potato, a range of monocotyledonous species and the moss, Physcomitrella patens. We have identified orthologues and paralogues of Arabidopsis genes which are conserved in all species, monocot/dicot differences, species-specific differences and variation in gene copy number (e.g. gene duplications among the various species). We propose that the common ancestor of barley and Arabidopsis had two-thirds of the key clock components identified in Arabidopsis prior to the separation of the monocot/dicot groups. After this separation, multiple independent gene duplication events took place in both monocot and dicot ancestors. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1007/s00239-015-9665-0) contains supplementary material, which is available to authorized users

    Structural Annotation of Mycobacterium tuberculosis Proteome

    Of the ∼4000 ORFs identified through the genome sequence of Mycobacterium tuberculosis (TB) H37Rv, experimentally determined structures are available for 312. Since knowledge of protein structures is essential to obtain a high-resolution understanding of the underlying biology, we seek to obtain a structural annotation for the genome, using computational methods. Structural models were obtained and validated for ∼2877 ORFs, covering ∼70% of the genome. Functional annotation of each protein was based on fold-based functional assignments and a novel binding site based ligand association. New algorithms for binding site detection and genome scale binding site comparison at the structural level, recently reported from the laboratory, were utilized. Besides these, the annotation covers detection of various sequence and sub-structural motifs and quaternary structure predictions based on the corresponding templates. The study provides an opportunity to obtain a global perspective of the fold distribution in the genome. The annotation indicates that cellular metabolism can be achieved with only 219 folds. New insights about the folds that predominate in the genome, as well as the fold-combinations that make up multi-domain proteins are also obtained. 1728 binding pockets have been associated with ligands through binding site identification and sub-structure similarity analyses. The resource (http://proline.physics.iisc.ernet.in/Tbstructuralannotation), being one of the first to be based on structure-derived functional annotations at a genome scale, is expected to be useful for better understanding of TB and for application in drug discovery. The reported annotation pipeline is fairly generic and can be applied to other genomes as well

    Extreme genetic fragility of the HIV-1 capsid

    Genetic robustness, or fragility, is defined as the ability, or lack thereof, of a biological entity to maintain function in the face of mutations. Viruses that replicate via RNA intermediates exhibit high mutation rates, and robustness should be particularly advantageous to them. The capsid (CA) domain of the HIV-1 Gag protein is under strong pressure to conserve functional roles in viral assembly, maturation, uncoating, and nuclear import. However, CA is also under strong immunological pressure to diversify. Therefore, it would be particularly advantageous for CA to evolve genetic robustness. To measure the genetic robustness of HIV-1 CA, we generated a library of single amino acid substitution mutants, encompassing almost half the residues in CA. Strikingly, we found HIV-1 CA to be the most genetically fragile protein that has been analyzed using such an approach, with 70% of mutations yielding replication-defective viruses. Although CA participates in several steps in HIV-1 replication, analysis of conditionally (temperature sensitive) and constitutively non-viable mutants revealed that the biological basis for its genetic fragility was primarily the need to coordinate the accurate and efficient assembly of mature virions. All mutations that exist in naturally occurring HIV-1 subtype B populations at a frequency >3%, and were also present in the mutant library, had fitness levels that were >40% of WT. However, a substantial fraction of mutations with high fitness did not occur in natural populations, suggesting another form of selection pressure limiting variation in vivo. Additionally, known protective CTL epitopes occurred preferentially in domains of the HIV-1 CA that were even more genetically fragile than HIV-1 CA as a whole. 
The extreme genetic fragility of HIV-1 CA may be one reason why cell-mediated immune responses to Gag correlate with better prognosis in HIV-1 infection, and suggests that CA is a good target for therapy and vaccination strategies

    Exploiting protein flexibility to predict the location of allosteric sites

    Background: Allostery is one of the most powerful and common ways of regulation of protein activity. However, for most allosteric proteins identified to date the mechanistic details of allosteric modulation are not yet well understood. Uncovering common mechanistic patterns underlying allostery would allow not only a better academic understanding of the phenomena, but it would also streamline the design of novel therapeutic solutions. This relatively unexplored therapeutic potential and the putative advantages of allosteric drugs over classical active-site inhibitors fuel the attention allosteric-drug research is receiving at present. A first step to harness the regulatory potential and versatility of allosteric sites, in the context of drug-discovery and design, would be to detect or predict their presence and location. In this article, we describe a simple computational approach, based on the effect allosteric ligands exert on protein flexibility upon binding, to predict the existence and position of allosteric sites on a given protein structure. Results: By querying the literature and a recently available database of allosteric sites, we gathered 213 allosteric proteins with structural information that we further filtered into a non-redundant set of 91 proteins. We performed normal-mode analysis and observed significant changes in protein flexibility upon allosteric-ligand binding in 70% of the cases. These results agree with the current view that allosteric mechanisms are in many cases governed by changes in protein dynamics caused by ligand binding. Furthermore, we implemented an approach that achieves 65% positive predictive value in identifying allosteric sites within the set of predicted cavities of a protein (stricter parameters set, 0.22 sensitivity), by combining the current analysis on dynamics with previous results on structural conservation of allosteric sites. 
We also analyzed four biological examples in detail, revealing that this simple coarse-grained methodology is able to capture the effects triggered by allosteric ligands already described in the literature. Conclusions: We introduce a simple computational approach to predict the presence and position of allosteric sites in a protein based on the analysis of changes in protein normal modes upon the binding of a coarse-grained ligand at predicted cavities. Its performance has been demonstrated using a newly curated non-redundant set of 91 proteins with reported allosteric properties. The software developed in this work is available upon request from the authors

    Measurement of the cross-section of high transverse momentum vector bosons reconstructed as single jets and studies of jet substructure in pp collisions at √s = 7 TeV with the ATLAS detector

    This paper presents a measurement of the cross-section for high transverse momentum W and Z bosons produced in pp collisions and decaying to all-hadronic final states. The data used in the analysis were recorded by the ATLAS detector at the CERN Large Hadron Collider at a centre-of-mass energy of √s = 7 TeV and correspond to an integrated luminosity of $4.6\ {\rm fb}^{-1}$. The measurement is performed by reconstructing the boosted W or Z bosons in single jets. The reconstructed jet mass is used to identify the W and Z bosons, and a jet substructure method based on energy cluster information in the jet centre-of-mass frame is used to suppress the large multi-jet background. The cross-section for events with a hadronically decaying W or Z boson, with transverse momentum $p_{\rm T} > 320$ GeV and pseudorapidity $|\eta| < 1.9$, is measured to be $\sigma_{W+Z} = 8.5 \pm 1.7$ pb and is compared to next-to-leading-order calculations. The selected events are further used to study jet grooming techniques

    Search for pair-produced long-lived neutral particles decaying to jets in the ATLAS hadronic calorimeter in pp collisions at √s = 8 TeV

    The ATLAS detector at the Large Hadron Collider at CERN is used to search for the decay of a scalar boson to a pair of long-lived particles, neutral under the Standard Model gauge group, in 20.3 fb−1 of data collected in proton–proton collisions at √s = 8 TeV. This search is sensitive to long-lived particles that decay to Standard Model particles producing jets at the outer edge of the ATLAS electromagnetic calorimeter or inside the hadronic calorimeter. No significant excess of events is observed. Limits are reported on the product of the scalar boson production cross section times branching ratio into long-lived neutral particles as a function of the proper lifetime of the particles. Limits are reported for boson masses from 100 GeV to 900 GeV, and a long-lived neutral particle mass from 10 GeV to 150 GeV

    Search for direct pair production of the top squark in all-hadronic final states in proton-proton collisions at √s = 8 TeV with the ATLAS detector

    The results of a search for direct pair production of the scalar partner to the top quark using an integrated luminosity of 20.1 fb−1 of proton–proton collision data at √s = 8 TeV recorded with the ATLAS detector at the LHC are reported. The top squark is assumed to decay via $\tilde{t} \to t\tilde{\chi}^0_1$ or $\tilde{t} \to b\tilde{\chi}^\pm_1 \to bW^{(*)}\tilde{\chi}^0_1$, where $\tilde{\chi}^0_1$ ($\tilde{\chi}^\pm_1$) denotes the lightest neutralino (chargino) in supersymmetric models. The search targets a fully-hadronic final state in events with four or more jets and large missing transverse momentum. No significant excess over the Standard Model background prediction is observed, and exclusion limits are reported in terms of the top squark and neutralino masses and as a function of the branching fraction of $\tilde{t} \to t\tilde{\chi}^0_1$. For a branching fraction of 100%, top squark masses in the range 270–645 GeV are excluded for $\tilde{\chi}^0_1$ masses below 30 GeV. For a branching fraction of 50% to either $\tilde{t} \to t\tilde{\chi}^0_1$ or $\tilde{t} \to b\tilde{\chi}^\pm_1$, and assuming the $\tilde{\chi}^\pm_1$ mass to be twice the $\tilde{\chi}^0_1$ mass, top squark masses in the range 250–550 GeV are excluded for $\tilde{\chi}^0_1$ masses below 60 GeV

    Genomic tools development for Aquilegia: construction of a BAC-based physical map

    Background: The genus Aquilegia, consisting of approximately 70 taxa, is a member of the basal eudicot lineage, Ranunculales, which is evolutionarily intermediate between monocots and core eudicots, and represents a relatively unstudied clade in the angiosperm phylogenetic tree that bridges the gap between these two major plant groups. Aquilegia species are closely related and their distribution covers highly diverse habitats. These provide rich resources to better understand the genetic basis of adaptation to different pollinators and habitats that in turn leads to rapid speciation. To gain insights into the genome structure and facilitate gene identification, comparative genomics and whole-genome shotgun sequencing assembly, BAC-based genomics resources are of crucial importance. Results: BAC-based genomic resources, including two BAC libraries, a physical map with anchored markers and BAC end sequences, were established from A. formosa. The physical map was composed of a total of 50,155 BAC clones in 832 contigs and 3939 singletons, covering 21X genome equivalents. These contigs spanned a physical length of 689.8 Mb (~2.3X of the genome), suggesting the complex heterozygosity of the genome. A set of 197 markers was developed from ESTs induced by drought stress, or involved in anthocyanin biosynthesis or floral development, and was integrated into the physical map. Among these were 87 genetically mapped markers that anchored 54 contigs, spanning 76.4 Mb (25.5%) across the genome. Analysis of a selection of 12,086 BAC end sequences (BESs) from the minimal tiling path (MTP) allowed a preview of the Aquilegia genome organization, including identification of transposable elements, simple sequence repeats and gene content. Common repetitive elements previously reported in both monocots and core eudicots were identified in Aquilegia, suggesting the value of this genome in connecting the two major plant clades. Comparison with sequenced plant genomes indicated a higher similarity to grapevine (Vitis vinifera) than to rice and Arabidopsis in the transcriptomes. Conclusions: The A. formosa BAC-based genomic resources provide valuable tools to study the Aquilegia genome. Further integration of other existing genomics resources, such as ESTs, into the physical map should enable better understanding of the molecular mechanisms underlying adaptive radiation and elaboration of floral morphology.

    The role of hypothalamic H1 receptor antagonism in antipsychotic-induced weight gain

    Treatment with second generation antipsychotics (SGAs), notably olanzapine and clozapine, causes severe obesity side effects. Antagonism of histamine H1 receptors has been identified as a main cause of SGA-induced obesity, but the molecular mechanisms associated with this antagonism in different stages of SGA-induced weight gain remain unclear. This review aims to explore the potential role of hypothalamic histamine H1 receptors in different stages of SGA-induced weight gain/obesity and the molecular pathways related to SGA-induced antagonism of these receptors. Initial data have demonstrated the importance of hypothalamic H1 receptors in both short- and long-term SGA-induced obesity. Blocking hypothalamic H1 receptors by SGAs activates AMP-activated protein kinase (AMPK), a well-known feeding regulator. During short-term treatment, hypothalamic H1 receptor antagonism by SGAs may activate the AMPK—carnitine palmitoyltransferase 1 signaling to rapidly increase caloric intake and result in weight gain. During long-term SGA treatment, hypothalamic H1 receptor antagonism can reduce thermogenesis, possibly by inhibiting the sympathetic outflows to the brainstem rostral raphe pallidus and rostral ventrolateral medulla, therefore decreasing brown adipose tissue thermogenesis. Additionally, blocking of hypothalamic H1 receptors by SGAs may also contribute to fat accumulation by decreasing lipolysis but increasing lipogenesis in white adipose tissue. In summary, antagonism of hypothalamic H1 receptors by SGAs may time-dependently affect the hypothalamus-brainstem circuits to cause weight gain by stimulating appetite and fat accumulation but reducing energy expenditure. The H1 receptor and its downstream signaling molecules could be valuable targets for the design of new compounds for treating SGA-induced weight gain/obesity