8 research outputs found

    Census of halide-binding sites in protein structures

    Get PDF
    Motivation: Halides are negatively charged ions of halogens, forming fluorides (F-), chlorides (Cl-), bromides (Br-) and iodides (I-). These anions are quite reactive and interact both specifically and non-specifically with proteins. Despite their ubiquitous presence and important roles in protein function, little is known about the preferences of halides binding to proteins. To address this problem, we performed the analysis of halide-protein interactions, based on the entries in the Protein Data Bank. Results: We have compiled a pipeline for the quick analysis of halide-binding sites in proteins using the available software. Our analysis revealed that all of halides are strongly attracted by the guanidinium moiety of arginine side chains, however, there are also certain preferences among halides for other partners. Furthermore, there is a certain preference for coordination numbers in the binding sites, with a correlation between coordination numbers and amino acid composition. This pipeline can be used as a tool for the analysis of specific halide-protein interactions and assist phasing experiments relying on halides as anomalous scatters

    No More Tears: Mining Sequencing Data for Novel Bt Cry Toxins with CryProcessor

    No full text
    Bacillus thuringiensis (Bt) is a natural pathogen of insects and some other groups of invertebrates that produces three-domain Cry (3d-Cry) toxins, which are highly host-specific pesticidal proteins. These proteins represent the most commonly used bioinsecticides in the world and are used for commercial purposes on the market of insecticides, being convergent with the paradigm of sustainable growth and ecological development. Emerging resistance to known toxins in pests stresses the need to expand the list of known toxins to broaden the horizons of insecticidal approaches. For this purpose, we have elaborated a fast and user-friendly tool called CryProcessor, which allows productive and precise mining of 3d-Cry toxins. The only existing tool for mining Cry toxins, called a BtToxin_scanner, has significant limitations such as limited query size, lack of accuracy and an outdated database. In order to find a proper solution to these problems, we have developed a robust pipeline, capable of precise 3d-Cry toxin mining. The unique feature of the pipeline is the ability to search for Cry toxins sequences directly on assembly graphs, providing an opportunity to analyze raw sequencing data and overcoming the problem of fragmented assemblies. Moreover, CryProcessor is able to predict precisely the domain layout in arbitrary sequences, allowing the retrieval of sequences of definite domains beyond the bounds of a limited number of toxins presented in CryGetter. Our algorithm has shown efficiency in all its work modes and outperformed its analogues on large amounts of data. Here, we describe its main features and provide information on its benchmarking against existing analogues. CryProcessor is a novel, fast, convenient, open source (https://github.com/lab7arriam/cry_processor), platform-independent, and precise instrument with a console version and elaborated web interface (https://lab7.arriam.ru/tools/cry_processor). Its major merits could make it possible to carry out massive screening for novel 3d-Cry toxins and obtain sequences of specific domains for further comprehensive in silico experiments in constructing artificial toxins

    The Role of NOTCH Pathway Genes in the Inherited Susceptibility to Aortic Stenosis

    No full text
    The NOTCH-signaling pathway is responsible for intercellular interactions and cell fate commitment. Recently, NOTCH pathway genes were demonstrated to play an important role in aortic valve development, leading to an increased calcified aortic valve disease (CAVD) later in life. Here, we further investigate the association between genetic variants in the NOTCH pathway genes and aortic stenosis in a case–control study of 90 CAVD cases and 4723 controls using target panel sequencing of full-length 20 genes from a NOTCH-related pathway (DVL2, DTX2, MFNG, NUMBL, LFNG, DVL1, DTX4, APH1A, DTX1, APH1B, NOTCH1, ADAM17, DVL3, NCSTN, DTX3L, ILK, RFNG, DTX3, NOTCH4, PSENEN). We identified a common intronic variant in NOTCH1, protecting against CAVD development (rs3812603), as well as several rare and unique new variants in NOTCH-pathway genes (DTX4, NOTCH1, DTX1, DVL2, NOTCH1, DTX3L, DVL3), with a prominent effect of the protein structure and function

    Whole-exome sequencing provides insights into monogenic disease prevalence in Northwest Russia

    No full text
    Abstract Background Allele frequency data from large exome and genome aggregation projects such as the Genome Aggregation Database (gnomAD) are of ultimate importance to the interpretation of medical resequencing data. However, allele frequencies might significantly differ in poorly studied populations that are underrepresented in large‐scale projects, such as the Russian population. Methods In this work, we leveraged our access to a large dataset of 694 exome samples to analyze genetic variation in the Northwest Russia. We compared the spectrum of genetic variants to the dbSNP build 151, and made estimates of ClinVar‐based autosomal recessive (AR) disease allele prevalence as compared to gnomAD r. 2.1. Results An estimated 9.3% of discovered variants were not present in dbSNP. We report statistically significant overrepresentation of pathogenic variants for several Mendelian disorders, including phenylketonuria (PAH, rs5030858), Wilson's disease (ATP7B, rs76151636), factor VII deficiency (F7, rs36209567), kyphoscoliosis type of Ehlers‐Danlos syndrome (FKBP14, rs542489955), and several other recessive pathologies. We also make primary estimates of monogenic disease incidence in the population, with retinal dystrophy, cystic fibrosis, and phenylketonuria being the most frequent AR pathologies. Conclusion Our observations demonstrate the utility of population‐specific allele frequency data to the diagnosis of monogenic disorders using high‐throughput technologies

    Complex trait susceptibilities and population diversity in a sample of 4,145 Russians

    No full text
    Abstract The population of Russia consists of more than 150 local ethnicities. The ethnic diversity and geographic origins, which extend from eastern Europe to Asia, make the population uniquely positioned to investigate the shared properties of inherited disease risks between European and Asian ancestries. We present the analysis of genetic and phenotypic data from a cohort of 4,145 individuals collected in three metro areas in western Russia. We show the presence of multiple admixed genetic ancestry clusters spanning from primarily European to Asian and high identity-by-descent sharing with the Finnish population. As a result, there was notable enrichment of Finnish-specific variants in Russia. We illustrate the utility of Russian-descent cohorts for discovery of novel population-specific genetic associations, as well as replication of previously identified associations that were thought to be population-specific in other cohorts. Finally, we provide access to a database of allele frequencies and GWAS results for 464 phenotypes
    corecore