15 research outputs found

A Pipeline for Classifying Deleterious Coding Mutations in Agricultural Plants

Author: Anna A. Igolkina
Maria G. Samsonova
Maxim S. Kovalev
Sergey V. Nuzhdin
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

The impact of deleterious variation on both plant fitness and crop productivity is not completely understood and remains a topic of active debate. Deleterious mutations in plants have been predicted solely using sequence conservation methods rather than function-based classifiers, owing to the lack of well-annotated mutational datasets in these organisms. Here, we developed a machine learning classifier based on a dataset of deleterious and neutral mutations in Arabidopsis thaliana by extracting 18 informative features that discriminate deleterious mutations from neutral ones, including 9 novel features not used in previous studies. We examined linear SVM, Gaussian SVM, and Random Forest classifiers, with the latter performing best. Random Forest classifiers exhibited a markedly higher accuracy than the popular PolyPhen-2 tool in the Arabidopsis dataset. Additionally, we tested whether the Random Forest, trained on the Arabidopsis dataset, accurately predicts deleterious mutations in Oryza sativa and Pisum sativum and observed satisfactory levels of accuracy (87% and 93%, respectively), higher than those obtained by PolyPhen-2. Application of transfer learning to the classifiers did not improve their performance. To additionally test the performance of the Random Forest classifier across different angiosperm species, we applied it to annotate deleterious mutations in Cicer arietinum and validated them using population frequency data. Overall, we devised a classifier with the potential to improve the annotation of putative functional mutations in QTL and GWAS hit regions, as well as the evolutionary analysis of the proliferation of deleterious mutations during plant domestication, thus optimizing breeding improvement and development of new cultivars.

Directory of Open Access Journals

Frontiers - Publisher Connector

Historical Routes for Diversification of Domesticated Chickpea Inferred from Landrace Genomics.

Author: Igolkina Anna A
Publication date: 19/07/2023

Ezid

Be aware of the allele-specific bias and compositional effects in multi-template PCR

Author: Anna A. Igolkina
Arina A. Kichko
Evgeny E. Andronov
Ilia Korvigo
Tatiana Aksenova
Publication venue: PeerJ Inc.
Publication date: 01/08/2022

High-throughput sequencing of amplicon libraries is the most widespread and one of the most effective ways to study the taxonomic structure of microbial communities, despite the growing accessibility of whole metagenome sequencing. Due to the targeted amplification, the method provides unparalleled resolution of communities, but at the same time perturbs the initial community structure, thereby reducing data robustness and compromising downstream analyses. Experimental research on these perturbations is largely limited to comparative studies of different PCR protocols, without considering other sources of experimental variation related to characteristics of the initial microbial composition itself. Here we analyse these sources and demonstrate how dramatically they affect the relative abundances of taxa during the PCR cycles. We developed a mathematical model of PCR amplification assuming heterogeneity of amplification efficiencies and considering the compositional nature of the data. We designed an experiment (five consecutive amplicon cycles, 22–26, with 12 replicates for one real human stool microbial sample) and estimated the dynamics of the microbial community in line with the model. We found high heterogeneity in the amplicon efficiencies of taxa, which leads to non-linear and substantial (up to fivefold) changes in relative abundances during PCR. The analysis of possible sources of heterogeneity revealed a significant association between amplicon efficiencies and the energy of secondary structures of the DNA templates. The result of our work highlights non-trivial changes in the dynamics of real-life microbial communities due to their compositional nature. The observed effects are specific not only to amplicon libraries, but also to any studies of metagenome dynamics.
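
The compositional effect described in this abstract can be illustrated with a minimal sketch. It is not the authors' model: it assumes the textbook exponential form in which template i grows by a factor (1 + e_i) per cycle, and the function name, starting counts, and efficiency values are hypothetical, chosen only for illustration.

```python
def relative_abundances(initial_counts, efficiencies, cycles):
    """Relative abundances after `cycles` PCR cycles.

    Assumption (not taken from the paper): each template i is multiplied
    by (1 + efficiencies[i]) in every cycle, with no saturation.
    """
    amplified = [
        n * (1.0 + e) ** cycles
        for n, e in zip(initial_counts, efficiencies)
    ]
    total = sum(amplified)
    # Dividing by the total makes the data compositional: the shares
    # always sum to 1, so a gain for one taxon is a loss for the others.
    return [a / total for a in amplified]


# Hypothetical example: two taxa with equal starting counts but
# slightly different amplification efficiencies.
initial = [1000.0, 1000.0]
efficiencies = [0.95, 0.85]

for cycle in (0, 22, 26):
    shares = relative_abundances(initial, efficiencies, cycle)
    print(cycle, [round(s, 3) for s in shares])
```

Even a small efficiency gap compounds over the 22 to 26 cycles used in the experiment, so the taxon with the higher efficiency takes a growing share of the library although the starting counts are identical.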

Directory of Open Access Journals

PubMed Central

Historical Routes for Diversification of Domesticated Chickpea Inferred from Landrace Genomics

Author: Igolkina Anna A
Longcore Travis
Noujdina Nina V
Nuzhdin Sergey V
Samsonova Maria G
Vishnyakova Margarita
von Wettberg Eric
Publication venue: eScholarship, University of California
Publication date: 01/06/2023

According to archaeological records, chickpea (Cicer arietinum) was first domesticated in the Fertile Crescent about 10,000 years BP. Its subsequent diversification in the Middle East, South Asia, Ethiopia, and the Western Mediterranean, however, remains obscure and cannot be resolved using only archaeological and historical evidence. Moreover, chickpea has two market types, "desi" and "kabuli," whose geographic origin is a matter of debate. To decipher chickpea history, we took genetic data from 421 chickpea landraces unaffected by the green revolution and tested complex historical hypotheses of chickpea migration and admixture on two hierarchical spatial levels: within and between major regions of cultivation. For chickpea migration within regions, we developed popdisp, a Bayesian model of population dispersal from a regional representative center toward the sampling sites that considers geographical proximities between sites. This method confirmed that chickpea spreads within each geographical region along optimal geographical routes rather than by simple diffusion and estimated representative allele frequencies for each region. For chickpea migration between regions, we developed another model, migadmi, that takes allele frequencies of populations and evaluates multiple and nested admixture events. Applying this model to desi populations, we found both Indian and Middle Eastern traces in Ethiopian chickpea, suggesting the presence of a seaway from South Asia to Ethiopia. As for the origin of kabuli chickpeas, we found significant evidence for their origin in Turkey rather than Central Asia.

eScholarship - University of California

Analysis of Gene Expression Variance in Schizophrenia Using Structural Equation Modeling

Author: Anna A. Igolkina
Chris Armoskus
Jeremy R. B. Newman
Lauren M. McIntyre
Maria G. Samsonova
Oleg V. Evgrafov
Sergey V. Nuzhdin
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2018

Schizophrenia (SCZ) is a psychiatric disorder of unknown etiology. There is evidence suggesting that aberrations in neurodevelopment are a significant attribute of schizophrenia pathogenesis and progression. To identify biologically relevant molecular abnormalities affecting neurodevelopment in SCZ, we used cultured neural progenitor cells derived from olfactory neuroepithelium (CNON cells). Here, we tested the hypothesis that variance in gene expression differs between individuals from SCZ and control groups. In CNON cells, variance in gene expression was significantly higher in SCZ samples than in control samples. Variance in gene expression was enriched in five molecular pathways: serine biosynthesis, PI3K-Akt, MAPK, neurotrophin and focal adhesion. More than 14% of variance in disease status was explained within the logistic regression model (C-value = 0.70) by predictors accounting for gene expression in 69 genes from these five pathways. Structural equation modeling (SEM) was applied to explore how the structure of these five pathways was altered between SCZ patients and controls. Four out of five pathways showed differences in the estimated relationships among genes: between KRAS and NF1, and KRAS and SOS1 in the MAPK pathway; between PSPH and SHMT2 in serine biosynthesis; between AKT3 and TSC2 in the PI3K-Akt signaling pathway; and between CRK and RAPGEF1 in the focal adhesion pathway. Our analysis provides evidence that variance in gene expression is an important characteristic of SCZ, and that SEM is a promising method for uncovering altered relationships between specific genes, thus suggesting affected gene regulation associated with the disease. We identified altered gene-gene interactions in pathways enriched for genes with increased variance in expression in SCZ. These pathways and loci were previously implicated in SCZ, providing further support for the hypothesis that gene expression variance plays an important role in the etiology of SCZ.

Directory of Open Access Journals

Frontiers - Publisher Connector

H3K4me3, H3K9ac, H3K27ac, H3K27me3 and H3K9me3 Histone Tags Suggest Distinct Regulatory Evolution of Open and Condensed Chromatin Landmarks

Author: Aleksey A. Popov
Anna A. Igolkina
Anton Buzdin
Arsenii Zinkevich
Daniil M. Nikitin
Daria Nikolaeva
Dmitry Penzar
Kristina O. Karandasheva
Maria V. Selifanova
Victor Tkachev
Publication venue: 'MDPI AG'
Publication date: 05/09/2019

Background: Transposons are selfish genetic elements that self-reproduce in host DNA. They were active during evolutionary history and now occupy almost half of mammalian genomes. Close insertions of transposons reshaped structure and regulation of many genes considerably. Co-evolution of transposons and host DNA frequently results in the formation of new regulatory regions. Previously we published a concept that the proportion of functional features held by transposons positively correlates with the rate of regulatory evolution of the respective genes. Methods: We ranked human genes and molecular pathways according to their regulatory evolution rates based on high throughput genome-wide data on five histone modifications (H3K4me3, H3K9ac, H3K27ac, H3K27me3, H3K9me3) linked with transposons for five human cell lines. Results: Based on the total of approximately 1.5 million histone tags, we ranked regulatory evolution rates for 25075 human genes and 3121 molecular pathways and identified groups of molecular processes that showed signs of either fast or slow regulatory evolution. However, histone tags showed different regulatory patterns and formed two distinct clusters: promoter/active chromatin tags (H3K4me3, H3K9ac, H3K27ac) vs. heterochromatin tags (H3K27me3, H3K9me3). Conclusion: In humans, transposon-linked histone marks evolved in a coordinated way depending on their functional roles

Multidisciplinary Digital Publishing Institute

Heterogeneity of the GFP fitness landscape and data-driven protein design

Author: Alaball Pujol Maria-Elisenda
Bozhanova Nina G
Fleiss Aubin
Gonzalez Somermeyer Louisa
Igolkina Anna A
Kondrashov Fyodor
Meiler Jens
Mishin Alexander S
Putintseva Ekaterina V
Sarkisyan Karen S
Publication venue: 'eLife Sciences Publications, Ltd'
Publication date: 01/01/2022

Studies of protein fitness landscapes reveal biophysical constraints guiding protein evolution and empower prediction of functional proteins. However, generalisation of these findings is limited due to scarceness of systematic data on fitness landscapes of proteins with a defined evolutionary relationship. We characterized the fitness peaks of four orthologous fluorescent proteins with a broad range of sequence divergence. While two of the four studied fitness peaks were sharp, the other two were considerably flatter, being almost entirely free of epistatic interactions. Mutationally robust proteins, characterized by a flat fitness peak, were not optimal templates for machine-learning-driven protein design – instead, predictions were more accurate for fragile proteins with epistatic landscapes. Our work paves insights for practical application of fitness landscape heterogeneity in protein engineering

ZENODO

PubMed Central

IST Austria: PubRep (Institute of Science and Technology)

Towards Understanding Afghanistan Pea Symbiotic Phenotype Through the Molecular Modeling of the Interaction Between LykX-Sym10 Receptor Heterodimer and Nod Factors

Author: Andronov Evgeny E.
Igolkina Anna A.
Kuliaev Pavel O.
Pidko E.A.
Porozov Yuri B.
Solovev Yaroslav V.
Sulima Anton S.
Zhukov Vladimir A.
Publication venue: 'Frontiers Media SA'
Publication date: 01/01/2021

The difference in symbiotic specificity between peas of Afghanistan and European phenotypes was investigated using molecular modeling. Considering segregating amino acid polymorphism, we examined interactions of pea LykX-Sym10 receptor heterodimers with four forms of Nodulation factor (NF) that varied in natural decorations (acetylation and length of the glucosamine chain). First, we showed the stability of the LykX-Sym10 dimer during molecular dynamics (MD) in solvent and in the presence of a membrane. Then, four NFs were separately docked to one European and two Afghanistan dimers, and the results of these interactions were in line with corresponding pea symbiotic phenotypes. The European variant of the LykX-Sym10 dimer effectively interacts with both acetylated and non-acetylated forms of NF, while the Afghanistan variants successfully interact with the acetylated form only. We additionally demonstrated that the length of the NF glucosamine chain contributes to controlling the effectiveness of the symbiotic interaction. The obtained results support a recent hypothesis that the LykX gene is a suitable candidate for the unidentified Sym2 allele, the determinant of pea specificity toward Rhizobium leguminosarum bv. viciae strains producing NFs with or without an acetylation decoration. The developed modeling methodology demonstrated its power in multiple searches for genetic determinants, when experimental detection of such determinants has proven extremely difficult.

TU Delft Repository