32 research outputs found

    Comparative Analysis of Prokaryotic Communities Associated with Organic and Conventional Farming Systems

    One of the most important challenges in agriculture is to determine the effectiveness and environmental impact of particular farming practices. The aim of the present study was to determine and compare the taxonomic composition of the microbiomes established in soil following long-term exposure (14 years) to conventional and organic farming systems (CFS and OFS, respectively). Soil from uncleared forest next to the fields was used as a control. The analysis was based on RT-PCR and pyrosequencing of 16S rRNA genes of bacteria and archaea. The number of bacteria was significantly lower in CFS than in OFS and woodland. The highest amount of archaea was detected in woodland, whereas the amounts in CFS and OFS were lower and similar to each other. The most common phyla in the soil microbial communities analyzed were Proteobacteria (57.9%), Acidobacteria (16.1%), Actinobacteria (7.9%), Verrucomicrobia (2.0%), Bacteroidetes (2.7%) and Firmicutes (4.8%). Woodland soil differed from croplands in the taxonomic composition of microbial phyla. Croplands were enriched with Proteobacteria (mainly the genus Pseudomonas), while Acidobacteria were detected almost exclusively in woodland soil. The most pronounced differences between the CFS and OFS microbiomes were found within the genus Pseudomonas, whose abundance was significantly (p < 0.05) higher in CFS soil than in OFS soil. Other differences between the microbiomes of the cropping systems concerned minor taxa. A higher relative abundance of bacteria belonging to the families Oxalobacteraceae, Koribacteraceae and Nakamurellaceae and the genera Ralstonia, Paenibacillus and Pedobacter was found in CFS as compared with OFS. On the other hand, microbiomes of OFS were enriched with proteobacteria of the families Comamonadaceae (genus Hylemonella) and Hyphomicrobiaceae, actinobacteria from the family Micrococcaceae, and bacteria of the genera Geobacter, Methylotenera, Rhizobium (mainly Rhizobium leguminosarum) and Clostridium. Thus, the fields under OFS and CFS did not differ greatly in the composition of the microbiome. These results, which were also confirmed by cluster analysis, indicate that microbial communities in field soil do not necessarily differ greatly between conventional and organic farming systems. Peer reviewed.

    Putting hands to rest: efficient deep CNN-RNN architecture for chemical named entity recognition with no hand-crafted rules

    Chemical named entity recognition (NER) is an active field of research in biomedical natural language processing. To facilitate the development of new and superior chemical NER systems, BioCreative released the CHEMDNER corpus, an extensive dataset of diverse manually annotated chemical entities. Most of the systems trained on the corpus rely on complicated hand-crafted rules or curated databases for data preprocessing, feature extraction and output post-processing, though modern machine learning algorithms, such as deep neural networks, can automatically design the rules with little to no human intervention. Here we explored this approach by experimenting with various deep learning architectures for targeted tokenisation and named entity recognition. Our final model, based on a combination of convolutional and stateful recurrent neural networks with attention-like loops and hybrid word- and character-level embeddings, reaches near human-level performance on the testing dataset with no manually asserted rules. To make our model easily accessible for standalone use and integration in third-party software, we’ve developed a Python package with a minimalistic user interface.

    Generalising better: Applying deep learning to integrate deleteriousness prediction scores for whole-exome SNV studies

    Many automatic classifiers have been introduced to aid inference of the phenotypic effects of uncategorised nsSNVs (nonsynonymous Single Nucleotide Variations) in theoretical and medical applications. Lately, several meta-estimators have been proposed that combine different predictors, such as PolyPhen and SIFT, to integrate more information in a single score. Although many advances have been made in feature design and in the machine learning algorithms used, the shortage of high-quality reference data, along with the bias towards intensively studied in vitro models, calls for improved generalisation ability in order to further increase classification accuracy and handle records with insufficient data. Since a meta-estimator basically combines different scoring systems with highly complicated nonlinear relationships, we investigated how deep learning (supervised and unsupervised), which is particularly efficient at discovering hierarchies of features, can improve classification performance. While it is believed that one should only use deep learning for high-dimensional input spaces and other models (logistic regression, support vector machines, Bayesian classifiers, etc.) for simpler inputs, we still believe that the ability of neural networks to discover intricate structure in highly heterogeneous datasets can aid a meta-estimator. We compare the performance with various popular predictors, many of which are recommended by the American College of Medical Genetics and Genomics (ACMG), as well as available deep learning-based predictors. Thanks to hardware acceleration we were able to use a computationally expensive genetic algorithm to stochastically optimise hyper-parameters over many generations. Overfitting was hindered by noise injection and dropout, limiting coadaptation of hidden units. Although we stress that this work was not conceived as a tool comparison, but rather as an exploration of the possibilities of applying deep learning in ensemble scores, our results show that even relatively simple modern neural networks can significantly improve both prediction accuracy and coverage. We provide open access to our best model via the website: http://score.generesearch.ru/services/badmut/.
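The two regularisers named in this abstract, noise injection and dropout, can be illustrated with a generic sketch. This is not the authors' code: the function names `add_gaussian_noise` and `dropout`, the choice of Gaussian noise, and the use of "inverted" dropout (rescaling the surviving units) are all assumptions made for illustration.

```python
import random


def add_gaussian_noise(values, sigma, rng):
    """Noise injection: add zero-mean Gaussian noise to every input value.

    `sigma` is the (assumed) noise standard deviation.
    """
    return [v + rng.gauss(0.0, sigma) for v in values]


def dropout(values, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`.

    Surviving units are divided by the keep probability so that the
    expected activation is unchanged during training.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


# Hypothetical activations of one hidden layer.
rng = random.Random(42)
hidden = [0.2, 0.7, 0.1, 0.9]
print(dropout(add_gaussian_noise(hidden, 0.05, rng), 0.5, rng))
```

Because a different random subset of units is silenced on every pass, no unit can rely on a specific partner being present, which is the sense in which dropout limits coadaptation.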

    Be aware of the allele-specific bias and compositional effects in multi-template PCR

    High-throughput sequencing of amplicon libraries is the most widespread and one of the most effective ways to study the taxonomic structure of microbial communities, despite the growing accessibility of whole metagenome sequencing. Due to the targeted amplification, the method provides unparalleled resolution of communities, but at the same time perturbs the initial community structure, thereby reducing data robustness and compromising downstream analyses. Experimental research on these perturbations is largely limited to comparative studies of different PCR protocols, without considering other sources of experimental variation related to characteristics of the initial microbial composition itself. Here we analyse these sources and demonstrate how dramatically they affect the relative abundances of taxa during the PCR cycles. We developed a mathematical model of PCR amplification that assumes heterogeneity of amplification efficiencies and considers the compositional nature of the data. We designed an experiment (five consecutive amplicon cycles, 22–26, with 12 replicates for one real human stool microbial sample) and estimated the dynamics of the microbial community in line with the model. We found high heterogeneity in the amplicon efficiencies of taxa, which leads to non-linear and substantial (up to fivefold) changes in relative abundances during PCR. The analysis of possible sources of heterogeneity revealed a significant association between amplicon efficiencies and the energy of secondary structures of the DNA templates. The result of our work highlights non-trivial changes in the dynamics of real-life microbial communities due to their compositional nature. The observed effects are specific not only to amplicon libraries, but also to any studies of metagenome dynamics.
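The compositional effect described in this abstract can be illustrated with a textbook exponential amplification model. The sketch below is not the authors' model: it assumes that each taxon has a constant per-cycle efficiency e, so that its amplicon count grows by a factor of (1 + e) per cycle. The function name `relative_abundances` and all example values are hypothetical.

```python
def relative_abundances(initial, efficiencies, cycles):
    """Relative abundances after `cycles` PCR cycles.

    Assumption: taxon i grows as n_i * (1 + e_i) ** cycles, with a
    constant efficiency e_i between 0 and 1. The result is normalised
    to sum to 1, which is what makes the data compositional.
    """
    counts = [n * (1.0 + e) ** cycles for n, e in zip(initial, efficiencies)]
    total = sum(counts)
    return [c / total for c in counts]


# Three hypothetical taxa that start at equal abundance.
initial = [100.0, 100.0, 100.0]
efficiencies = [0.95, 0.90, 0.80]

for cycles in (0, 22, 26):
    shares = relative_abundances(initial, efficiencies, cycles)
    print(cycles, [round(p, 3) for p in shares])
```

Under these assumptions, small differences in efficiency compound over the cycles, and the share of any one taxon depends on the efficiencies of all the others because of the normalisation step.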

    Network types.

    Schematic representation of basic deep learning models used in this study. (a) A multilayer perceptron (MLP). (b) A shallow denoising autoencoder (dAE). (c) Connecting dAEs into a stacked denoising autoencoder (sdAE); notice that each individual dAE learns to reconstruct the latent representation from the previous one (data stream is represented by arrows). Colours encode layer functions (combinations are possible): blue = input; light-red = latent; dark-red = dropout (noise); purple = output; hollow = discarded.