840 research outputs found
Machine learning classification of microbial community compositions to predict anthropogenic pollutants in the Baltic Sea
Microbial communities react rapidly and specifically to changing environments, indicating distinct microbial fingerprints for a given environmental state. Machine learning with community data predicted the pollutants glyphosate and 2,4,6-trinitrotoluene, both detected in the Baltic Sea, using the developed R package “phyloseq2ML”. Predictions by Random Forest and Artificial Neural Network were accurate. Relevant taxa were identified. The interpretability of machine learning models was found to be of particular importance. Microbial communities predicted even minor influencing factors in complex environments.
Consider the community : developing predictive linkages between community structure and performance in microbial fuel cells
The complex, dynamic nature of microbial communities in both natural and engineered environments complicates the work of scientists and engineers who wish to channel microbial interactions for societal good. The successful management of these communities towards engineering goals is dependent on developing predictive linkages between community structure and functional outputs. The performance of microbial fuel cells (MFCs), an emerging environmental biotechnology, is driven by a diverse microbial community capable of converting the chemical potential energy contained in waste streams to electrical energy. This technology stands to benefit greatly from an increased understanding of the microorganisms contained within as it transitions from the laboratory to practical application. MFCs also offer a controlled environment in which new approaches to developing predictive understandings of microbial communities can be developed. Revolutions in molecular science over the past decade paved the way for the rapid increase in genomic data available for microbial communities from a wide range of environments. Increases in computing power and accessibility over the same period provide a means by which the amassed community data can be mined for potential interactions and linked to functional outcomes. One of the methods through which this can be done is the use of artificial neural networks (ANNs). ANN-based models can be used to generate accurate microbial assemblage predictions across a variety of environments, but have never been applied to the microbial communities of environmental biotechnologies. In the present dissertation, MFC biofilms are analyzed over time, across reactor designs, under varying environmental conditions, and following pH disruption to identify core community membership. Results demonstrated that deterministic interactions shaped consistent community structures characterized by the formation of highly conductive anodic biofilms.
The core MFC community is defined by a high abundance of anode-respiring Geobacter sulfurreducens and biomass-fermenting Aminiphilus circumscriptus, along with other syntrophic bacteria. Community structure shifted into repeatable formations following the introduction of various substrates and wastewaters. Under changing conditions, reactor performance in terms of power generation, treatment rates, and coulombic efficiencies was repeatable and linked to community composition using ANN models. ANN models that incorporated community predictions performed significantly better than those based solely on environmental parameters and predicted all performance metrics within 6%, providing the first evidence for the value of including community data in ANN-based MFC models. Community composition could also be linked to biofilm stability following exposure to low pH solutions. Through the first quantitative evaluations of biofilm resilience in MFCs, a correlation between the relative abundance of Geobacteraceae and process stability was observed; however, ANN models that considered the relative abundance of other bacteria predicted stability more accurately. Further development of these models can be used in practical settings to determine and avoid risk of deactivation during operation. This dissertation characterizes a single MFC community over a variety of conditions and represents the first attempt to use machine-learning-based approaches to connect community structure to performance in environmental biotechnology applications. The further development of these and other similar artificial intelligence data-mining tools will improve the management of microbial communities that drive environmental biotechnologies like MFCs and spur them towards practical application.
Strengthening linkages between community, structure, interactions, and function in these technologies may be applied across industries, inspiring new applications and innovations involving microbial communities. Keywords: environmental biotechnology, microbial fuel cells, artificial neural networks, biofilms, microbial communit
Isochronous rhythmic organization of learned animal vocalizations
The evolutionary path that led to music as we know it today is difficult to trace. Cross-species comparative research can help us uncover the biological substrates that enabled humans to develop this peculiar behavior. Rhythm, the organization of events in time, is a central component in the structure of all forms of music. Oftentimes musical rhythm gives rise to a perceptually isochronous beat, or pulse. Learned vocalizations of non-human animals, such as birdsong and the songs of certain bat species, show striking parallels to vocal music (i.e. human song). This thesis investigates these vocalizations for the presence of an isochronous rhythmic structure that could allow a conspecific listener to perceive such a beat. To this end, I have developed a generate-and-test (GAT) method to extract an isochronous pulse from a temporal sequence of events, such as the onsets of notes. This method is compared to a variety of existing analytic techniques for analyzing different aspects of rhythms in vocalizations, movements and other behaviors developing over time. The suitability of the different methods for addressing particular questions is illustrated through various examples. The application of the GAT approach to different types of vocalizations of the greater sac-winged bat (Saccopteryx bilineata) revealed a common temporal regularity that might point towards an interesting relationship between physiologically determined rhythm and the rhythm of learned social vocalizations. In the songs of zebra finches (Taeniopygia guttata) we discovered a hierarchical isochronous structure that is reminiscent of the metrical structure of many types of music. We then report the effect of genetic manipulations on the song learning success of zebra finches. The expression of FoxP2, a gene involved in speech acquisition and birdsong learning, as well as of two related genes, FoxP1 and FoxP4, was experimentally reduced in juvenile birds during their learning period.
Among other effects, the adult birds produced song with an impaired isochronous structure. Surprisingly, control animals whose FoxP levels were not reduced showed a similar effect in this regard. I discuss possible interpretations of this result in the light of current knowledge about neural mechanisms and behavioral processes of song learning and production.
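The abstract above does not spell out how the generate-and-test (GAT) method works, so the following is only a minimal sketch of the general idea, under stated assumptions: candidate pulses (a period plus a phase) are generated on a grid, and each is tested by how far the event onsets fall from the nearest pulse. The function names, the grid of candidate periods, the number of phase steps, and the root-mean-square deviation score are all hypothetical choices for illustration, not the thesis's actual procedure.

```python
def pulse_deviation(onsets, period, phase):
    """RMS distance from each onset to its nearest pulse, as a fraction of the period.

    Hypothetical goodness-of-fit score; 0.0 means every onset lies exactly on a pulse.
    """
    total = 0.0
    for t in onsets:
        offset = (t - phase) % period            # position within the pulse cycle
        dist = min(offset, period - offset)      # distance to the nearest pulse
        total += (dist / period) ** 2
    return (total / len(onsets)) ** 0.5


def best_pulse(onsets, periods, phase_steps=20):
    """Generate candidate (period, phase) pulses and keep the best-fitting one.

    Returns a (score, period, phase) tuple. Ties keep the earliest candidate.
    """
    best = None
    for period in periods:                       # "generate" step: candidate periods
        for k in range(phase_steps):             # ... and candidate phases per period
            phase = onsets[0] + period * k / phase_steps
            score = pulse_deviation(onsets, period, phase)   # "test" step
            if best is None or score < best[0]:
                best = (score, period, phase)
    return best


# Note onsets (in seconds) with one skipped beat at 1.5 s.
onsets = [0.0, 0.5, 1.0, 2.0, 2.5]
score, period, phase = best_pulse(onsets, periods=[0.3, 0.4, 0.5, 0.7])
```

Normalizing the deviation by the period keeps very short periods from winning merely because every onset is close to some pulse in absolute time. Integer fractions of the true period (0.25 s in this example) would still fit equally well, so a real analysis would need an additional criterion to choose among them.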
The molecular underpinnings of neuronal cell identity in the stomatogastric ganglion of Cancer borealis
Throughout the life of an organism, the nervous system must be able to balance changing in response to environmental stimuli with the need to produce reliable, repeatable activity patterns to create stereotyped behaviors. Understanding the mechanisms responsible for this regulation requires a wealth of knowledge about the neural system, ranging from network connectivity and cell type identification to intrinsic neuronal excitability and transcriptomic expression. To make strides in this area, we have employed the well-described stomatogastric nervous system of the Jonah crab Cancer borealis to examine the molecular underpinnings and regulation of neuron cell identity. Several crustacean circuits, including the stomatogastric nervous system and the cardiac ganglion, continue to provide important new insights into circuit dynamics and modulation (Diehl, White, Stein, & Nusbaum, 2013; Marder, 2012; Marder & Bucher, 2007; Williams et al., 2013), but this work has been partially hampered by the lack of extensive molecular sequence knowledge in crustaceans. Here we generated a de novo transcriptome assembly from central nervous system tissue for C. borealis, producing 42,766 contigs, and focused on an initial identification, curation, and comparison of genes that will have the most profound impact on our understanding of circuit function in these species. This included genes for 34 distinct ion channel types, 17 biogenic amine and 5 GABA receptors, 28 major transmitter receptor subtypes including glutamate and acetylcholine receptors, and 6 gap junction proteins (the Innexins). ... With this reference transcriptome and annotated sequences in hand, we sought to determine the strengths and limitations of using the molecular profile of neurons to classify them into cell types. ...
Since the resulting activity of a neuron is the product of the expression of ion channel genes, we sought to further probe the expression profile of neurons across a range of cell types to understand how these patterns of mRNA abundance relate to the properties of individual cell types. ... Finally, we sought to better understand the molecular underpinnings of how these correlated patterns of mRNA expression are generated and maintained. Includes bibliographical reference
Variability of microbial taxonomic and functional diversities across management boundaries in a boreal podzol
Land capability classification describes boreal podzols as soils with severe to moderately severe
limitations that restrict the capability of the land to produce crops. Nevertheless, they are used
for crop production and it is predicted that more boreal podzols will be converted from forestry
use to agricultural uses. This usually requires intensive conservation and fertility improvement
practices aimed at correcting the excessively low pH and improving soil carbon parameters.
Under such management, it is expected that the biotic parameters and drivers of soil fertility
would be drastically affected. It is hypothesized that mass and energy fluxes across the edge of a
cropped field, between natural and managed conditions of soil, will alter the diversity of
microbial populations and their fertility relevant functions.
To verify this, I surveyed a cropped field and its immediate surrounding areas, located within a
Boreal Forest Ecosystem in Western Newfoundland. The surrounding areas, outside the four
field edges covered four distinct non-cropped conditions, i.e. forested, wetland, grassland and
grassed farm road border. Bacterial taxonomic diversity was assessed on a 16S rRNA dataset obtained
through Illumina MiSeq PE 250 bp amplicon sequencing of the V4 hypervariable region.
Fungal taxonomic diversity was assessed on an ITS dataset obtained through Illumina MiSeq
PE 250 bp amplicon sequencing of the ITS1-2 region. Predictive functional profiling of the
bacterial community, based on the 16S rRNA results (PICRUSt), was then carried out. Results are
contextualized by standard abiotic soil parameters and compared to potential nitrogen mineralization rates
along a management intensity gradient, i.e. a gradient crossing from natural to cropped conditions. Both
surface and subsurface layers were considered. Standard and exploratory statistics were carried out and
included an analysis of ecological indicators for population diversity. Statistical analysis was carried out
separately on soil physicochemical properties, microbial taxonomic diversity, and microbial functional
diversity. Correlational analyses between microbial diversity and physicochemical properties were
carried out separately. It was found that, while the natural conditions tested had distinct diversities, the
results became increasingly similar towards the field centre, away from the natural edge. Thus, land
management affects the taxonomic and functional diversity of microorganisms. It was also found
that the shift in taxonomic and functional diversity is directly related to the distance from the
natural areas.
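The abstract mentions "ecological indicators for population diversity" without naming them. As an illustration only, the sketch below assumes two widely used indicators, the Shannon index H' = -Σ p_i ln p_i and Pielou's evenness J = H' / ln S, computed from per-taxon read counts in one sample. The function names and the example counts are hypothetical and are not taken from the thesis.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total          # relative abundance of this taxon
            h -= p * math.log(p)
    return h


def pielou_evenness(counts):
    """Pielou's evenness J = H' / ln(S), where S is the number of observed taxa."""
    richness = sum(1 for c in counts if c > 0)
    if richness < 2:
        return 0.0                 # evenness is undefined for fewer than two taxa
    return shannon_index(counts) / math.log(richness)


# Hypothetical read counts per taxon for two samples.
even_sample = [10, 10, 10, 10]     # four equally abundant taxa
skewed_sample = [37, 1, 1, 1]      # one dominant taxon

h_even = shannon_index(even_sample)
h_skewed = shannon_index(skewed_sample)
```

Comparing such per-sample values along the transect from the natural edge to the field centre is one simple way to express the kind of gradient the abstract describes.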
Rapid detection of microbiota cell type diversity using machine-learned classification of flow cytometry data.
The study of complex microbial communities typically entails high-throughput sequencing and downstream bioinformatics analyses. Here we expand and accelerate microbiota analysis by enabling cell type diversity quantification from multidimensional flow cytometry data using a supervised machine learning algorithm of standard cell type recognition (CellCognize). As a proof-of-concept, we trained neural networks with 32 microbial cell and bead standards. The resulting classifiers were extensively validated in silico on known microbiota, showing on average 80% prediction accuracy. Furthermore, the classifiers could detect shifts in microbial communities of unknown composition upon chemical amendment, comparable to results from 16S-rRNA-amplicon analysis. CellCognize was also able to quantify population growth and estimate total community biomass productivity, providing estimates similar to those from ¹⁴C-substrate incorporation. CellCognize complements current sequencing-based methods by enabling rapid routine cell diversity analysis. The pipeline is suitable to optimize cell recognition for recurring microbiota types, such as in human health or engineered systems.
Collective analysis of multiple high-throughput gene expression datasets
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University London. Modern technologies have resulted in the production of numerous high-throughput biological datasets. However, the development of capable computational methods has not kept pace with the generation of new high-throughput datasets. Amongst the most popular biological high-throughput datasets are gene expression datasets (e.g. microarray datasets). This work targets this aspect by proposing a suite of computational methods which can analyse multiple gene expression datasets collectively. The focal method in this suite is the unification of clustering results from multiple datasets using external specifications (UNCLES). This method applies clustering separately to multiple heterogeneous datasets which measure the expression of the same set of genes, and then combines the resulting partitions in accordance with one of two types of external specifications: type A identifies the subsets of genes that are consistently co-expressed in all of the given datasets, while type B identifies the subsets of genes that are consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets. This contributes to the types of questions which can be addressed by computational methods because existing clustering, consensus clustering, and biclustering methods are inapplicable to the aforementioned objectives. Moreover, in order to assist in setting some of the parameters required by UNCLES, the M-N scatter plots technique is proposed. These methods, and less mature versions of them, have been validated and applied to numerous real datasets from the biological contexts of budding yeast, bacteria, human red blood cells, and malaria. While collaborating with biologists, these applications have led to various biological insights.
In yeast, the role of the poorly-understood gene CMR1 in the yeast cell-cycle has been further elucidated. Also, a novel subset of poorly understood yeast genes has been discovered with an expression profile consistently negatively correlated with the well-known ribosome biogenesis genes. Bacterial data analysis has identified two clusters of negatively correlated genes. Analysis of data from human red blood cells has produced some hypotheses regarding the regulation of the pathways producing such cells. On the other hand, malarial data analysis is still at a preliminary stage. Taken together, this thesis provides an original integrative suite of computational methods which scrutinise multiple gene expression datasets collectively to address previously unresolved questions, and provides the results and findings of many applications of these methods to real biological datasets from multiple contexts. National Institute for Health Research (NIHR) and the Brunel College of Engineering, Design and Physical Science
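To make the "type A" objective described above more concrete, here is a deliberately simplified sketch. It is not the UNCLES algorithm, whose details the abstract does not give. It assumes each dataset has already been clustered into a hard partition (a gene-to-label mapping) and reports the groups of genes that share a cluster in every dataset. The function name, the `min_size` parameter, and the toy gene labels are hypothetical.

```python
from collections import defaultdict


def consistent_coclusters(partitions, min_size=2):
    """Return gene groups that fall into the same cluster in every partition.

    `partitions` is a list of dicts mapping gene -> cluster label, one per dataset.
    Cluster labels only need to be comparable within a dataset, not across datasets.
    """
    # Only genes measured in all datasets can be compared.
    genes = set(partitions[0])
    for partition in partitions[1:]:
        genes &= set(partition)

    # Genes with an identical tuple of labels were co-clustered everywhere.
    groups = defaultdict(list)
    for gene in sorted(genes):
        signature = tuple(partition[gene] for partition in partitions)
        groups[signature].append(gene)

    return sorted(group for group in groups.values() if len(group) >= min_size)


# Hypothetical clustering results for the same four genes in two datasets.
dataset_1 = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
dataset_2 = {"g1": "a", "g2": "a", "g3": "a", "g4": "b"}

stable_groups = consistent_coclusters([dataset_1, dataset_2])
```

Here only g1 and g2 stay together in both datasets. This strict intersection is brittle when individual clusterings are noisy, which is presumably one reason a dedicated consensus method is needed. The type B objective (co-expressed in some datasets, poorly co-expressed in others) is not covered by this sketch.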