
159 research outputs found

Events Recognition System for Water Treatment Works

Author: Riss G
Publication venue: College of Engineering, Mathematics and Physical Sciences
Publication date: 11/05/2020

The supply of drinking water in sufficient quantity and of the required quality is a challenging task for water companies. Tackling this task successfully depends largely on ensuring a continuously high level of water treatment quality at Water Treatment Works (WTWs). Therefore, processes at WTWs are highly automated and controlled. Reliable and rapid detection of faulty sensor data and failure events in WTW processes is of prime importance for their efficient and effective operation. Therefore, the vast majority of WTWs operated in the UK make use of event detection systems that automatically generate alarms after detecting abnormal behaviour in observed signals, to ensure early detection of WTW process failures. Event detection systems usually deployed at WTWs apply thresholds to the monitored signals to recognise faulty WTW processes. The research work described in this thesis investigates new methods for near real-time event detection at WTWs, implementing statistical process control and machine learning techniques for automated near real-time recognition of failure events in WTW processes. The resulting novel Hybrid CUSUM Event Recognition System (HC-ERS) makes use of new online sensor data validation and pre-processing techniques and utilises two distinct detection methodologies: the first for fault detection on individual signals and the second for the recognition of faulty processes and events at WTWs. The fault detection methodology automatically detects abnormal behaviour of observed water quality parameters in near real-time using data from the corresponding sensors that has been validated and pre-processed online. The methodology utilises CUSUM control charts to predict the presence of faults by tracking the variation of each signal individually to identify abnormal shifts in its mean. The basic CUSUM methodology was refined by investigating optimised interdependent parameters for each signal individually.
The combined predictions of CUSUM fault detection on individual signals serve as the basis for the second event detection methodology. This second methodology automatically identifies faults in WTW processes, that is, failure events at WTWs, in near real-time, utilising the faults previously detected by CUSUM fault detection on individual signals. The method applies Random Forest classifiers to predict the presence of an event in WTW processes. All methods have been developed to be generic and to generalise well across different drinking water treatment processes at WTWs. HC-ERS has proved effective in detecting failure events at WTWs, as demonstrated by its application to real water quality signal data with historical events from UK WTWs. The methodology achieved a peak F1 value of 0.84 and generated 0.3 false alarms per week. These results demonstrate the ability of the method to automatically and reliably detect failure events in WTW processes in near real-time and also show promise for practical application of the HC-ERS in industry. The combination of both methodologies presents a unique contribution to the field of near real-time event detection at WTWs.
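
As an illustration of the CUSUM control chart idea mentioned in the abstract above, the following is a minimal sketch of a textbook two-sided tabular CUSUM in Python. It is not the HC-ERS implementation: the function name `tabular_cusum` and the parameters `target`, `slack` and `threshold` are assumptions chosen for this example, and the thesis's per-signal parameter optimisation, sensor validation and Random Forest stage are not reproduced.

```python
from typing import List, Sequence


def tabular_cusum(signal: Sequence[float], target: float,
                  slack: float, threshold: float) -> List[int]:
    """Return the indices at which a two-sided tabular CUSUM raises an alarm.

    target    -- assumed in-control mean of the signal
    slack     -- allowance subtracted at each step (small deviations are ignored)
    threshold -- decision limit on the cumulative sums
    """
    upper = 0.0  # accumulates evidence of an upward shift in the mean
    lower = 0.0  # accumulates evidence of a downward shift in the mean
    alarms: List[int] = []
    for i, x in enumerate(signal):
        upper = max(0.0, upper + (x - target - slack))
        lower = max(0.0, lower + (target - x - slack))
        if upper > threshold or lower > threshold:
            alarms.append(i)
            # Restart both sums after an alarm (one common convention).
            upper = 0.0
            lower = 0.0
    return alarms


if __name__ == "__main__":
    # Hypothetical signal: stable at 10.0, then a step change to 12.0.
    readings = [10.0] * 20 + [12.0] * 10
    print(tabular_cusum(readings, target=10.0, slack=0.5, threshold=4.0))
```

In this sketch the slack value sets how large a deviation from the target must be before it starts to accumulate, and the threshold trades detection delay against false alarms; both would need tuning for each signal.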

Open Research Exeter

Vol. 11, No. 2 (Full Issue)

Author: Editors JMASM
Publication venue: DigitalCommons@WayneState
Publication date: 01/11/2012

Digital Commons@Wayne State University

Small-Sample Analysis and Inference of Networked Dependency Structures from Complex Genomic Data

Author: Schaefer Juliane
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 16/03/2006

This thesis is concerned with the statistical modelling and inference of genetic networks. Association structures and mutual influences are an important topic in systems biology. Gene expression data are high-dimensional while sample sizes are small ("small n, large p"). Analysing interaction structures with graphical models is therefore an ill-posed (inverse) problem whose solution requires regularization methods. I propose novel estimators for covariance structures and (partial) correlations. These are based either on resampling procedures or on shrinkage for variance reduction. In the latter method, the optimal shrinkage intensity is computed analytically. Compared with the classical sample covariance matrix, this estimator in particular has desirable properties in terms of increased efficiency and smaller mean squared error. In addition, the resulting parameter estimates are always positive definite and well conditioned. To determine the network topology, the concept of graphical Gaussian models is used, which can represent both marginal and conditional independencies. A model selection method is presented that is based on a multiple testing procedure with control of the false discovery rate, in which the underlying null distribution is estimated adaptively. The proposed framework is computationally efficient and performs very well in comparison with competing approaches, both in simulations and in application to molecular data.
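
To make the idea of false discovery rate control in the abstract above concrete, here is a minimal sketch of the classical Benjamini-Hochberg step-up procedure in Python. This is a swapped-in standard technique used only for illustration: the abstract does not say which FDR procedure the thesis uses, and the adaptive estimation of the null distribution it describes is not reproduced here. The function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Return, for each p-value, whether its hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: remember the largest rank k with p_(k) <= q * k / m.
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values, e.g. one per candidate network edge.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, q=0.05))
```

In a network setting, each p-value would correspond to a test of whether one edge (for example, one partial correlation) is present, and the rejected hypotheses would form the selected graph.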

Digitale Hochschulschriften der LMU

Bi-Directional Testing for Change Point Detection in Poisson Processes

Author: Bhaduri Moinak
Publication venue: Digital Scholarship@UNLV
Publication date: 15/05/2018

Point processes often serve as a natural language to chronicle an event's temporal evolution, and significant changes in the flow, synonymous with non-stationarity, are usually triggered by assignable and frequently preventable causes, often heralding devastating ramifications. Examples include amplified restlessness of a volcano, increased frequencies of airplane crashes, hurricanes, mining mishaps, among others. Guessing these time points of changes, therefore, merits utmost care. Switching the way time traditionally propagates, we posit a new genre of bidirectional tests which, despite a frugal construct, prove to be exceedingly efficient in culling out non-stationarity under a wide spectrum of environments. A journey surveying a lavish class of intensities, ranging from the tralatitious power laws to the deucedly germane rough steps, tracks the established unidirectional forward and backward test's evolution into a p-value induced dual bidirectional test, the best member of the proffered category. Niched within a hospitable Poissonian framework, this dissertation, through a prudent harnessing of the bidirectional category's classification prowess, incites a refreshing alternative to estimating changes plaguing a soporific flow, by conducting a sequence of tests. Validation tools, predominantly graphical, rid the structure of forbidding technicalities, aggrandizing the swath of applicability. Extensive simulations, conducted especially under hostile premises of hard non-stationarity detection, document minimal estimation error and reveal the algorithm's obstinate versatility at its most unerring

University of Nevada, Las Vegas Repository

Identification of genetic drivers of colorectal cancer via bioinformatics and machine learning

Author: Camacho João Pedro Marques
Publication venue
Publication date: 01/12/2022

Machine learning methods have been widely used in a range of areas within genetics and genomics. Machine learning is perhaps one of the most useful tools for interpreting large genomic data sets and has been used to annotate and analyse a wide variety of genomic sequence elements, owing to its ability to learn how to extract insights from large heterogeneous data sets. In this work, we mainly focus on identifying gene markers that are associated with an increased risk of colorectal cancer (CRC), one of the most common cancers worldwide, with one of the highest mortality rates. We look into feature selection methods based on variant relevancy toward the development of hereditary diseases. With this approach, we aim to find relevant frequently occurring variants as well as rare variant occurrences, and in this way to identify potentially valuable disease biomarkers. We analysed 8339 different variants and determined 765 to be relevant to CRC. We will also use feature clustering methods to identify co-occurrence between certain genetic variants; this will allow us to identify genetic links and non-co-occurring variants that are both rare and associated with an increased risk of developing CRC. Using this method, we determine 4 different co-occurring variant groups, with an additional one composed of independent variants. We expect the identification of these gene markers to allow for better clinical management of patients, namely through the identification of genetic predispositions to CRC that will allow for a better risk assessment and change the type and frequency of the exams to be performed. This will have a strong impact not only on patients' clinical screening but also on that of their family members, and can allow for early identification of tumours or even benign lesions, thereby contributing to CRC prevention.
We believe that this study will contribute to the overall understanding of CRC causes and will further advance the study of its prevention. We also expect to give insights on how to identify the biological mechanisms underlying gene variant occurrences for not only CRC but also other hereditary cancer syndromes.

Repositório da Universidade Nova de Lisboa

From tools and databases to clinically relevant applications in miRNA research

Author: Fehlmann Tobias
Publication venue: Saarländische Universitäts- und Landesbibliothek
Publication date: 01/01/2021

While early research in particular focused on the small portion of the human genome that encodes proteins, it became apparent that molecules responsible for many key functions were also encoded in the remaining regions. Originally, non-coding RNAs, i.e., molecules that are not translated into proteins, were thought to be composed of only two classes (ribosomal RNAs and transfer RNAs). However, starting from the early 1980s many other non-coding RNA classes were discovered. In the past two decades, small non-coding RNAs (sncRNAs), and in particular microRNAs (miRNAs), have become essential molecules in biological and biomedical research. In this thesis, five aspects of miRNA research have been addressed. Starting from the development of advanced computational software to analyze miRNA data (1), an in-depth understanding of human and non-human miRNAs was generated and databases hosting this knowledge were created (2). In addition, the effects of technological advances were evaluated (3). We also contributed to the understanding of how miRNAs act in an orchestrated manner to target human genes (4). Finally, based on the insights gained from the tools and resources of the mentioned aspects, we evaluated the suitability of miRNAs as biomarkers (5). With the establishment of next-generation sequencing, the primary goal of this thesis was the creation of an advanced bioinformatics analysis pipeline for high-throughput miRNA sequencing data, primarily focused on humans. Consequently, miRMaster, a web-based software solution to analyze hundreds of sequencing samples within a few hours, was implemented. The tool was implemented in a way that it could support different sequencing technologies and library preparation techniques. This flexibility allowed miRMaster to build a substantial user base, resulting in over 120,000 processed samples and 1.5 billion processed reads as of July 2021, and therefore laid the basis for the second goal of this thesis.
Indeed, the implementation of a feature allowing users to share their uploaded data contributed strongly to the generation of a detailed annotation of the human small non-coding transcriptome. This annotation was integrated into a new miRNA database, miRCarta, modelling thousands of miRNA candidates and corresponding read expression profiles. A subset of these candidates was then evaluated in the context of different diseases and validated. The thereby gained knowledge was subsequently used to validate additional miRNA candidates and to generate an estimate of the number of miRNAs in human. The large collection of samples, gathered over many years with miRMaster was also integrated into a web server evaluating miRNA arm shifts and switches, miRSwitch. Finally, we published an updated version of miRMaster, expanding its scope to other species and adding additional downstream analysis capabilities. The second goal of this thesis was further pursued by investigating the distribution of miRNAs across different human tissues and body fluids, as well as the variability of miRNA profiles over the four seasons of the year. Furthermore, small non-coding RNAs in zoo animals were examined and a tissue atlas of small non-coding RNAs for mice was generated. The third goal, the assessment of technological advances, was addressed by evaluating the new combinatorial probe-anchor synthesis-based sequencing technology published by BGI, analyzing the effect of RNA integrity on sequencing data, analyzing low-input library preparation protocols, and comparing template-switch based library preparation protocols to ligation-based ones. In addition, an antibody-based labeling sequencing chemistry, CoolMPS, was investigated. Deriving an understanding of the orchestrated regulation by miRNAs, the fourth goal of this thesis, was pursued in a first step by the implementation of a web server visualizing miRNA-gene interaction networks, miRTargetLink. 
Subsequently, miRPathDB, a database incorporating pathways affected by miRNAs and their targets, was implemented, as well as miEAA 2.0, a web server offering quick miRNA set enrichment analyses in over 130,000 categories spanning 10 different species. In addition, miRSNPdb, a database evaluating the effects of single nucleotide polymorphisms and variants in miRNAs or in their target genes, was created. Finally, the fifth goal of the thesis, the evaluation of the suitability of miRNAs as biomarkers for human diseases, was tackled by investigating the expression profiles of miRNAs with machine learning. An Alzheimer's disease cohort with over 400 individuals was analyzed, as well as another neurodegenerative disease cohort with multiple time points of Parkinson's disease patients and healthy controls. Furthermore, a lung cancer cohort covering 3,000 individuals was examined to evaluate the suitability of an early detection test. In addition, we evaluated the expression profile changes induced by aging on a cohort of 1,334 healthy individuals and over 3,000 diseased patients. Altogether, the herein described tools, databases and research papers present valuable advances and insights into the miRNA research field and have been used and cited by the research community over 2,000 times as of July 2021.

Universaar

Quantitative Analysis of Proteome Dynamics in Chinese Hamster Ovary cells

Author: Florczak Beata D
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 01/06/2019

The overall goal of this research was to better understand the mechanisms underlying the physiology of CHO cells, the most important mammalian host for recombinant protein production. The publication of the complete genome of CHO cells allowed the use of mass-spectrometry-based proteomic tools to study protein expression. Among several different sample preparation methods for mass spectrometry, in-gel trypsin digest and FASP were found to be the most robust and optimal for high-coverage CHO proteome analysis. Global changes in protein expression between exponential and stationary phases were determined using SILAC for the parental GS K-O and producing E22 cell lines. More than 4000 proteins were quantified and more than 100 proteins were found to differ with statistical significance. Proteins up-regulated in the exponential phase control cell cycle and DNA replication, while proteins up-regulated in the stationary phase are involved in stress response and signalling, making them interesting targets for cellular engineering. In addition to quantifying relative changes in protein expression between the two phases of cell culture, more than 4000 protein copy numbers were calculated for the parental and producing cell lines using the TPA method. Protein turnover, described as the balance between protein synthesis and degradation, was calculated for more than 3000 cellular proteins. Combining these two parameters allowed the determination of the top 10 proteins, corresponding to 20% of the global turnover rate. Production of monoclonal antibody was the top priority, causing a metabolic burden on cells. KEGG and GO annotation suggests that 600 up-regulated proteins in the E22 producing cell line explained their clonal selection based on highest growth and productivity. Interestingly, no major differences were found in amino acid and codon usage between the parental and producing cell lines.
In summary, a large-scale proteomic data set containing qualitative, quantitative and dynamic information on protein expression for industrially relevant CHO cell lines

White Rose E-theses Online

Diving into the depth of primary motor cortex: a high-resolution investigation of the motor system using 7Tesla fMRI

Author: Amado Catarina Pereira
Publication venue: Faculdade de Ciências e Tecnologia
Publication date: 01/01/2014

Dissertation submitted for the degree of Master in Biomedical Engineering. Human behaviour is grounded in our ability to perform complex tasks. While human motor function has been studied for over a century, the cortical processes underlying motor behaviour are still under debate. Central to the execution of action is the primary motor cortex (M1), which has previously been considered to be responsible for the execution of movements planned in the premotor cortex, yet recent studies point to more complex roles for M1 in orchestrating motor-related information. The purpose of this project is to study the functional properties of the primary motor cortex using ultra-high-field fMRI. The spatial resolution made possible by using a high-field magnet allows us to investigate novel questions such as the existence of cortical columns, the functional organization pattern for single fingers and the functional involvement of M1 in motor imagery and observation. Thirteen young healthy subjects participated in this study. Functional and anatomical high-resolution images were acquired. Four functional scans were acquired for the different tasks: motor execution, motor imagery, movement observation and rest. The paradigm used was randomized finger tapping. Image analysis was performed with the Brainvoyager QX program. Using the novel high-resolution cortical grid sampling analysis tools, different cortical laminae of human M1 were examined. Our results reveal a distributed pattern (intermingled with somatotopic "hot spots") for single-finger activity in M1. Furthermore, we show novel evidence of columnar structures in M1 and show that non-motor tasks such as motor imagery and action observation also activate this region. We conclude that the primary motor cortex has far more complex and unexpected roles in the processing of movement-related information, not only because of its involvement in tasks that do not imply muscle movement, but also because of its intriguing organization pattern

Repositório da Universidade Nova de Lisboa

A communal catalogue reveals Earth’s multiscale microbial diversity

Author: Thompson L R
Sanders Jon G
McDonald Daniel
Amir Amnon
Ladau Joshua
Locey Kenneth J
Prill Robert J
Tripathi Anupriya
Gibbons Sean M
Ackermann Gail
Navas-Molina Jose A
Janssen Stefan
Kopylova Evguenia
Vázquez-Baeza Yoshiki
González Antonio
Morton James T
Mirarab Siavash
Zech Xu Zhenjiang
Jiang Lingjing
Haroon Mohamed F
Kanbar Jad
Zhu Qiyun
Jin Song Se
Kosciolek Tomasz
Bokulich Nicholas A
Lefler Joshua
Brislawn Colin J
Humphrey Gregory
Owens Sarah M
Hampton-Marcell Jarrad
Berg-Lyons Donna
McKenzie Valerie
Fierer Noah
Fuhrman Jed A
Clauset Aaron
Stevens Rick L
Shade Ashley
Pollard Katherine S
Goodwin Kelly D
Jansson Janet K
Gilbert Jack A
Knight Rob
Rivera Jose L Agosto
Al-Moosawi Lisa
Alverdy John
Amato Katherine R
Andras Jason
Angenent Largus T
Antonopoulos Dionysios A
Apprill Amy
Armitage David
Ballantine Kate
Bárta Jiří
Baum Julia K
Berry Allison
Bhatnagar Ashish
Bhatnagar Monica
Biddle Jennifer F
Bittner Lucie
Boldgiv Bazartseren
Bottos Eric
Boyer Donal M
Braun Josephine
Brazelton William
Brearley Francis Q
Campbell Alexandra H
Caporaso J Gregory
Cardona Cesar
Carroll JoLynn
Cary S Craig
Casper Brenda B
Charles Trevor C
Chu Haiyan
Claar Danielle C
Clark Robert G
Clayton Jonathan B
Clemente Jose C
Cochran Alyssa
Coleman Maureen L
Collins Gavin
Colwell Rita R
Contreras Mónica
Crary Benjamin B
Creer Simon
Cristol Daniel A
Crump Byron C
Cui Duoying
Daly Sarah E
Davalos Liliana
Dawson Russell D
Defazio Jennifer
Delsuc Frédéric
Dionisi Hebe M
Dominguez-Bello Maria Gloria
Dowell Robin
Dubinsky Eric A
Dunn Peter O
Ercolini Danilo
Espinoza Robert E
Ezenwa Vanessa
Fenner Nathalie
Findlay Helen S
Fleming Irma D
Fogliano Vincenzo
Forsman Anna
Freeman Chris
Friedman Elliot S
Galindo Giancarlo
Garcia Liza
Garcia-Amado Maria Alexandra
Garshelis David
Gasser Robin B
Gerdts Gunnar
Gibson Molly K
Gifford Isaac
Gill Ryan T
Giray Tugrul
Gittel Antje
Golyshin Peter
Gong Donglai
Grossart Hans-Peter
Guyton Kristina
Haig Sarah-Jane
Hale Vanessa
Hall Ross Stephen
Hallam Steven J
Handley Kim M
Hasan Nur A
Haydon Shane R
Hickman Jonathan E
Hidalgo Glida
Hofmockel Kirsten S
Hooker Jeff
Hulth Stefan
Hultman Jenni
Hyde Embriette
Ibáñez-Álamo Juan Diego
Jastrow Julie D
Jex Aaron R
Johnson L Scott
Johnston Eric R
Joseph Stephen
Jurburg Stephanie D
Jurelevicius Diogo
Karlsson Anders
Karlsson Roger
Kauppinen Seth
Kellogg Colleen T E
Kennedy Suzanne J
Kerkhof Lee J
King Gary M
Kling George W
Koehler Anson V
Krezalek Monika
Kueneman Jordan
Lamendella Regina
Landon Emily M
Lane-deGraaf Kelly
LaRoche Julie
Larsen Peter
Laverock Bonnie
Lax Simon
Lentino Miguel
Levin Iris I
Liancourt Pierre
Liang Wenju
Linz Alexandra M
Lipson David A
Liu Yongqin
Lladser Manuel E
Lozada Mariana
Spirito Catherine M
MacCormack Walter P
MacRae-Crerar Aurora
Magris Magda
Martín-Platero Antonio M
Martín-Vivaldi Manuel
Martínez L Margarita
Martínez-Bueno Manuel
Marzinelli Ezequiel M
Mason Olivia U
Mayer Gregory D
McDevitt-Irwin Jamie M
McDonald James E
McGuire Krista L
McMahon Katherine D
McMinds Ryan
Medina Mónica
Mendelson Joseph R
Metcalf Jessica L
Meyer Folker
Michelangeli Fabian
Miller Kim
Mills David A
Minich Jeremiah
Mocali Stefano
Moitinho-Silva Lucas
Moore Anni
Morgan-Kiss Rachael M
Munroe Paul
Myrold David
Neufeld Josh D
Ni Yingying
Nicol Graeme W
Nielsen Shaun
Nissimov Jozef I
Niu Kefeng
Nolan Matthew J
Noyce Karen
O’Brien Sarah L
Okamoto Noriko
Orlando Ludovic
Castellano Yadira Ortiz
Osuolale Olayinka
Oswald Wyatt
Parnell Jacob
Peralta-Sánchez Juan M
Petraitis Peter
Pfister Catherine
Pilon-Smits Elizabeth
Piombino Paola
Pointing Stephen B
Pollock F Joseph
Potter Caitlin
Prithiviraj Bharath
Quince Christopher
Rani Asha
Ranjan Ravi
Rao Subramanya
Rees Andrew P
Richardson Miles
Riebesell Ulf
Robinson Carol
Rockne Karl J
Rodriguezl Selena Marie
Rohwer Forest
Roundstone Wayne
Safran Rebecca J
Sangwan Naseer
Sanz Virginia
Schrenk Matthew
Schrenzel Mark D
Scott Nicole M
Seger Rita L
Seguin-Orlando Andaine
Seldin Lucy
Seyler Lauren M
Shakhsheer Baddr
Sheets Gabriela M
Shen Congcong
Shi Yu
Shin Hakdong
Shogan Benjamin D
Shutler Dave
Siegel Jeffrey
Simmons Steve
Sjöling Sara
Smith Daniel P
Soler Juan J
Sperling Martin
Steinberg Peter D
Stephens Brent
Stevens Melita A
Taghavi Safiyh
Tai Vera
Tait Karen
Tan Chia L
Tas Neslihan
Taylor D Lee
Thomas Torsten
Timling Ina
Turner Benjamin L
Urich Tim
Ursell Luke K
van der Lelie Daniel
Van Treuren William
van Zwieten Lukas
Vargas-Robles Daniela
Thurber Rebecca Vega
Vitaglione Paola
Walker Donald A
Walters William A
Wang Shi
Wang Tao
Weaver Tom
Webster Nicole S
Wehrle Beck
Weisenhorn Pamela
Weiss Sophie
Werner Jeffrey J
West Kristin
Whitehead Andrew
Whitehead Susan R
Whittingham Linda A
Willerslev Eske
Williams Allison E
Wood Stephen A
Woodhams Douglas C
Yang Yeqin
Zaneveld Jesse
Zarraonaindia Iratxe
Zhang Qikun
Zhao Hongxia
Publication venue: Springer Science and Business Media LLC
Publication date: 22/09/2000
Field of study

Our growing awareness of the microbial world’s importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth’s microbial diversity.

A communal catalogue reveals Earth's multiscale microbial diversity

Author: Ackermann G
Al-Moosawi L
Alverdy J
Amato KR
Amir A
Andras J
Angenent LT
Antonopoulos DA
Apprill A
Armitage D
Ballantine K
Baum JK
Bárta J
Berg-Lyons D
Berry A
Bhatnagar A
Bhatnagar M
Biddle J
Bittner L
Bokulich NA
Boldgiv B
Bottos E
Boyer DM
Braun J
Brazelton W
Brearley Q
Brislawn CJ
Campbell AH
Caporaso JG
Cardona C
Carroll J
Cary SC
Casper BB
Castellano YO
Charles TC
Chu H
Claar DC
Clark RG
Clauset A
Clayton JB
Clemente JC
Cochran A
Coleman ML
Collins G
Colwell RR
Contreras M
Crary BB
Creer S
Cristol DA
Crump BC
Cui D
Daly SE
Davalos L
Dawson RD
Defazio J
Delsuc F
Dionisi HM
Dominguez-Bello MG
Dowell R
Dubinsky EA
Dunn PO
Ercolini D
Espinoza RE
Ezenwa V
Fenner N
Fierer N
Findlay HS
Fleming ID
Fogliano V
Forsman A
Freeman C
Friedman ES
Fuhrman JA
Galindo G
Garcia L
Garcia-Amado MA
Garshelis D
Gasser RB
Gerdts G
Gibbons SM
Gibson MK
Gifford I
Gilbert JA
Gill RT
Giray T
Gittel A
Golyshin P
Gong D
González A
Goodwin KD
Grossart H-P
Guyton K
Haig S-J
Hale V
Hall RS
Hallam SJ
Hampton-Marcell J
Handley KM
Haroon MF
Hasan NA
Haydon SR
Hickman JE
Hidalgo G
Hofmockel KS
Hooker J
Hulth S
Hultman J
Humphrey G
Hyde E
Ibáñez-Álamo JD
Janssen S
Jansson JK
Jastrow JD
Jex AR
Jiang L
Jin Song S
Johnson LS
Johnston ER
Joseph S
Jurburg SD
Jurelevicius D
Kanbar J
Karlsson A
Karlsson R
Kauppinen S
Kellogg CTE
Kennedy SJ
Kerkhof LJ
King GM
Kling GW
Knight R
Koehler AV
Kopylova E
Kosciolek T
Krezalek M
Kueneman J
Ladau J
Lamendella R
Landon EM
Lane-deGraaf K
LaRoche J
Larsen P
Laverock B
Lax S
Lefler J
Lentino M
Levin II
Liancourt P
Liang W
Linz AM
Lipson DA
Liu Y
Lladser ME
Locey KJ
Lozada M
MacCormack WP
MacRae-Crerar A
Magris M
Martín-Platero AM
Martín-Vivaldi M
Martínez LM
Martínez-Bueno M
Marzinelli EM
Mason OU
Mayer GD
McDevitt-Irwin JM
McDonald D
McDonald JE
McGuire KL
McKenzie V
McMahon KD
McMinds R
Medina M
Mendelson JR
Metcalf JL
Meyer F
Michelangeli F
Miller K
Mills DA
Minich J
Mirarab S
Mocali S
Moitinho-Silva L
Moore A
Morgan-Kiss RM
Morton JT
Munroe P
Myrold D
Navas-Molina JA
Neufeld JD
Ni Y
Nicol GW
Nielsen S
Nissimov JI
Niu K
Nolan MJ
Noyce K
Okamoto N
Orlando L
Osuolale O
Oswald W
Owens SM
O’Brien SL
Parnell J
Peralta-Sánchez JM
Petraitis P
Pfister C
Pilon-Smits E
Piombino P
Pointing SB
Pollard KS
Pollock FJ
Potter C
Prill RJ
Prithiviraj B
Quince C
Rani A
Ranjan R
Rao S
Rees AP
Richardson M
Riebesell U
Rivera JLA
Robinson C
Rockne KJ
Rodriguezl SM
Rohwer F
Roundstone W
Safran RJ
Sanders JG
Sangwan N
Sanz V
Schrenk M
Schrenzel MD
Scott NM
Seger RL
Seguin-Orlando A
Seldin L
Seyler LM
Shade A
Shakhsheer B
Sheets GM
Shen C
Shi Y
Shin H
Shogan BD
Shutler D
Siegel J
Simmons S
Sjöling S
Smith DP
Soler JJ
Sperling M
Spirito CM
Steinberg PD
Stephens B
Stevens MA
Stevens RL
Taghavi S
Tai V
Tait K
Tan CL
Taş N
Taylor DL
Thomas T
Thompson LR
Thurber RV
Timling I
Tripathi A
Turner BL
Urich T
Ursell LK
van der Lelie D
Van Treuren W
van Zwieten L
Vargas-Robles D
Vitaglione P
Vázquez-Baeza Y
Walker DA
Walters WA
Wang S
Wang T
Weaver T
Webster NS
Wehrle B
Weisenhorn P
Weiss S
Werner JJ
West K
Whitehead A
Whitehead SR
Whittingham LA
Willerslev E
Williams AE
Wood SA
Woodhams DC
Yang Y
Zaneveld J
Zarraonaindia I
Zech Xu Z
Zhang Q
Zhao H
Zhu Q
Publication venue
Publication date: 01/01/2017
Field of study

Our growing awareness of the microbial world's importance and diversity contrasts starkly with our limited understanding of its fundamental structure. Despite recent advances in DNA sequencing, a lack of standardized protocols and common analytical frameworks impedes comparisons among studies, hindering the development of global inferences about microbial life on Earth. Here we present a meta-analysis of microbial community samples collected by hundreds of researchers for the Earth Microbiome Project. Coordinated protocols and new analytical methods, particularly the use of exact sequences instead of clustered operational taxonomic units, enable bacterial and archaeal ribosomal RNA gene sequences to be followed across multiple studies and allow us to explore patterns of diversity at an unprecedented scale. The result is both a reference database giving global context to DNA sequence data and a framework for incorporating data from future studies, fostering increasingly complete characterization of Earth's microbial diversity. Peer reviewed