147 research outputs found

Genome assembly forensics: finding the elusive mis-assembly

Author: Phillippy Adam M
Pop Mihai
Schatz Michael C
Publication venue: BioMed Central
Publication date: 01/01/2008

A collection of software tools is combined for the first time in an automated pipeline for detecting large-scale genome assembly errors and for validating genome assemblies.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

The rise of a digital immune system

Author: Phillippy Adam M
Schatz Michael C
Publication venue: Springer Nature
Publication date: 01/07/2012

Driven by million-fold improvements in biotechnology, biology is increasingly shifting towards high-resolution, quantitative approaches to study the molecular dynamics of entire populations. One exciting application enabled by this new era of biology is the “digital immune system”. It would work in much the same way as an adaptive, biological immune system: by observing the microbial landscape, detecting potential threats, and neutralizing them before they spread beyond control. With the potential to have an enormous impact on public health, it is time to integrate the necessary biotechnology, computational, and organizational systems to seed the development of a global, sequencing-based pathogen surveillance system. https://doi.org/10.1186/2047-217X-1-

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Repository at the University of Maryland

The evolution of the natural killer complex; a comparison between mammals using new high-quality genome assemblies and targeted annotation.

Author: Bickhart Derek M
Gibson Mark S
Hammond John A
Heimeier Dorothea
Koren Sergey
Medrano Juan F
Phillippy Adam M
Schwartz John C
Smith Timothy PL
Publication venue: eScholarship, University of California
Publication date: 01/01/2017

Natural killer (NK) cells are a diverse population of lymphocytes with a range of biological roles including essential immune functions. NK cell diversity is in part created by the differential expression of cell surface receptors which modulate activation and function, including multiple subfamilies of C-type lectin receptors encoded within the NK complex (NKC). Little is known about the gene content of the NKC beyond rodent and primate lineages, other than it appears to be extremely variable between mammalian groups. We compared the NKC structure between mammalian species using new high-quality draft genome assemblies for cattle and goat; re-annotated sheep, pig, and horse genome assemblies; and the published human, rat, and mouse lemur NKC. The major NKC genes are largely in the equivalent positions in all eight species, with significant independent expansions and deletions between species, allowing us to propose a model for NKC evolution during mammalian radiation. The ruminant species, cattle and goats, have independently evolved a second KLRC locus flanked by KLRA and KLRJ, and a novel KLRH-like gene has acquired an activating tail. This novel gene has duplicated several times within cattle, while other activating receptor genes have been selectively disrupted. Targeted genome enrichment in cattle identified varying levels of allelic polymorphism between the NKC genes concentrated in the predicted extracellular ligand-binding domains. This novel recombination and allelic polymorphism is consistent with NKC evolution under balancing selection, suggesting that this diversity influences individual immune responses and may impact on differential outcomes of pathogen infection and vaccination

Springer - Publisher Connector

PubMed Central

Repositório da Universidade Nova de Lisboa

eScholarship - University of California

Probing the pan-genome of Listeria monocytogenes: new insights into intraspecific niche expansion and genomic diversification

Author: Deng Xiangyu
Li Zengxin
Phillippy Adam M
Salzberg Steven L
Zhang Wei
Publication venue: Springer Nature
Publication date: 01/01/2010

Bacterial pathogens often show significant intraspecific variations in ecological fitness, host preference and pathogenic potential to cause infectious disease. The species of Listeria monocytogenes, a facultative intracellular pathogen and the causative agent of human listeriosis, consists of at least three distinct genetic lineages. Two of these lineages predominantly cause human sporadic and epidemic infections, whereas the third lineage has never been implicated in human disease outbreaks despite its overall conservation of many known virulence factors. Here we compare the genomes of 26 L. monocytogenes strains representing the three lineages based on both in silico comparative genomic analysis and high-density, pan-genomic DNA array hybridizations. We uncover 86 genes and 8 small regulatory RNAs that likely make L. monocytogenes lineages differ in carbohydrate utilization and stress resistance during their residence in natural habitats and passage through the host gastrointestinal tract. We also identify 2,330 to 2,456 core genes that define this species along with an open pan-genome pool that contains more than 4,052 genes. Phylogenomic reconstructions based on 3,560 homologous groups allowed robust estimation of phylogenetic relatedness among L. monocytogenes strains. Our pan-genome approach enables accurate co-analysis of DNA sequence and hybridization array data for both core gene estimation and phylogenomics. Application of our method to the pan-genome of L. monocytogenes sheds new insights into the intraspecific niche expansion and evolution of this important foodborne pathogen. https://doi.org/10.1186/1471-2164-11-50

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

Automated ensemble assembly and validation of microbial genomes

Author: Hill Christopher M
Koren Sergey
Phillippy Adam M
Pop Mihai
Treangen Todd J
Publication venue: Springer Nature
Publication date: 01/01/2014

The continued democratization of DNA sequencing has sparked a new wave of development of genome assembly and assembly validation methods. As individual research labs, rather than centralized centers, begin to sequence the majority of new genomes, it is important to establish best practices for genome assembly. However, recent evaluations such as GAGE and the Assemblathon have concluded that there is no single best approach to genome assembly. Instead, it is preferable to generate multiple assemblies and validate them to determine which is most useful for the desired analysis; this is a labor-intensive process that is often impossible or unfeasible. To encourage best practices supported by the community, we present iMetAMOS, an automated ensemble assembly pipeline; iMetAMOS encapsulates the process of running, validating, and selecting a single assembly from multiple assemblies. iMetAMOS packages several leading open-source tools into a single binary that automates parameter selection and execution of multiple assemblers, scores the resulting assemblies based on multiple validation metrics, and annotates the assemblies for genes and contaminants. We demonstrate the utility of the ensemble process on 225 previously unassembled Mycobacterium tuberculosis genomes as well as a Rhodobacter sphaeroides benchmark dataset. On these real data, iMetAMOS reliably produces validated assemblies and identifies potential contamination without user intervention. In addition, intelligent parameter selection produces assemblies of R. sphaeroides comparable to or exceeding the quality of those from the GAGE-B evaluation, affecting the relative ranking of some assemblers. Ensemble assembly with iMetAMOS provides users with multiple, validated assemblies for each genome. Although computationally limited to small or mid-sized genomes, this approach is the most effective and reproducible means for generating high-quality assemblies and enables users to select an assembly best tailored to their specific needs. https://doi.org/10.1186/1471-2105-15-12

Crossref

Springer - Publisher Connector

PubMed Central

Digital Repository at the University of Maryland

Assemblathon 1: A competitive assessment of de novo short read assembly methods

Author: Chapman Jarrod A.
Earl Dent A.
et al.
Ho Isaac Y.
Huang Xiaoqiu
Koren Sergey
Phillippy Adam M.
Rokhsar Daniel S.
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2011

Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/

Digital Repository @ Iowa State University (ISU)

High-quality metagenome assembly from long accurate reads with metaMDBG

Author: Benoit Gaëtan
Chikhi Rayan
James Robert
Phillippy Adam M.
Quince Christopher
Raguideau Sébastien
Publication date: 01/09/2024

We introduce metaMDBG, a metagenomics assembler for PacBio HiFi reads. MetaMDBG combines a de Bruijn graph assembly in a minimizer space with an iterative assembly over sequences of minimizers to address variations in genome coverage depth and an abundance-based filtering strategy to simplify strain complexity. For complex communities, we obtained up to twice as many high-quality circularized prokaryotic metagenome-assembled genomes as existing methods and had better recovery of viruses and plasmids.

University of East Anglia digital repository

Complex Microbiome Underlying Secondary and Primary Metabolism in the Tunicate-Prochloron Symbiosis

Author: Cox James
Donia Mohamed S.
Elshahawi Sherif
Fricke W. Florian
Haygood Martha G.
Partensky Frédéric
Phillippy Adam M.
Piel Joern
Ravel Jacques
Schatz Michael C.
Schmidt Eric W.
White James R.
Publication venue: Chapman University Digital Commons
Publication date: 20/12/2011

The relationship between tunicates and the uncultivated cyanobacterium Prochloron didemni has long provided a model symbiosis. P. didemni is required for survival of animals such as Lissoclinum patella and also makes secondary metabolites of pharmaceutical interest. Here, we present the metagenomes, chemistry, and microbiomes of four related L. patella tunicate samples from a wide geographical range of the tropical Pacific. The remarkably similar P. didemni genomes are the most complex so far assembled from uncultivated organisms. Although P. didemni has not been stably cultivated and comprises a single strain in each sample, a complete set of metabolic genes indicates that the bacteria are likely capable of reproducing outside the host. The sequences reveal notable peculiarities of the photosynthetic apparatus and explain the basis of nutrient exchange underlying the symbiosis. P. didemni likely profoundly influences the lipid composition of the animals by synthesizing sterols and an unusual lipid with biofuel potential. In addition, L. patella also harbors a great variety of other bacterial groups that contribute nutritional and secondary metabolic products to the symbiosis. These bacteria possess an enormous genetic potential to synthesize new secondary metabolites. For example, an antitumor candidate molecule, patellazole, is not encoded in the genome of Prochloron and was linked to other bacteria from the microbiome. This study unveils the complex L. patella microbiome and its impact on primary and secondary metabolism, revealing a remarkable versatility in creating and exchanging small molecules

Chapman University Digital Commons

Balancing openness with Indigenous data sovereignty: An opportunity to leave no one behind in the journey to sequence all of life

Author: Anderson Jane
Anderson Matthew Z.
Mc Cartney Ann M.
Cook-Deegan Robert
Geary Janis
Hudson Maui
Liggins Libby
Patel Hardip R.
Phillippy Adam M.
TeAika Ben
Publication venue: Proceedings of the National Academy of Sciences
Publication date: 01/01/2022

The field of genomics has benefited greatly from its "openness" approach to data sharing. However, with the increasing volume of sequence information being created and stored and the growing number of international genomics efforts, the equity of openness is under question. The United Nations Convention of Biodiversity aims to develop and adopt a standard policy on access and benefit-sharing for sequence information across signatory parties. This standardization will have profound implications on genomics research, requiring a new definition of open data sharing. The redefinition of openness is not unwarranted, as its limitations have unintentionally introduced barriers of engagement to some, including Indigenous Peoples. This commentary provides an insight into the key challenges of openness faced by the researchers who aspire to protect and conserve global biodiversity, including Indigenous flora and fauna, and presents immediate, practical solutions that, if implemented, will equip the genomics community with both the diversity and inclusivity required to respectfully protect global biodiversity

Massey Research Online

Research Commons@Waikato

PubMed Central

De novo assembly of the cattle reference genome with single-molecule sequencing.

Author: Bickhart Derek M
Cole John B
Couldrey Christine
Dreischer Christian
Elsik Christine G
Ghurye Jay
Hagen Darren E
Hall Richard
Hammond John A
Hoffman Jinna
Koren Sergey
Li Wenli
Liu George
Low Wai Y
McDaneld Tara G
McKay Stephanie D
Medrano Juan F
Murdoch Brenda M
Nandolo Wilson
Phillippy Adam M
Rhie Arang
Rosen Benjamin D
Rowan Troy N
Schnabel Robert D
Schroeder Steven G
Schultheiss Sebastian J
Schwartz John C
Smith Timothy PL
Snelling Warren M
Thibaud-Nissen Françoise
Tseng Elizabeth
Van Tassell Curtis P
Zimin Aleksey
Publication venue: eScholarship, University of California
Publication date: 01/03/2020

Background: Major advances in selection progress for cattle have been made following the introduction of genomic tools over the past 10-12 years. These tools depend upon the Bos taurus reference genome (UMD3.1.1), which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies. Results: We present the new reference genome for cattle, ARS-UCD1.2, based on the same animal as the original to facilitate transfer and interpretation of results obtained from the earlier version, but applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly includes 2.7 Gb and is >250× more continuous than the original assembly, with contig N50 >25 Mb and L50 of 32. We also greatly expanded supporting RNA-based data for annotation that identifies 30,396 total genes (21,039 protein coding). The new reference assembly is accessible in annotated form for public use. Conclusions: We demonstrate that improved continuity of assembled sequence warrants the adoption of ARS-UCD1.2 as the new cattle reference genome and that increased assembly accuracy will benefit future research on this species.

eScholarship - University of California