
12 research outputs found

Phenotype Sequencing: Identifying the Genes That Cause a Phenotype Directly from Pooled Sequencing of Independent Mutants

Author: A Futschik
A Srivatsan
C Herring
C Honisch
C Lee
CE Bonferroni
Christopher J. Lee
D Lee
D Smith
E Jones
G Velicer
H Li
Iara M. P. Machado
J Barrick
J Cridland
J Klockgether
J Miller
J Ohnishi
James C. Liao
JL Cocchiaro
K Holt
K McKernan
Marc A. Harper
O Harismendy
P Chen
P Cock
Raphael Valdivia
S Atsumi
S Le Crom
Stanley F. Nelson
T Conrad
T Hanai
Traci Toy
Zugen Chen
Publication venue: Public Library of Science
Publication date: 18/02/2011

Random mutagenesis and phenotype screening provide a powerful method for dissecting microbial functions, but their results can be laborious to analyze experimentally. Each mutant strain may contain 50–100 random mutations, necessitating extensive functional experiments to determine which one causes the selected phenotype. To solve this problem, we propose a “Phenotype Sequencing” approach in which genes causing the phenotype can be identified directly from sequencing of multiple independent mutants. We developed a new computational analysis method showing that (1) causal genes can be identified with high probability from even a modest number of mutant genomes; (2) costs can be cut many-fold compared with a conventional genome sequencing approach via an optimized strategy of library-pooling (multiple strains per library) and tag-pooling (multiple tagged libraries per sequencing lane). We have performed extensive validation experiments on a set of E. coli mutants with increased isobutanol biofuel tolerance. We generated a range of sequencing experiments varying from 3 to 32 mutant strains, with pooling on 1 to 3 sequencing lanes. Our statistical analysis of these data (4099 mutations from 32 mutant genomes) successfully identified 3 genes (acrB, marC, acrA) that have been independently validated as causing this experimental phenotype. It must be emphasized that our approach reduces mutant sequencing costs enormously. Whereas a conventional genome sequencing experiment would have cost $7,200 in reagents alone, our Phenotype Sequencing design yielded the same information value for only $1,200. In fact, our smallest experiments reliably identified acrB and marC at a cost of only $110–$340.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Evaluation of Clonal Hematopoiesis in Pediatric ADA-SCID Gene Therapy Participants

Author: Carroll Judith E
Chang Vivian Y
Davila Alejandra
Fernandez Beatriz Campo
Kohn Donald B
Lee Thomas Domin
Polsky Lilian
Toy Traci
White Shanna L
Publication venue: eScholarship, University of California
Publication date: 03/08/2022

Autologous stem cell transplant with gene therapy (ASCT-GT) provides curative therapy while reducing pretransplant immune-suppressive conditioning and eliminating posttransplant immune suppression. Clonal hematopoiesis of indeterminate potential (CHIP)-associated mutations increase and telomere lengths (TLs) shorten with natural aging and DNA damaging processes. It is possible that, if CHIP is present before ASCT-GT or mutagenesis occurs after busulfan exposure, the hematopoietic stem cells carrying these somatic variants may survive the conditioning chemotherapy and have a selective reconstitution advantage, increasing the risk of hematologic malignancy and overall mortality. Seventy-four peripheral blood samples (ranging from baseline to 120 months after ASCT-GT) from 10 pediatric participants who underwent ASCT-GT for adenosine deaminase-deficient severe combined immune deficiency (ADA-SCID) after reduced-intensity conditioning with busulfan and 16 healthy controls were analyzed for TL and CHIP. One participant had a significant decrease in TL. There were no CHIP-associated mutations identified by the next-generation sequencing in any of the ADA-SCID participants. This suggests that further studies are needed to determine the utility of germline analyses in revealing the underlying genetic risk of malignancy in participants who undergo gene therapy. Although these results are promising, larger scale studies are needed to corroborate the effect of ASCT-GT on TL and CHIP. This trial was registered at www.clinicaltrials.gov as #NCT00794508

PubMed Central

eScholarship - University of California

Phenotype sequencing of 32 isobutanol tolerant E. coli strains (top 21 hits by raw SNP counts).

Author: Christopher J. Lee (255743)
Iara M. P. Machado (355850)
James C. Liao (184646)
Marc A. Harper (355848)
Stanley F. Nelson (33880)
Traci Toy (355849)
Zugen Chen (76247)

Phenotype sequencing of 32 isobutanol tolerant E. coli strains (top 21 hits by raw SNP counts).

FigShare

Schematic diagram of phenotype sequencing and key parameters.

Author: Christopher J. Lee (255743)
Iara M. P. Machado (355850)
James C. Liao (184646)
Marc A. Harper (355848)
Stanley F. Nelson (33880)
Traci Toy (355849)
Zugen Chen (76247)

Overview of phenotype sequencing stages: mutagenesis, screening, and sequencing. Conventional unpooled sequencing of individual strains (left) is contrasted with pooled sequencing of multiple strains per library (right), comparing the expected frequency of observation of a real mutation in each case.

FigShare

Modeled vs. experimental target gene yield as a function of increasing number of strains sequenced.

Author: Christopher J. Lee (255743)
Iara M. P. Machado (355850)
James C. Liao (184646)
Marc A. Harper (355848)
Stanley F. Nelson (33880)
Traci Toy (355849)
Zugen Chen (76247)

A. Bioinformatic model of expected yield for discovery of 3 target genes, as a function of increasing number of strains sequenced, plotted vs. experiment cost, assuming one lane of sequencing at a cost of $37.50 per sequenced strain. B. Experimentally measured target gene discovery yields as a function of number of strains sequenced, plotted vs. experiment cost. Each data point is the average of all sub-experiments containing that number of strains; the error bar gives the standard error for this average from that set of sub-experiments. Red line (inverted triangles): one lane of sequencing (32x coverage per library); blue line (+ signs): three lanes of sequencing (96x coverage per library, resulting in a total cost of $81.25 per strain).
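The per-strain costs in this caption can be turned into total experiment costs with simple arithmetic. The sketch below assumes that reagent cost scales linearly with the number of strains, which the caption does not state explicitly; the function name `experiment_cost` is illustrative only.

```python
# Per-strain reagent costs quoted in the caption (USD).
COST_PER_STRAIN_ONE_LANE = 37.50
COST_PER_STRAIN_THREE_LANES = 81.25


def experiment_cost(n_strains: int, cost_per_strain: float) -> float:
    """Total reagent cost, ASSUMING cost is linear in the number of strains."""
    return n_strains * cost_per_strain


# Strain counts at the two ends of the 3-to-32 range described for the study.
for n in (3, 32):
    one_lane = experiment_cost(n, COST_PER_STRAIN_ONE_LANE)
    three_lanes = experiment_cost(n, COST_PER_STRAIN_THREE_LANES)
    print(f"{n:2d} strains: one lane ${one_lane:,.2f}, three lanes ${three_lanes:,.2f}")
```

Under this assumption, 32 strains on one lane come to 32 × 37.50 = 1,200 dollars, which is consistent with the full-design cost quoted in the paper's abstract.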

FigShare

Effects of sequencing error and pooling on average target gene discovery yields.

Author: Christopher J. Lee (255743)
Iara M. P. Machado (355850)
James C. Liao (184646)
Marc A. Harper (355848)
Stanley F. Nelson (33880)
Traci Toy (355849)
Zugen Chen (76247)

A. The probability of reporting a SNP at a single site as a function of the mutation call threshold (read counts) assuming a coverage of c = 75, due either to sequencing error (red), or a real mutation (green), assuming a 1% sequencing error rate and a 25% true mutation fraction (i.e. library-pooling factor of P = 4). Circles indicate the expected mean read counts on each plot. B. The expected number of total mutation calls per genome as a function of the mutation call threshold, due either to sequencing error (red), or a real mutation (green), assuming a 4 Mb genome size. The dashed red line indicates the lowest mutation call threshold at which the number of false positive mutation calls falls below one. The dashed green line indicates the maximum mutation call threshold at which the number of false negatives remains less than one. C. The average number of true target genes discovered (at an FDR <0.67) as a function of the mutation call threshold, for different library-pooling levels P = 2 to P = 9, assuming sequencing of 80 mutant strains with a mutation density of 50 mutations per genome, and 20 true target genes.
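The quantities in panels A and B can be illustrated with a short calculation. This is a minimal sketch, assuming the variant read count at a site is binomially distributed with the caption's coverage, error rate, and mutant fraction; the binomial model and the helper names `prob_call` and `lowest_clean_threshold` are assumptions for illustration, not the authors' code. It also treats sequencing error as a single per-read event rather than splitting it across the three alternative bases.

```python
from math import comb

COVERAGE = 75            # reads per site (c in the caption)
ERROR_RATE = 0.01        # per-read sequencing error rate
MUTANT_FRACTION = 0.25   # 1/P for a library-pooling factor of P = 4
GENOME_SIZE = 4_000_000  # 4 Mb genome


def prob_call(threshold: int, coverage: int, p: float) -> float:
    """P(at least `threshold` of `coverage` reads show the variant), binomial model."""
    return sum(
        comb(coverage, k) * p**k * (1 - p) ** (coverage - k)
        for k in range(threshold, coverage + 1)
    )


def lowest_clean_threshold(coverage: int, error_rate: float, genome_size: int):
    """Smallest threshold whose expected false-positive calls per genome is below one."""
    for t in range(1, coverage + 1):
        if genome_size * prob_call(t, coverage, error_rate) < 1.0:
            return t
    return None


t = lowest_clean_threshold(COVERAGE, ERROR_RATE, GENOME_SIZE)
print("lowest threshold with < 1 expected false positive:", t)
print("probability a real pooled mutation passes it:",
      prob_call(t, COVERAGE, MUTANT_FRACTION))
```

Raising the threshold suppresses error-driven calls across the whole genome but also lowers the chance that a real mutation, present in only one of the pooled strains, is reported; that trade-off is what the two dashed lines in panel B bracket.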

FigShare

Effect of uniform vs. non-uniform gene size distributions on p-value scoring.

Author: Christopher J. Lee (255743)
Iara M. P. Machado (355850)
James C. Liao (184646)
Marc A. Harper (355848)
Stanley F. Nelson (33880)
Traci Toy (355849)
Zugen Chen (76247)

Uniform gene-size model (blue circles, dashed line); Variable gene-size model based on subdividing the E. coli gene size distribution into ten size classes, each containing 424 genes represented by the average size within that class (green + markers); Variable gene-size model based on the exact sizes of all 4244 E. coli genes (red line).
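To see why the gene-size model matters for p-value scoring, the sketch below compares a uniform and a size-aware null for one gene. It assumes a Poisson null in which mutations land in genes in proportion to gene length, followed by a Bonferroni correction over all genes; this is an illustrative stand-in, not necessarily the scoring function used in the paper. The gene count (4244) and mutation total (4099) come from the listing; the mean gene length, the example gene length, and the hit count are hypothetical values, and the sketch assumes for simplicity that all mutations fall within genes.

```python
from math import exp

N_GENES = 4244            # E. coli gene count from the caption
TOTAL_MUTATIONS = 4099    # mutations observed across the 32 genomes
MEAN_GENE_LEN = 1000      # HYPOTHETICAL mean gene length in bp


def poisson_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), summed directly for accuracy in the tail."""
    if k <= 0:
        return 1.0
    term = exp(-lam)
    for i in range(1, k + 1):       # build the k-th term e^-lam * lam^k / k!
        term *= lam / i
    total = 0.0
    for i in range(k + 1, k + 201):  # 200 terms is ample for small lam
        total += term
        term *= lam / i
    return min(1.0, total)


def bonferroni_p(hits: int, gene_len: float) -> float:
    """Bonferroni-corrected p-value for seeing >= hits mutations in one gene."""
    lam = TOTAL_MUTATIONS * gene_len / (N_GENES * MEAN_GENE_LEN)
    return min(1.0, N_GENES * poisson_tail(hits, lam))


# HYPOTHETICAL gene: 3000 bp long with 12 observed mutations.
p_uniform = bonferroni_p(12, MEAN_GENE_LEN)  # pretends every gene has mean length
p_sized = bonferroni_p(12, 3000)             # uses the gene's own length
print(f"uniform model: {p_uniform:.3g}, size-aware model: {p_sized:.3g}")
```

A long gene collects more mutations by chance alone, so a uniform model overstates its significance; the size-aware p-value is correspondingly larger.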

FigShare

Top 20 hits ranked by Bonferroni corrected p-value computed on all SNPs.

Author: Christopher J. Lee (255743)
Iara M. P. Machado (355850)
James C. Liao (184646)
Marc A. Harper (355848)
Stanley F. Nelson (33880)
Traci Toy (355849)
Zugen Chen (76247)

Top 20 hits ranked by Bonferroni corrected p-value computed on all SNPs.

FigShare

Top 20 hits ranked by Bonferroni corrected p-value computed on non-synonymous SNPs.

Author: Christopher J. Lee (255743)
Iara M. P. Machado (355850)
James C. Liao (184646)
Marc A. Harper (355848)
Stanley F. Nelson (33880)
Traci Toy (355849)
Zugen Chen (76247)

Top 20 hits ranked by Bonferroni corrected p-value computed on non-synonymous SNPs.

FigShare

Target discovery yield as a function of mutations per strain and number of strains sequenced.

Author: Christopher J. Lee (255743)
Iara M. P. Machado (355850)
James C. Liao (184646)
Marc A. Harper (355848)
Stanley F. Nelson (33880)
Traci Toy (355849)
Zugen Chen (76247)

A. For five target genes. Gray color (upper-left corner) represents discovery of all 5 targets; red = zero targets. B. For ten target genes. Gray represents discovery of all 10 targets. C. For twenty target genes. Gray represents discovery of all 20 targets.

FigShare