A comparative analysis of exome capture

Author: Grabill Ian
Iossifov Ivan
Kramer Melissa
McCombie W Richard
Parla Jennifer S
Spector Mona S
Publication venue: BioMed Central
Publication date: 01/01/2011

ABSTRACT: BACKGROUND: Human exome resequencing using commercial target capture kits has been and is being used for sequencing large numbers of individuals to search for variants associated with various human diseases. We rigorously evaluated the capabilities of two solution exome capture kits. These analyses help clarify the strengths and limitations of those data as well as systematically identify variables that should be considered in the use of those data. RESULTS: Each exome kit performed well at capturing the targets it was designed to capture, which mainly correspond to the consensus coding sequences (CCDS) annotations of the human genome. In addition, based on their respective targets, each capture kit coupled with high coverage Illumina sequencing produced highly accurate nucleotide calls. However, other databases, such as the Reference Sequence collection (RefSeq), define the exome more broadly, and so, not surprisingly, the exome kits did not capture these additional regions. CONCLUSIONS: Commercial exome capture kits provide a very efficient way to sequence select areas of the genome at very high accuracy. Here we provide the data to help guide critical analyses of sequencing data derived from these products.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

Establishing the baseline level of repetitive element expression in the human cortex

Author: B Conrad
B Langmead
BT Wilhelm
C Ladd-Acosta
C Nellaker
D Karolchik
DE Montoya-Durango
E Balada
GF Richard
GJ Faulkner
HH Kazazian
JA Armour
Jennifer Parla
JL Weber
JR Landry
M Barak
M Lafon
Melissa Kramer
O Frank
P Jern
PA Callinan
R Cordaux
R Lower
Robert H Yolken
RS Harris
S Mi
S Weis
Sarah J Wheelan
Sarven Sabunciyan
SJ Wheelan
Svitlana Tyekucheva
W Richard McCombie
WA Schulz
YHY Benjamini
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Although nearly half of the human genome consists of repetitive sequences, the expression profile of these elements remains largely uncharacterized. Recently developed high throughput sequencing technologies provide us with a powerful new set of tools to study repeat elements. Hence, we performed whole transcriptome sequencing to investigate the expression of repetitive elements in human frontal cortex using postmortem tissue obtained from the Stanley Medical Research Institute. Results: We found that a significant number of reads from the human frontal cortex originate from repeat elements. We also noticed that Alu elements were expressed at levels higher than expected by random or background transcription. In contrast, L1 elements were expressed at lower than expected amounts. Conclusions: Repetitive elements are expressed abundantly in the human brain. This expression pattern appears to be element specific and cannot be explained by random or background transcription. These results demonstrate that our knowledge about repetitive elements is far from complete. Further characterization is required to determine the mechanism, the control, and the effects of repeat element expression.

Crossref

Cold Spring Harbor Laboratory Institutional Repository

Springer - Publisher Connector

PubMed Central

Eight disease etiologies used in simulation experiments.

Author: Fernando S. Goes (276510)
Hannah Carter (276508)
James B. Potash (171581)
Jennifer Parla (276509)
Mehdi Pirooznia (23876)
Melissa Kramer (257074)
Peter P. Zandi (215438)
Rachel Karchin (61264)
W. Richard McCombie (26862)
Yun-Ching Chen (276507)

Rare variant = disease caused by multiple rare deleterious variants. Low frequency variant = disease caused by multiple low frequency deleterious variants. Key Region variant = rare deleterious variants are localized to key regions. Common variant = disease caused by a single deleterious common variant. The etiologies Rare+Protect, LowFreq+Protect, KeyRegion+Protect and Common+Protect were identical to the first four except that they include protective variants. (1) Minor allele frequency of deleterious causal variants, (2) Selection coefficients of deleterious causal variants, (3) Effect size of deleterious causal variants, (4) Selection coefficient of protective causal variants, (5) Effect size of protective modifier variants, (6) Required functional role of causal and protective variants. NS = coding non-synonymous, AA = African-American simple bottleneck demographic model [44], EA = European-American exponential growth demographic model [19]. * for protective modifier variants with AF5%, for protective modifier variants with AF5%.

FigShare

Power estimates for multiple gene case-control studies with causal variants equally likely to be from any disease etiology dominated by rare variants.

Author: Fernando S. Goes (276510)
Hannah Carter (276508)
James B. Potash (171581)
Jennifer Parla (276509)
Mehdi Pirooznia (23876)
Melissa Kramer (257074)
Peter P. Zandi (215438)
Rachel Karchin (61264)
W. Richard McCombie (26862)
Yun-Ching Chen (276507)

A, B. X-axis shows the number of candidate genes in 250 simulated case-control studies (approximately one-third each from disease etiologies Rare, LowFreq and KeyRegion). All genes contain causal variants. For each method, average power is shown. Power increases for all methods as the number of candidate genes with causal variants increases. C, D. X-axis shows the number of candidate genes and the ratio of genes containing causal variants to those that do not contain causal variants. As the ratio decreases, the power of the tested methods also decreases. (Tested methods are BOMP, VT, SKAT and KBAC (KBAC1P = minor allele frequency defined as , KBAC5P = minor allele frequency defined as ). AA = the case-control studies were drawn from gene populations generated with an African-American simple bottleneck demographic model. EA = the case-control studies were drawn from gene populations generated with a European-American exponential growth demographic model.)

FigShare

BOMP P-values for gene sets in Bipolar case-control study.

Author: Fernando S. Goes (276510)
Hannah Carter (276508)
James B. Potash (171581)
Jennifer Parla (276509)
Mehdi Pirooznia (23876)
Melissa Kramer (257074)
Peter P. Zandi (215438)
Rachel Karchin (61264)
W. Richard McCombie (26862)
Yun-Ching Chen (276507)

The gene sets were selected for testing because they contained genes and were the most significantly enriched by synaptic genes [26]. Seven of the gene sets were nominally associated with bipolar disorder (P<0.05) and have FDR<0.1. * FDR computed with the Benjamini-Hochberg algorithm [45]. ** Wall-clock time in minutes.
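As an illustration of the FDR column mentioned in this caption, the following is a minimal sketch of the Benjamini-Hochberg adjustment. The function name `benjamini_hochberg` and the example p-values are hypothetical placeholders, not values from the bipolar study, and the sketch is not taken from the authors' code.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (FDR), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that the adjusted values
    # are forced to be monotone in the rank.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical gene-set p-values, for illustration only.
example_p = [0.01, 0.04, 0.03, 0.20]
fdr = benjamini_hochberg(example_p)
n_passing = sum(q < 0.1 for q in fdr)
print(fdr, n_passing)
```

Gene sets would then be reported as passing when their adjusted value falls below the chosen FDR threshold.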

FigShare

BOMP burden and position statistics complement each other.

Author: Fernando S. Goes (276510)
Hannah Carter (276508)
James B. Potash (171581)
Jennifer Parla (276509)
Mehdi Pirooznia (23876)
Melissa Kramer (257074)
Peter P. Zandi (215438)
Rachel Karchin (61264)
W. Richard McCombie (26862)
Yun-Ching Chen (276507)

Breakdown of contribution of BOMP mutation burden (BOMP_B) and BOMP position distribution (BOMP_P) statistics averaged over single candidate gene power estimates (Figure 1) and multiple candidate gene power estimates (nine genes, 3 with causal variants and 6 with no causal variants) (Figure 3) for case-control study sizes of 200, 1000, 2000, and 5000. Combining the two statistics consistently yielded improved power with respect to each statistic on its own. The BOMP burden statistic had more power than BOMP position for the simulations based on a single candidate gene, and vice versa in the simulations with nine candidate genes and a 3:6 causal to non-causal ratio.

FigShare

Power estimates for multiple genes case-control studies with causal variants from disease etiologies randomly sampled from nine multinomial distributions (Figure S3).

Author: Fernando S. Goes (276510)
Hannah Carter (276508)
James B. Potash (171581)
Jennifer Parla (276509)
Mehdi Pirooznia (23876)
Melissa Kramer (257074)
Peter P. Zandi (215438)
Rachel Karchin (61264)
W. Richard McCombie (26862)
Yun-Ching Chen (276507)

Power estimates for BOMP, VT, SKAT, KBAC (KBAC1P = minor allele frequency defined as , KBAC5P = minor allele frequency defined as ). Each vertical line represents power estimates for each method, based on 250 simulated case-control studies. The genomic individuals each had nine genes, of which three contained causal variants and six did not. The disease etiologies for the three genes with causal variants were randomly sampled from nine multinomial distributions (Figure S3). AA = African-American simple bottleneck demographic model. EA = European-American exponential growth demographic model.
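A power estimate over simulated studies is commonly computed as the fraction of simulations in which a method rejects the null hypothesis. The sketch below assumes that definition; the function name `estimate_power`, the significance level of 0.05, and the placeholder p-values are all assumptions for illustration and are not taken from the paper.

```python
def estimate_power(p_values, alpha=0.05):
    """Fraction of simulated studies whose p-value falls below alpha."""
    if not p_values:
        raise ValueError("need at least one simulated study")
    return sum(p < alpha for p in p_values) / len(p_values)


# Placeholder p-values for 250 simulated case-control studies:
# 100 studies where the method detects the signal, 150 where it does not.
simulated_p = [0.01] * 100 + [0.5] * 150
power = estimate_power(simulated_p)
print(power)
```

With one such list of p-values per method, the resulting fractions can be compared across methods for the same set of simulated studies.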

FigShare

Analytical comparison of SKAT, BOMP, and VT on a toy example.

Author: Fernando S. Goes (276510)
Hannah Carter (276508)
James B. Potash (171581)
Jennifer Parla (276509)
Mehdi Pirooznia (23876)
Melissa Kramer (257074)
Peter P. Zandi (215438)
Rachel Karchin (61264)
W. Richard McCombie (26862)
Yun-Ching Chen (276507)

Genotypes of 8 cases and 8 controls at 10 positions. Matrix column colors: controls = light blue, cases = light red. Position distribution bar colors: controls = blue, cases = red. Detailed description is in the section “Toy example with analytical calculations” (Text S1).

FigShare

Components of BOMP Hybrid Likelihood Model compared.

Author: Fernando S. Goes (276510)
Hannah Carter (276508)
James B. Potash (171581)
Jennifer Parla (276509)
Mehdi Pirooznia (23876)
Melissa Kramer (257074)
Peter P. Zandi (215438)
Rachel Karchin (61264)
W. Richard McCombie (26862)
Yun-Ching Chen (276507)

A. Mutation burden statistic. The mutation burden statistic uses the aggregated burden for cases, , and controls . B. Mutation position distribution statistic. Aggregated window mutation counts are calculated for cases, , controls, , and cases and controls combined, , across windows.
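The aggregated quantities named in this caption can be illustrated with plain counts. The sketch below is not the BOMP likelihood model: it only shows one simple way to total variant alleles per group and per window, assuming a 0/1 genotype matrix with one row per individual, non-overlapping windows, and hypothetical function names and toy data.

```python
def aggregated_burden(genotypes):
    """Total count of variant alleles over all individuals and positions."""
    return sum(sum(row) for row in genotypes)


def window_counts(genotypes, window_size):
    """Variant-allele counts per non-overlapping window, summed over individuals."""
    n_positions = len(genotypes[0])
    per_position = [sum(row[j] for row in genotypes) for j in range(n_positions)]
    return [
        sum(per_position[start:start + window_size])
        for start in range(0, n_positions, window_size)
    ]


# Toy data (hypothetical): 3 cases and 3 controls at 6 positions.
cases = [
    [1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
]
controls = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
]

case_burden = aggregated_burden(cases)
control_burden = aggregated_burden(controls)
case_windows = window_counts(cases, window_size=3)
control_windows = window_counts(controls, window_size=3)
combined_windows = window_counts(cases + controls, window_size=3)
print(case_burden, control_burden, case_windows, control_windows, combined_windows)
```

A burden-style comparison would contrast the two totals, while a position-style comparison would contrast how the counts are spread across windows in cases versus controls.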

FigShare