
10 research outputs found

Basic components of the iterative workflow as compared to a standard NGS whole genome analysis.

Author: Jean-Pierre A. Kocher (184636)
Saurabh Baheti (479040)
Steven N. Hart (497992)
Sumit Middha (93228)

FigShare

Evaluation of SNVs and Indels called by the iterative and standard workflow.

Author: Jean-Pierre A. Kocher (184636)
Saurabh Baheti (479040)
Steven N. Hart (497992)
Sumit Middha (93228)

FigShare

Concordance of SNP data with variants from standard and iterative workflows for sample NA12878.

Author: Jean-Pierre A. Kocher (184636)
Saurabh Baheti (479040)
Steven N. Hart (497992)
Sumit Middha (93228)

FigShare

SoftSearch: Integration of Multiple Sequence Features to Identify Breakpoints of Structural Variations

Author: Fergus J. Couch (146607)
Jaysheel D. Bhavsar (497995)
Jean-Pierre A. Kocher (184636)
Raymond Moore (497994)
Saurabh Baheti (479040)
Steven N. Hart (497992)
Vivekananda Sarangi (497993)
Publication date: 16/12/2013

Background: Structural variation (SV) represents a significant, yet poorly understood contribution to an individual’s genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SVs, including split reads, discordant read pairs, and unmated pairs. Co-localized split reads and discordant read pairs are used to refine the breakpoints.
Results: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch’s key features are 1) not requiring secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read pairs, which on their own would not be sufficient to make an SV call.
Conclusions: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.

Directory of Open Access Journals

PubMed Central

FigShare

Example IGV screenshot of a 71bp tandem duplication in the BRCA2 gene identified by SoftSearch.

Author: Fergus J. Couch (146607)
Jaysheel D. Bhavsar (497995)
Jean-Pierre A. Kocher (184636)
Raymond Moore (497994)
Saurabh Baheti (479040)
Steven N. Hart (497992)
Vivekananda Sarangi (497993)

Discordant reads are blue (plus strand) or red (minus strand). Soft-clipped bases appear as multicolour “rainbows”.

FigShare

Overlap of true positive calls for the NA12878 and NA18507 datasets.

Author: Fergus J. Couch (146607)
Jaysheel D. Bhavsar (497995)
Jean-Pierre A. Kocher (184636)
Raymond Moore (497994)
Saurabh Baheti (479040)
Steven N. Hart (497992)
Vivekananda Sarangi (497993)

FigShare

The general strategy for SoftSearch.

Author: Fergus J. Couch (146607)
Jaysheel D. Bhavsar (497995)
Jean-Pierre A. Kocher (184636)
Raymond Moore (497994)
Saurabh Baheti (479040)
Steven N. Hart (497992)
Vivekananda Sarangi (497993)

A) A left-clipped read is one whose clipped portion lies at a smaller genome coordinate than the opposite end of the read (and conversely for a right-clipped read). For a left-clipped read located on the “+” strand, SoftSearch looks upstream for a discordant read pair in which the read is oriented in the “-” direction. The orientation and location of the mate determine the region to which SoftSearch links the first region. To increase the likelihood of detecting the exact breakpoint, it then looks upstream for a right-clipped read cluster. If none is found, the default breakpoint location is the discordant read mate location; otherwise it is the position of soft clipping at the right-clipped read. B) SoftSearch determines discordant read pairs by their insert size and orientation and places them in a temporary BAM file. It also reads the input BAM file for soft-clipped reads and converts them to a BED file. Overlapping soft-clip locations are counted to identify putative breakpoints; these are then queried against the discordant read pair BAM file for properly oriented supporting reads, and the results are output in VCF format.
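The soft-clip counting step in panel B can be illustrated with a minimal sketch. This is not SoftSearch's own code: the function names, the `min_support` threshold, and the choice to report the last aligned base for right clips are assumptions made for illustration, and alignments are represented as plain `(chromosome, POS, CIGAR)` tuples rather than read from a BAM file.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code (SAM format).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases.
REF_CONSUMING = set("MDN=X")


def soft_clip_sites(pos, cigar):
    """Return (side, coordinate) for each soft-clipped end of one alignment.

    pos is the 1-based leftmost aligned reference position (SAM POS).
    Hard clips are ignored. Hypothetical helper, not part of SoftSearch.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return []
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    sites = []
    if ops[0][1] == "S":
        # Left clip: the clipped bases precede the first aligned base.
        sites.append(("left", pos))
    if len(ops) > 1 and ops[-1][1] == "S":
        # Right clip: reported at the last aligned base (an assumption).
        sites.append(("right", pos + ref_len - 1))
    return sites


def putative_breakpoints(alignments, min_support=2):
    """Count soft-clip sites and keep those seen at least min_support times."""
    counts = Counter()
    for chrom, pos, cigar in alignments:
        for side, coord in soft_clip_sites(pos, cigar):
            counts[(chrom, side, coord)] += 1
    return {site: n for site, n in counts.items() if n >= min_support}


# Toy input: two reads left-clipped at the same position, one right-clipped
# read, and one fully aligned read.
example = [
    ("chr13", 100, "5S45M"),
    ("chr13", 100, "8S42M"),
    ("chr13", 60, "40M10S"),
    ("chr13", 70, "50M"),
]
candidates = putative_breakpoints(example, min_support=2)
```

In this sketch only the position supported by two clipped reads survives the threshold. In the workflow described above, such candidates would then be checked against discordant read pairs before a call is made.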

FigShare

Mutational Landscapes of Sequential Prostate Metastases and Matched Patient Derived Xenografts during Enzalutamide Therapy

Developing patient derived models from individual tumors that capture the biological heterogeneity and mutation landscape in advanced prostate cancer is challenging, but essential for understanding tumor progression and delivering personalized therapy at the metastatic castrate-resistant prostate cancer stage. To demonstrate the feasibility of developing patient derived xenograft models at this stage, we present a case study wherein xenografts were derived from cancer metastases in a patient progressing on androgen deprivation therapy and prior to initiating pre-chemotherapy enzalutamide treatment. Tissue biopsies from a metastatic rib lesion were obtained for sequencing before and after initiating enzalutamide treatment over a twelve-week period and were also implanted subcutaneously as well as under the renal capsule in immuno-deficient mice. The genome and transcriptome landscapes of xenografts and the original patient tumor tissues were compared by performing whole exome and transcriptome sequencing of the metastatic tumor tissues and the xenografts at both time points. After comparing the somatic mutations, copy number variations, gene fusions and gene expression, we found that the patient’s genomic and transcriptomic alterations were preserved in the patient derived xenografts with high fidelity. These xenograft models provide an opportunity for predicting the efficacy of existing and potentially novel drugs based on the individual metastatic tumor expression signature and molecular pharmacology, for delivery of precision medicine.

Directory of Open Access Journals

PubMed Central

FigShare

Comparison of genome and transcriptome landscapes between patient tumor tissue and patient derived xenograft models (PDXs).

(A) Recall (grey) and precision (tan) of detected somatic mutations. Recall = number of somatic mutations called from both patient tissue and xenograft divided by total somatic mutations called from patient tissue; Precision = number of somatic mutations called from both patient tissue and xenograft divided by total mutations called from xenograft. (B) Heatmap showing the concordance of relative allele frequency of somatic mutations between patient tissues and xenografts. Rows correspond to patient and xenograft samples and columns correspond to 60 selected somatic mutations. (C) Circos plot showing profiles of copy number variation for patient tumor tissues and xenografts. From outside to inside, tracks correspond to 1 = V1/met, 1A = V1/xeno/A, 1B = V1/xeno/B, 1C = V1/xeno/C, 1AA = V1/xeno/A/A, 1BA = V1/xeno/B/A, V1/xeno/C/A, V2/met, V2/xeno/A, V2/xeno/A/A1, V2/xeno/A/A2 and V2/xeno/A/B. (D) Pair-wise gene expression correlation between patient tissues and xenografts. Correlation was measured by the Pearson correlation coefficient. Gene expression was measured as log10(RPKM), where RPKM is Reads Per Kilobase of exon per Million mapped reads.
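The recall and precision definitions in panel A reduce to set arithmetic over the two call sets. The sketch below is illustrative only: the function name, the `(chromosome, position, ref, alt)` tuple representation of a mutation, and the toy call sets are assumptions, not data or code from the study.

```python
def recall_precision(patient_calls, xenograft_calls):
    """Compute recall and precision as defined in the caption for panel A.

    recall    = shared calls / calls from patient tissue
    precision = shared calls / calls from xenograft
    Returns NaN for a ratio whose denominator is zero.
    """
    patient = set(patient_calls)
    xenograft = set(xenograft_calls)
    shared = patient & xenograft
    recall = len(shared) / len(patient) if patient else float("nan")
    precision = len(shared) / len(xenograft) if xenograft else float("nan")
    return recall, precision


# Toy call sets (hypothetical mutations, for illustration only).
patient_tissue = {
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "G", "A"),
    ("chr4", 4000, "T", "C"),
}
xenograft = {
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "G", "A"),
    ("chr4", 4000, "T", "C"),
    ("chr5", 5000, "A", "T"),
    ("chr6", 6000, "G", "C"),
}
recall, precision = recall_precision(patient_tissue, xenograft)
```

With three shared calls out of four patient calls and five xenograft calls, the toy example gives a recall of 0.75 and a precision of 0.6.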

FigShare

Classification of xenograft whole exome sequencing (panels A and B) and RNA sequencing reads (panels C and D).

Reads generated from xenograft samples were divided into five groups, “graft”, “host”, “both”, “neither” and “ambiguous”, using the tool developed by Conway et al. (A) Read assignments for 10 xenograft whole exome sequencing datasets. (B) Average proportion of whole exome sequencing reads assigned to the five groups mentioned above. (C) Read assignments for 10 xenograft RNA-seq datasets. (D) Average proportion of RNA-seq reads assigned to the five groups mentioned above.

FigShare