50,379 research outputs found
Development and Validation of Clinical Whole-Exome and Whole-Genome Sequencing for Detection of Germline Variants in Inherited Disease
Context.-With the decrease in the cost of sequencing, the clinical testing paradigm has shifted from single gene to gene panel and now whole-exome and whole-genome sequencing. Clinical laboratories are rapidly implementing next-generation sequencing-based whole-exome and whole-genome sequencing. Because a large number of targets are covered by whole-exome and whole-genome sequencing, it is critical that a laboratory perform appropriate validation studies, develop a quality assurance and quality control program, and participate in proficiency testing. Objective.-To provide recommendations for whole-exome and whole-genome sequencing assay design, validation, and implementation for the detection of germline variants associated with inherited disorders. Data Sources.-An example of trio sequencing, filtration and annotation of variants, and phenotypic consideration to arrive at clinical diagnosis is discussed. Conclusions.-It is critical that clinical laboratories planning to implement whole-exome and whole-genome sequencing design and validate the assay to specifications and ensure adequate performance prior to implementation. Test design specifications, including variant filtering and annotation, phenotypic consideration, guidance on consenting options, and reporting of incidental findings, are provided. These are important steps a laboratory must take to validate and implement whole-exome and whole-genome sequencing in a clinical setting for germline variants in inherited disorders.
Development and performance of a targeted whole exome sequencing enrichment kit for the dog (Canis Familiaris Build 3.1)
Whole exome sequencing is a technique that aims to selectively sequence all exons of protein-coding genes. A canine whole exome sequencing enrichment kit was designed based on the latest canine reference genome (build 3.1.72). Its performance was tested by sequencing 2 exome captures, each consisting of 4 pre-capture pooled, barcoded Illumina libraries on an Illumina HiSeq 2500. At an average sequencing depth of 102x, 83 to 86% of the target regions were completely sequenced with a minimum coverage of five, and 90% of the reads mapped to the target regions. Additionally, it is shown that the reproducibility within and between captures is high and that pooling four samples per capture is a valid option. Overall, we have demonstrated the strong performance of this WES enrichment kit and are confident it will be a valuable tool in future disease association studies.
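The "completely sequenced with a minimum coverage of five" figure can be read as a per-region check: a target region counts only if every base in it reaches the depth threshold. The following is a minimal sketch under that reading, not the authors' pipeline; the function name `fraction_fully_covered`, the input layout (region name mapped to per-base depths) and the toy depths are all assumptions made for illustration.

```python
from typing import Dict, List


def fraction_fully_covered(region_depths: Dict[str, List[int]], min_depth: int = 5) -> float:
    """Return the fraction of target regions in which every base reaches min_depth.

    region_depths maps a (hypothetical) region name to its per-base read depths.
    """
    if not region_depths:
        raise ValueError("no target regions supplied")
    fully_covered = sum(
        1
        for depths in region_depths.values()
        if depths and min(depths) >= min_depth
    )
    return fully_covered / len(region_depths)


# Toy data, invented for illustration only.
toy_regions = {
    "exon_1": [12, 9, 7, 5],     # every base >= 5
    "exon_2": [30, 4, 22],       # one base below 5
    "exon_3": [101, 98, 110],    # every base >= 5
    "exon_4": [0, 0, 3],         # poorly covered
}

print(fraction_fully_covered(toy_regions))  # 2 of the 4 toy regions pass
```

A per-base definition (fraction of all target bases at or above the threshold) would give a different number, so the definition in use should be stated alongside any reported percentage.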
An heuristic filtering tool to identify phenotype-associated genetic variants applied to human intellectual disability and canine coat colors
Background: Identification of one or several disease causing variant(s) from the large collection of variants present in an individual is often achieved by the sequential use of heuristic filters. The recent development of whole exome sequencing enrichment designs for several non-model species created the need for a species-independent, fast and versatile analysis tool, capable of tackling a wide variety of standard and more complex inheritance models. With this aim, we developed "Mendelian", an R-package that can be used for heuristic variant filtering.
Results: The R-package Mendelian offers fast and convenient filters to analyze putative variants for both recessive and dominant models of inheritance, with variable degrees of penetrance and detectance. Analysis of trios is supported. Filtering against variant databases and annotation of variants is also included. This package is not species specific and supports parallel computation. We validated this package by reanalyzing data from a whole exome sequencing experiment on intellectual disability in humans. In a second example, we identified the mutations responsible for coat color in the dog. This is the first example of whole exome sequencing without prior mapping in the dog.
Conclusion: We developed an R-package that enables the identification of disease-causing variants from the long list of variants called in sequencing experiments. The software and a detailed manual are available at https://github.com/BartBroeckx/Mendelian
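To make the idea of sequential heuristic filtering on a trio concrete, here is a minimal Python sketch. It is not the Mendelian package (which is written in R) and does not reproduce its API; the function names, the genotype coding (count of alternate alleles: 0, 1 or 2), the assumption of full penetrance with unaffected parents, and the toy variants are all hypothetical.

```python
from typing import Dict, FrozenSet, List

# A variant is a dict with an "id" and alternate-allele counts for each trio member.
Variant = Dict[str, object]


def drop_known(variants: List[Variant], known_ids: FrozenSet[str]) -> List[Variant]:
    """Filter against a (hypothetical) database of known, presumed-benign variants."""
    return [v for v in variants if v["id"] not in known_ids]


def recessive_candidates(variants: List[Variant]) -> List[Variant]:
    """Affected child homozygous alternate, both unaffected parents heterozygous."""
    return [
        v for v in variants
        if v["child"] == 2 and v["mother"] == 1 and v["father"] == 1
    ]


def dominant_de_novo_candidates(variants: List[Variant]) -> List[Variant]:
    """Affected child heterozygous, both unaffected parents homozygous reference."""
    return [
        v for v in variants
        if v["child"] == 1 and v["mother"] == 0 and v["father"] == 0
    ]


# Toy trio calls, invented for illustration only.
calls = [
    {"id": "var1", "child": 2, "mother": 1, "father": 1},
    {"id": "var2", "child": 1, "mother": 0, "father": 0},
    {"id": "var3", "child": 2, "mother": 1, "father": 1},
    {"id": "var4", "child": 1, "mother": 1, "father": 0},
]
known = frozenset({"var3"})

remaining = drop_known(calls, known)
print([v["id"] for v in recessive_candidates(remaining)])
print([v["id"] for v in dominant_de_novo_candidates(remaining)])
```

Each filter only narrows the list it receives, so the filters can be applied in any order; reduced penetrance or incomplete detectance would require relaxing the strict genotype conditions used here.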
Using linkage analysis of large pedigrees to guide association analyses
To date, genome-wide association studies have yielded discoveries of common variants that partly explain familial aggregation of diseases and traits. Researchers are now turning their attention to less common variants because the price of sequencing has dropped drastically. However, because sequencing of the whole genome in large samples is costly, great care must be taken to prioritize which samples and which genomic regions are selected for sequencing. We are interested in identifying genomic regions for deep sequencing using large multiplex families collected as part of earlier linkage studies. We incorporate linkage analysis into our search for Q1-associated alleles. Overall, we found that power was low for both whole-exome and linkage-guided sequencing analysis. By restricting sequencing to regions with high LOD peaks, we found fewer associated single-nucleotide polymorphisms than by using whole-exome sequencing. However, incorporating linkage analysis enabled us to detect more than half of the associated susceptibility loci (52%) that would have been identified by whole-exome sequencing while examining only 2.5% of the exome. This result suggests that incorporating linkage results from large multiplex families might greatly increase the efficiency of sequencing to detect trait-associated alleles in complex disease.
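The efficiency claim can be made explicit with a small calculation using only the two figures in the abstract (52% of loci recovered, 2.5% of the exome examined). The sketch below is an illustration, not part of the study; the function name `fold_enrichment` is hypothetical, and treating "loci found per unit of exome sequenced" as the efficiency measure is an assumption.

```python
def fold_enrichment(loci_fraction: float, exome_fraction: float) -> float:
    """Loci recovered per unit of exome sequenced, relative to whole-exome sequencing.

    Whole-exome sequencing is the baseline: it recovers fraction 1.0 of its own
    loci while sequencing fraction 1.0 of the exome, giving a ratio of 1.
    """
    if not 0 < exome_fraction <= 1:
        raise ValueError("exome_fraction must be in (0, 1]")
    return loci_fraction / exome_fraction


# Figures taken from the abstract: 52% of loci from 2.5% of the exome.
fold = fold_enrichment(0.52, 0.025)
print(f"{fold:.1f}-fold")  # roughly 20.8-fold more loci per sequenced base
```

This ratio says nothing about statistical power, which the abstract reports as low for both strategies.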
Whole-Exome Sequencing Study of Thyrotropin-Secreting Pituitary Adenomas
Degree certificate number: 医博甲169
Integrated Genomic and Transcriptomic Analysis of Rare Neuromuscular Diseases
Thesis (Master's) -- Seoul National University Graduate School: College of Medicine, Department of Biomedical Sciences, 2019. 8. Choi Murim.
Introduction. Whole exome sequencing has become a robust and standard tool for rare disease diagnosis thanks to its advantages in cost and data handling. However, whole exome sequencing-based diagnosis rates typically do not exceed 50%, which can be attributed to the difficulty of interpreting variants of uncertain significance, as well as to the disregard of non-coding variants, including variants in intronic and regulatory regions of the genome. Therefore, I explored the utility of transcriptome sequencing as a complementary approach in the diagnosis of rare neuromuscular disorders.
Methods. Whole exome sequencing data from 94 patients with undiagnosed neuromuscular disorders were collected at Seoul National University Children's Hospital and analyzed for variants in known neuromuscular disease genes. Additional transcriptome sequencing was performed for 63 of the whole exome sequenced patients and for ten patients without genome data. Transcriptome data were used for the analysis of cryptic damaging variants, differential expression, aberrant splicing and allele-specific expression. Furthermore, non-negative matrix factorization was applied to identify expression-based clusters, and cluster-specific gene ontology was derived.
Results. Whole exome sequencing analysis identified candidate variants in 49% of patients, with 83% of them located within known disease genes. Structural variants with questionable pathogenicity were discovered in twelve cases. RNA sequencing-based variant calling led to the further discovery of heterozygous candidate variants in nine samples, five of which did not undergo whole exome sequencing. Allele-specific expression analysis identified two likely candidate genes, and differential gene expression analysis led to the prioritization of sets of genes in an additional four samples. Lastly, aberrant splicing of DMD, TTN and MICU1 was detected in four samples. Non-negative matrix factorization-based clustering resulted in the identification of six clusters with distinct gene expression profiles.
Discussion. Firstly, I aimed to evaluate whether transcriptome sequencing can provide additional evidence for the interpretation of whole exome sequencing variants. Overall, transcriptome sequencing was able to detect abnormalities associated with the previously identified mutation in less than 30% of positive whole exome sequencing cases. For samples without a whole exome sequencing result, I successfully used transcriptome sequencing to identify potential pathogenic causes in 18 cases. In conclusion, transcriptome sequencing proved to be a useful tool for the diagnosis of whole exome sequencing negative samples, but did not prove to have great utility for the interpretation of pathogenic whole exome sequencing variants.
1. INTRODUCTION
1.1. Advancement through next generation sequencing
1.2. Genetics of neuromuscular disorders (NMD)
1.3. Transcriptome sequencing-based NMD diagnosis
2. METHODS
2.1. Data collection
2.2. Whole exome sequencing data analysis
2.3. Transcriptome sequencing analysis
2.4. Non-negative matrix factorization based clustering
3. RESULTS
3.1. Data collection
3.2. Phenotype information
3.3. Whole exome sequencing results
3.4. Transcriptome sequencing quality control
3.5. Transcriptome-based clustering
3.6. Exome variants in transcriptome sequencing
3.7. Transcriptome-sequencing based diagnosis
4. DISCUSSION
5. REFERENCES
6. APPENDIX
6.1. Supplementary Figures
6.2. Supplementary Tables
7. Abstract in Korean
Maste
Analysis of Archived Residual Newborn Screening Blood Spots After Whole Genome Amplification
Deidentified newborn screening bloodspot samples (NBS) represent a valuable potential resource for genomic research if impediments to whole exome sequencing of NBS deoxyribonucleic acid (DNA), including the small amount of genomic DNA in NBS material, can be overcome. For instance, genomic analysis of NBS could be used to define allele frequencies of disease-associated variants in local populations, or to conduct prospective or retrospective studies relating genomic variation to disease emergence in pediatric populations over time. In this study, we compared the recovery of variant calls from exome sequences of amplified NBS genomic DNA to variant calls from exome sequencing of non-amplified NBS DNA from the same individuals. Results: Using a standard alignment-based method, the Genome Analysis Toolkit (GATK), we find 62,000-76,000 additional variants in amplified samples. After application of a unique k-mer enumeration and variant detection method (RUFUS), only 38,000-47,000 additional variants are observed in amplified gDNA. This result suggests that roughly half of the amplification-introduced variants identified using GATK may be the result of mapping errors and read misalignment. Conclusions: Our results show that it is possible to obtain informative, high-quality data from exome analysis of whole genome amplified NBS, with the important caveat that different data generation and analysis methods can affect variant detection accuracy and the concordance of variant calls in whole-genome amplified and non-amplified exomes.
National Institutes of Health P01HD067244, NS076465, R01ES021006
Nutritional Science