Identifying micro-inversions using high-throughput sequencing reads

Author: A Abyzov
B Chabrol
B Langmead
C Shen
D Botstein
EL Braun
ENCODE Project Consortium
ER Mardis
Feifei He
GR Abecasis
H Li
HJ Abel
Huaiqiu Zhu
J Barretina
J Harrow
J Jaeken
J Ma
J Wang
Jian Ma
K Chen
K Trappe
K Ye
L Feuk
L Siggens
M Baker
MJ Chaisson
P Medvedev
RJ Klose
S Lew
S Suzuki
T Rausch
Y Jiang
Yang Li
Yu-Hang Tang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Background: The identification of inversions of DNA segments shorter than the read length (e.g., 100 bp), defined as micro-inversions (MIs), remains challenging for next-generation sequencing reads. MIs are acknowledged to be an important type of genomic variation and may play a role in causing genetic disease. However, current alignment methods are generally insensitive to MIs. Here we develop a novel tool, MID (Micro-Inversion Detector), to identify MIs in human genomes using next-generation sequencing reads. Results: The MID algorithm is based on a dynamic programming path-finding approach. Unlike other variant detection tools, MID can handle small MIs and multiple breakpoints within an unmapped read. Moreover, MID improves reliability on low-coverage data by integrating multiple samples. Our evaluation demonstrated that MID outperforms Gustaf, which can currently detect inversions from 30 bp to 500 bp. Conclusions: To our knowledge, MID is the first method that can efficiently and reliably identify MIs from unmapped short next-generation sequencing reads. MID is reliable on low-coverage data, which makes it suitable for large-scale projects such as the 1000 Genomes Project (1KGP). MID identified previously unknown MIs from the 1KGP that overlap with genes and regulatory elements in the human genome. We also identified MIs in cancer cell lines from the Cancer Cell Line Encyclopedia (CCLE). We therefore expect our tool to be useful for the study of MIs as a type of genetic variant in the human genome. The source code can be downloaded from: http://cqb.pku.edu.cn/ZhuLab/MID. Funding: NCI NIH HHS [CA182360, R33 CA182360]; NHGRI NIH HHS [HG007352, R01 HG007352].
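To make the idea of a micro-inversion concrete, the following is a minimal sketch of what such an event looks like inside a single read. It is not the MID algorithm (the abstract describes a dynamic programming path-finding approach that is not reproduced here); it is a brute-force illustration that assumes the read and the reference segment have the same length, are already placed at the same position, and contain no indels or sequencing errors. The function names and the `min_len` parameter are hypothetical.

```python
# Toy illustration only: NOT the MID algorithm.
# Assumes an ungapped, error-free read laid over its reference segment.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def find_micro_inversion(ref, read, min_len=4):
    """Return (start, end) such that inverting ref[start:end] yields read.

    Returns None if the sequences are identical, differ in length,
    or no single inverted span explains the differences.
    """
    if len(ref) != len(read) or ref == read:
        return None
    mismatches = [i for i in range(len(ref)) if ref[i] != read[i]]
    first, last = mismatches[0], mismatches[-1]
    # Everything outside [first, last] already matches, so any span that
    # covers all mismatches and equals the reverse complement of the
    # reference over that span explains the whole read.
    for start in range(first, -1, -1):
        for end in range(last + 1, len(ref) + 1):
            if end - start < min_len:
                continue
            if read[start:end] == reverse_complement(ref[start:end]):
                return start, end
    return None


ref = "TTGACCGTAAGCTT"
read = ref[:4] + reverse_complement(ref[4:10]) + ref[10:]
print(find_micro_inversion(ref, read))
```

In this sketch the inverted segment is only 6 bp long, far shorter than a typical read, which is the situation the abstract describes as hard for standard aligners: most of the read matches, and the differing bases look like a cluster of substitutions unless the reverse complement is tested explicitly.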

Crossref

Springer - Publisher Connector

PubMed Central

eScholarship - University of California

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Ding Li
Publication venue: Digital Commons@Becker
Publication date: 01/01/2019

Digital Commons@Becker

Detection of Genomic Inversion from Single End Read

Author: Ghimire Pankaj
Publication venue: OpenCommons@UConn
Publication date: 16/12/2012

Structural variations (SVs) are genomic rearrangements that include both copy-number variants, such as insertions, deletions, and duplications, and balanced variants, such as inversions and translocations. SVs are attracting growing research attention because of their role in human phenotypes, genetic diseases, and genomic rearrangements. The evolution of next-generation sequencing has provided valuable opportunities to investigate these variants and to chart their spectrum across the human genome more widely and clearly. This investigation includes identifying the type of each SV and its breakpoints at base-pair resolution. Many techniques for effective identification and breakpoint resolution have been devised, mainly based on paired-end reads. Different platforms, including Ion Torrent and Illumina, can generate high-throughput single-end reads at relatively low cost and high efficiency. In this thesis we provide a novel approach based on single-end reads to detect genomic inversions in the human genome. We also compare our approach with existing methods based on paired-end reads and show that it is competitive in sensitivity and precision at relatively low coverage for detecting the breakpoints of genomic inversions.

DigitalCommons@UConn

OpenCommons at University of Connecticut

Genomic approaches to understanding population divergence and speciation in birds

Author: Balakrishnan Christopher N.
Baldassarre Daniel T.
Campagna Leonardo
Deane-Coe Petra E.
Harvey Michael G.
Hooper Daniel M.
Irwin Darren E.
Judy Caroline D.
Mason Nicholas A.
McCormack John E.
McCracken Kevin G.
Oliveros Carl H.
Safran Rebecca J.
Scordato Elizabeth S.C.
Stryjewski Katherine Faust
Taylor Scott A.
Tigano Anna
Toews David P.L.
Uy J. Albert C.
Winger Benjamin M.
Publication venue: LSU Digital Commons
Publication date: 01/10/2015

© 2016 American Ornithologists' Union. The widespread application of high-throughput sequencing in studying evolutionary processes and patterns of diversification has led to many important discoveries. However, the barriers to utilizing these technologies and interpreting the resulting data can be daunting for first-time users. We provide an overview and a brief primer of relevant methods (e.g., whole-genome sequencing, reduced-representation sequencing, sequence-capture methods, and RNA sequencing), as well as important steps in the analysis pipelines (e.g., loci clustering, variant calling, whole-genome and transcriptome assembly). We also review a number of applications in which researchers have used these technologies to address questions related to avian systems. We highlight how genomic tools are advancing research by discussing their contributions to 3 important facets of avian evolutionary history. We focus on (1) general inferences about biogeography and biogeographic history, (2) patterns of gene flow and isolation upon secondary contact and hybridization, and (3) quantifying levels of genomic divergence between closely related taxa. We find that in many cases, high-throughput sequencing data confirms previous work from traditional molecular markers, although there are examples in which genome-wide genetic markers provide a different biological interpretation. We also discuss how these new data allow researchers to address entirely novel questions, and conclude by outlining a number of intellectual and methodological challenges as the genomics era moves forward.

Louisiana State University

University of Miami: Scholarship Miami

Characterising chromosome rearrangements: recent technical advances in molecular cytogenetics

Author: A Abyzov
A Kallioniemi
A Tzschach
AB Olshen
AE Dellinger
AJ Coffey
AJ Iafrate
AJ Sharp
AM Snijders
AS Ishkanian
AW Pang
B Carvalho
BA Talseth-Palmer
BM Skinner
C Alkan
C Brennan
C Curtis
C Fauth
C Le Caignec
CA Maher
CD Greenman
D Pinkel
D Pinto
DA Peiffer
DF Conrad
DJ Hedges
DK Griffin
DR Bentley
DT Miller
DY Chiang
E Darai-Ramqvist
E Przybytkowski
E Schrock
E Tuzun
ED Pleasance
EF Nuwaysir
ER Mardis
ES Lander
EV Linardopoulou
GC Kennedy
GM Cooper
GR Bignell
H Fiegler
H Li
H Park
H Telenius
HC Mefford
HH Heng
HJ Abel
I Bieche
I Parra
I Slade
J Huang
J Sebat
J Shendure
J Wiegant
JC Marioni
JC Venter
JG Bauman
JM Kidd
JO Korbel
JP Schouten
JR Pollack
K Yamazawa
K Ye
KD Howarth
KK Mantripragada
KL Gunderson
L Backx
L Feuk
L Winchester
LK Conlin
M Guillaud-Bataille
M Meyerson
M Simonis
M Volker
MA Heiskanen
ML Metzker
MR Speicher
MT Barrett
N Craddock
N Huang
NA Yamada
NP Carter
P Dhami
P Lichter
P Medvedev
P Parameswaran
P Stankiewicz
PH Sudmant
PJ Campbell
PJ Stephens
R Andersson
R Pique-Regi
R Redon
RE Mills
RM Durbin
RS Mani
S Le Scouarnec
S M Gribble
S Solinas-Toldo
S Yoon
SA McCarroll
SM Gribble
SW Scherer
T Cremer
T LaFramboise
The International HapMap Consortium
TS Price
TW Fitzgerald
W Chen
W Gu
X Michalet
YJ Chung
Z Ou
Publication venue: Nature Publishing Group
Publication date

Genomic rearrangements can result in losses, amplifications, translocations and inversions of DNA fragments, thereby modifying genome architecture and potentially having clinical consequences. Many genomic disorders caused by structural variation were initially uncovered by early cytogenetic methods. The last decade has seen significant progress in molecular cytogenetic techniques, allowing rapid and precise detection of structural rearrangements on a whole-genome scale. The high resolution attainable with these recently developed techniques has also uncovered the role of structural variants in normal genetic variation alongside single-nucleotide polymorphisms (SNPs). We describe how array-based comparative genomic hybridisation, SNP arrays, array painting and next-generation sequencing analytical methods (read depth, read pair and split read) allow the extensive characterisation of chromosome rearrangements in human genomes.
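The read-pair method mentioned in this abstract can be illustrated with a minimal sketch. It assumes a standard paired-end library in which concordant mates face each other (left mate on the forward strand, right mate on the reverse strand) within an expected insert-size range. The `ReadPair` type, the function name, the default insert-size bounds, and the use of mate start positions as a stand-in for insert size are all illustrative assumptions, not taken from the reviewed paper, and the labels are candidate signatures rather than confirmed calls.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Mapped positions and strands of the two mates (hypothetical type)."""
    chrom1: str
    pos1: int
    strand1: str  # "+" or "-"
    chrom2: str
    pos2: int
    strand2: str  # "+" or "-"


def classify_pair(pair, min_insert=200, max_insert=600):
    """Label a read pair with the rearrangement its mapping pattern suggests."""
    if pair.chrom1 != pair.chrom2:
        return "translocation"        # mates map to different chromosomes
    if pair.strand1 == pair.strand2:
        return "inversion"            # mates map to the same strand
    # Opposite strands: check which mate is leftmost.
    left_strand = pair.strand1 if pair.pos1 <= pair.pos2 else pair.strand2
    if left_strand == "-":
        return "tandem_duplication"   # everted pair, mates face outward
    span = abs(pair.pos2 - pair.pos1)  # crude proxy for insert size
    if span > max_insert:
        return "deletion"             # mates map further apart than expected
    if span < min_insert:
        return "insertion"            # mates map closer than expected
    return "concordant"


print(classify_pair(ReadPair("chr1", 1000, "+", "chr1", 5000, "+")))
```

In practice a single discordant pair is weak evidence; tools built on this signal typically require several pairs supporting the same breakpoints before reporting a variant.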

Crossref

PubMed Central

Multi-platform discovery of haplotype-resolved structural variation in human genomes

Author: Guryev Victor
Lansdorp Peter
Porubský David
Spierings Diana
Publication venue
Publication date: 23/09/2017

ARTS repository - University of Groningen

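A study of the characteristics and mechanism of tandem DHFR gene amplification in the anticancer drug (MTX)-resistant HT-29 cell line using genomic and transcriptomic analysis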

Author: 김아름
Publication venue: Seoul National University Graduate School (서울대학교 대학원)
Publication date: 01/02/2019

Doctoral thesis (Ph.D.), Seoul National University Graduate School, College of Medicine, Department of Biomedical Sciences, February 2019. 김종일. The massively parallel sequencing technology known as next-generation sequencing (NGS) has been developed and refined for cancer genome research to obtain molecular-level findings and treatments for disease. The time and cost of NGS analysis have been greatly reduced, so mechanisms ranging from the basic mechanisms of human evolution to the complicated mechanisms by which genetic changes drive the resistance of cancer cells to anti-cancer drugs have been comprehensively investigated through advances in NGS technologies. The combination of these NGS technologies has therefore contributed to cancer diagnosis, management, and treatment by elucidating molecular tumor profiles, and it will play an important role in the future of cancer treatment and of personalized medicine. DHFR gene amplification is present in methotrexate (MTX)-resistant colon cancer cells and in acute lymphoblastic leukemia. The chromosome 5q14 region contains many genes in addition to DHFR, and little is known about DHFR gene amplification at this position, since quantifying the amplification size and recognizing the repetitive rearrangements involved require extra time and effort with limited technologies and bioinformatics. Also, there is no clear way to assemble the complete structure of the amplified region with short reads (read length < repeat length). Single-molecule real-time (SMRT) sequencing reads, which provide exceptionally long read lengths, have the potential to overcome these limitations and allow complete assembly of the region.
Here I propose an integrative framework to quantify the amplified region and detect structural variations, which are large, complex DNA segments involving repeats, by using a combination of technologies including single-molecule real-time sequencing, next-generation optical mapping, and high-throughput chromosome conformation capture (Hi-C). An amplification unit of 11 genes on chromosome 5, spanning from DHFR to ATP6AP1L (~2.2 Mbp), and a tandem gene amplification producing a region about twentyfold longer than in the control were identified by technologies such as optical mapping and single-molecule real-time sequencing, and the abnormally increased expression and complicated splicing patterns of these genes were characterized from RNA sequencing data. A novel inversion (chr5:80,618,750-80,631,409) was detected at the DHFR gene in the amplified region, which might stimulate chromosomal breakage for gene amplification. Using Hi-C, high adjusted interaction frequencies, which indicated inter-chromosomal contact, with significant adjusted p-values were detected on the amplified unit and at an unsuspected position on 5q in the MTX-resistant HT-29 sample compared to the control. This might indicate that the chromosomal structure from the start of the amplified unit (80.6Mb - 82.8Mb) to the end of 5q (109Mb-138Mb) has a complex network of spatial contacts that harbors the gene amplification. In addition, an increased relative copy number, several newly identified topologically associating domains (TADs), and extrachromosomal double minutes (DMs) in this amplified region, which were not detected by other technologies, were identified and described in order to find their association with the gene amplification mechanism.
Interestingly, novel frameshift insertions were identified in most of the MSH and MLH genes, which could cause dysregulation of the mismatch repair pathway under MTX and play an important role in the rapid progression of gene amplification as well as in resistance to MTX. The variable size of the tandem gene amplification patterns, together with homogeneously staining chromosome regions (HSRs) and extrachromosomal DMs, suggested that the gene amplification might be produced by breakage-fusion-bridge (BFB) cycles. Overall, the characterized tandem gene amplification unit, the more complicated interactions within chromosome 5, the inversion in the amplification unit, and the mutations in MSH and MLH genes can be critical factors in identifying the mechanism of genomic rearrangement, and these findings may give new insight into the mechanism underlying the amplification process and the evolution of drug resistance. The comprehensive approach of combining advanced technologies is therefore a powerful tool for interpreting cancer genomes, and it will provide the depth of insight needed to identify the most important therapeutic mechanisms and new targets for anti-cancer drugs.
Korean abstract (translated): Massively parallel sequencing technology, known as next-generation sequencing (NGS), has been developed and advanced to obtain new molecular-level findings and treatments for disease in cancer genomes. The time and cost of NGS analysis have now fallen sharply, and everything from the basic mechanisms of human evolution to the complex mechanisms of genetic alteration in cancer cells resistant to anticancer drugs has been comprehensively analyzed through advances in NGS. Combinations of these NGS technologies have therefore contributed to cancer research for diagnosis, management, and treatment by elucidating molecular tumor profiles, and they will play an important role in the future of cancer treatment and personalized medicine. DHFR gene amplification is present in colon cancer cells resistant to the anticancer drug methotrexate (MTX) and in acute lymphoblastic leukemia. The chromosome 5q14 region contains many genes and is known to be the origin of gene amplification when colorectal cancer cells become resistant under methotrexate, but little has been known about the actual genomic changes. Earlier analyses used short-read sequencing, but because short reads cannot resolve repetitive regions or identify junction reads, there was no clear way to assemble the complete structure of the amplified region.
Single-molecule real-time (PacBio SMRT) sequencing, which provides exceptionally long reads, overcomes these limitations and enables complete assembly of genomic sequence in repetitive regions. In this study, new genomic analysis technologies, namely single-molecule real-time sequencing, next-generation optical mapping, and high-throughput chromosome conformation capture (Hi-C), which measures the three-dimensional (3D) organization of DNA, were used to characterize the genomic amplification process in the methotrexate-resistant colon cancer cell line HT-29, and an integrative framework was proposed for detecting structural variations in repeats involving large, complex DNA segments. Using single-molecule real-time sequencing and optical mapping, we sought to assemble the genomic repeats completely, and confirmed that 11 genes spanning 2.2 Mbp on chromosome 5, from DHFR to ATP6AP1L, form the amplification unit and are amplified in tandem to about 20 times the length seen in the control. Comparison of gene expression and RNA splicing patterns with the control showed abnormal expression in the amplification unit, ranging from 5-fold to 122-fold, accompanied by complex RNA splicing patterns. In the Hi-C analysis of 3D chromosome organization, which shows how strongly loci within a chromosome interact, several topologically associating domains (TADs) were newly found, relative to the control, at the center and ends of the amplified region in MTX-resistant HT-29 cells; the adjusted interaction frequencies there were high and statistically significant (p<0.05). In addition, double minutes, which are difficult to detect, were found. Interestingly, it could be inferred that frameshift insertions in MSH and MLH genes caused genetic instability and dysregulation of the mismatch repair pathway under methotrexate, that an inverted duplication at the DHFR locus led to chromosome breakage at that locus on chromosome 5, and that homogeneously staining regions (HSRs) with amplified genes of various sizes were produced by breakage-fusion-bridge (BFB) cycles. Overall, this study concludes that the more complex chromosomal interactions within chromosome 5 and the inversion within the amplification unit can be important factors in identifying the mechanism of genomic rearrangement, and that these findings may offer new insight into both the mechanism underlying gene amplification and how cancer cells become resistant to anticancer drugs. An approach combining NGS with various new advanced technologies is therefore a powerful tool for interpreting cancer genomes, and it is expected to have a major impact on precision medicine by identifying key therapeutic mechanisms and setting new targets for anticancer drugs.

SNU Open Repository and Archive

A Review of Copy Number Variants in Inherited Neuropathies

Author: Efthymiou S
Houlden H
Manole A
Salpietro V
Publication venue: BENTHAM SCIENCE PUBL LTD
Publication date: 02/07/2018

The rapid development in the last 10-15 years of microarray technologies, such as oligonucleotide array Comparative Genomic Hybridization (CGH) and Single Nucleotide Polymorphism (SNP) genotyping arrays, has improved the identification of fine chromosomal structural variants, ranging in length from kilobases (kb) to megabases (Mb), as an important cause of genetic differences among healthy individuals and also as disease-susceptibility and/or disease-causing factors. Structural genomic variations due to unbalanced chromosomal rearrangements are known as Copy-Number Variants (CNVs) and these include variably sized deletions, duplications, triplications and translocations. CNVs can significantly contribute to human diseases, and rearrangements in several dosage-sensitive genes have been identified as an important causative mechanism in the molecular aetiology of Charcot-Marie-Tooth (CMT) disease and of several CMT-related disorders, a group of inherited neuropathies with a broad range of clinical phenotypes, inheritance patterns and causative genes. Duplications or deletions of the dosage-sensitive gene PMP22, mapped to chromosome 17p12, represent the most frequent causes of CMT type 1A and Hereditary Neuropathy with liability to Pressure Palsies (HNPP), respectively. Additionally, CNVs have been identified in patients with other CMT types (e.g., CMT1X, CMT1B, CMT4D) and different hereditary poly- (e.g., giant axonal neuropathy) and focal- (e.g., hereditary neuralgic amyotrophy) neuropathies, supporting the notion of hereditary peripheral nerve diseases as possible genomic disorders and making the identification of fine chromosomal rearrangements crucial in the molecular assessment of such patients.
Notably, the application of advanced computational tools to the analysis of Next-Generation Sequencing (NGS) data has emerged in recent years as a powerful technique for identifying, on a genome-wide scale, complex structural variants (e.g., those resulting from balanced rearrangements) and also smaller pathogenic (intragenic) CNVs that often remain beyond the detection limit of most conventional genomic microarray analyses. In the context of inherited neuropathies, where more than 70 disease-causing genes have been identified to date, NGS and particularly Whole-Genome Sequencing (WGS) hold the potential to reduce the number of genomic assays required per patient to reach a diagnosis, analyzing with a single test all the Single Nucleotide Variants (SNVs) and CNVs in the genes possibly implicated in this heterogeneous group of disorders.
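The abstract does not name a specific computational method, but one common family of NGS-based CNV tools compares read depth between a sample and a control. The sketch below illustrates that idea under simplifying assumptions: depths are already counted per fixed window, each series is normalised by its own median, and fixed log2-ratio thresholds separate gains from losses. The function name, thresholds, and pseudocount are hypothetical choices for illustration, not values from the review.

```python
import math
import statistics


def call_copy_number(sample_depth, control_depth, gain=0.4, loss=-0.6, pseudo=0.5):
    """Label each window as 'gain', 'loss' or 'neutral' from read depth.

    Depths are median-normalised, so the result is relative to the
    typical window rather than an absolute copy number.
    """
    if len(sample_depth) != len(control_depth):
        raise ValueError("sample and control must cover the same windows")
    sample_median = statistics.median(sample_depth)
    control_median = statistics.median(control_depth)
    if sample_median == 0 or control_median == 0:
        raise ValueError("median depth must be non-zero")
    labels = []
    for s, c in zip(sample_depth, control_depth):
        # The pseudocount keeps log2 defined when a window has zero reads.
        ratio = ((s + pseudo) / sample_median) / ((c + pseudo) / control_median)
        log2_ratio = math.log2(ratio)
        if log2_ratio >= gain:
            labels.append("gain")
        elif log2_ratio <= loss:
            labels.append("loss")
        else:
            labels.append("neutral")
    return labels


control = [100, 100, 100, 100, 100, 100]
sample = [100, 100, 150, 150, 50, 100]
print(call_copy_number(sample, control))
```

For a diploid region, a heterozygous duplication such as the PMP22 duplication described above would be expected to raise depth by roughly half (three copies instead of two), and a heterozygous deletion to halve it, which is why the illustrative thresholds sit near log2(1.5) and log2(0.5).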

UCL Discovery