16,457 research outputs found

    InDel markers: An extended marker resource for molecular breeding in chickpea

    Chickpea is one of the most important food legumes and holds the key to meeting rising global food and nutritional demand. In order to deploy molecular breeding approaches in crop improvement programs, user-friendly and cost-effective marker resources remain a prerequisite. The advent of next generation sequencing (NGS) technology has resulted in the generation of several thousand markers as part of several large-scale genome sequencing and re-sequencing initiatives. Recently, PCR-based insertion-deletions (InDels) have become a popular gel-based genotyping solution because of their co-dominant, inexpensive, and highly polymorphic nature. With the objective of expanding marker resources for genomics-assisted breeding (GAB) in chickpea, whole genome re-sequencing data generated on five parental lines of one interspecific (ICC 4958 × PI 489777) and two intra-specific (ICC 283 × ICC 8261 and ICC 4958 × ICC 1882) mapping populations were used for identification of InDels. A total of 231,658 InDels were identified using Dindel software with default parameters. Further, a total of 8,307 InDels with ≥20 bp size were selected for development of gel-based markers, of which primers could be designed for 7,523 (90.56%) markers. On average, markers appeared at a frequency of 1,038 InDels/LG, with a maximum number of markers on CaLG04 (1,952 InDels) and a minimum on CaLG08 (360 InDels). In order to validate these InDels, a total of 423 primer pairs were randomly selected and tested on the selected parental lines. A high amplification rate of 80% was observed, with polymorphism rates ranging from 46.06 to 58.01% across parents on 3% agarose gel. This study clearly reflects the usefulness of available sequence data for the development of genome-wide InDels in chickpea that can further contribute to and accelerate a wide range of genetic and molecular breeding activities in chickpea.

    A cancer cell-line titration series for evaluating somatic classification.

    Background: Accurate detection of somatic single nucleotide variants and small insertions and deletions from DNA sequencing experiments of tumour-normal pairs is a challenging task. Tumour samples are often contaminated with normal cells, confounding the available evidence for the somatic variants. Furthermore, tumours are heterogeneous, so sub-clonal variants are observed at reduced allele frequencies. We present here a cell-line titration series dataset that can be used to evaluate somatic variant calling pipelines with the goal of reliably calling true somatic mutations at low allele frequencies. Results: Cell-line DNA was mixed with matched normal DNA at 8 different ratios to generate samples with known tumour cellularities, and exome sequenced on Illumina HiSeq to depths of >300×. The data was processed with several different variant calling pipelines and verification experiments were performed to assay >1500 somatic variant candidates using Ion Torrent PGM as an orthogonal technology. By examining the variants called at varying cellularities and depths of coverage, we show that the best performing pipelines are able to maintain a high level of precision at any cellularity. In addition, we estimate the number of true somatic variants undetected as cellularity and coverage decrease. Conclusions: Our cell-line titration series dataset, along with the associated verification results, was effective for this evaluation and will serve as a valuable dataset for future somatic calling algorithm development. The data is available for further analysis at the European Genome-phenome Archive under accession number EGAS00001001016. Data access requires registration through the International Cancer Genome Consortium's Data Access Compliance Office (ICGC DACO).

    Towards Better Understanding of Artifacts in Variant Calling from High-Coverage Samples

    Motivation: Whole-genome high-coverage sequencing has been widely used for personal and cancer genomics as well as in various research areas. However, in the absence of an unbiased whole-genome truth set, the global error rate of variant calls and the leading causal artifacts remain unclear, despite great efforts in the evaluation of variant calling methods. Results: We made ten SNP and INDEL call sets with two read mappers and five variant callers, both on a haploid human genome and a diploid genome at a similar coverage. By investigating false heterozygous calls in the haploid genome, we identified erroneous realignment in low-complexity regions and the incompleteness of the reference genome with respect to the sample as the two major sources of errors, which call for continued improvements in these two areas. We estimated that the error rate of raw genotype calls is as high as 1 in 10-15kb, but the error rate of post-filtered calls is reduced to 1 in 100-200kb without significant compromise on the sensitivity. Availability: BWA-MEM alignment: http://bit.ly/1g8XqRt; Scripts: https://github.com/lh3/varcmp; Additional data: https://figshare.com/articles/Towards_better_understanding_of_artifacts_in_variating_calling_from_high_coverage_samples/981073 Comment: Published version

    Rapid detection of copy number variations and point mutations in BRCA1/2 genes using a single workflow by ion semiconductor sequencing pipeline

    Molecular analysis of the BRCA1 (MIM #604370) and BRCA2 (MIM #600185) genes is essential for familial breast and ovarian cancer prevention and treatment. An efficient, rapid, cost-effective, and accurate strategy for the detection of pathogenic variants is crucial. Mutation detection in the BRCA1/2 genes includes screening for single nucleotide variants (SNVs), small insertions or deletions (indels), and Copy Number Variations (CNVs). Sanger sequencing is unable to identify CNVs, and therefore Multiplex Ligation-dependent Probe Amplification (MLPA) or Multiplex Amplicon Quantification (MAQ) is used to complete the BRCA1/2 gene analysis. The rapid evolution of Next Generation Sequencing (NGS) technologies allows the search for point mutations and CNVs with a single platform and workflow. In this study we test the ability of NGS technology to simultaneously detect point mutations and CNVs in the BRCA1/2 genes, using the Oncomine™ BRCA Research Assay on the Personal Genome Machine (PGM) Platform with Ion Reporter Software for sequencing data analysis (Thermo Fisher Scientific). Comparison between the NGS-CNV, MLPA and MAQ results shows that the NGS approach is the most complete and fastest method for the simultaneous detection of all BRCA mutations, avoiding the usual time-consuming multistep approach in the routine diagnostic testing of hereditary breast and ovarian cancers.

    The South Asian genome

    Keywords: Genetics of disease; Microarrays; Variant genotypes; Population genetics; Sequence alignment; Alleles. The genetic sequence variation of people from the Indian subcontinent, who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease, which are highly prevalent amongst South Asians. Whole genome sequencing to discover genetic variants underlying type-2 diabetes, coronary heart disease and related phenotypes amongst Indian Asians. Imperial College Healthcare NHS Trust cBRC 2011-13 (JS Kooner [PI], JC Chambers)